How to Generate Stunning Art on Your Laptop using AI




A robot with a painter’s hat painting a picture of a mountain on a white canvas. Digital art. — Image generated by Stable Diffusion

As a kid, I always admired people who could draw whatever came to their mind. I could watch them for hours as they gave shape to seemingly arbitrary lines on paper. Unfortunately, I was not blessed with that gift.

Time went by, and today AI can help me materialize the ideas I have in my head. It’s not the same thing, and the process is not even remotely as satisfying, but it’s a way I can express my thoughts on paper.

I was really excited when I was granted access to OpenAI’s DALL·E 2 private beta. The downsides, however, were the limits on what I could do with it, how many times I could use it, the price tag, and the fact that I don’t control the software in any way.

Then, Stable Diffusion came to the party. Much like DALL·E 2, Stable Diffusion is an AI system that creates realistic images and art from a description in natural language; unlike DALL·E 2, though, it is open source. It combines concepts, attributes, and styles to create unique images or derive variations of existing pictures. And the good news is that you can run it on your laptop today.

This story shows how you can create incredible art on your system in under 10 minutes, even if you do not have a GPU device. I will guide you through the setup process; all you need is access to a machine that runs Python and has Git installed. In the end, you’ll be able to tell the model what you want to draw, in natural language, then sit back and reap the rewards.

What is Stable Diffusion?

As we said, Stable Diffusion is a Latent Diffusion Model (LDM) that creates realistic images and art from a description in natural language. It was developed by Stability AI and LMU Munich, with support from communities at EleutherAI and LAION.

Stable Diffusion draws inspiration from projects like DALL·E 2 by OpenAI and Imagen by Google Brain, and it was trained on LAION-Aesthetics, a soon-to-be-released subset of LAION-5B.

Stable Diffusion is available as an open-source project on GitHub, and thanks to its very permissive license, anyone can download it, run it, and build on it to create stunning art.

Setting Up Your Environment

Now that we have some background for the project, let’s put it to work. The first step of this process is to set up the Python environment you’ll be working in.

Stable Diffusion comes with instructions on how to create a Miniconda environment. If you’re familiar with creating conda environments given an environment.yaml file, you can very well use that. However, I prefer to use simple Python virtual environments built with venv and install packages with pip.
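For reference, if you’d rather follow the upstream route, the conda setup looks roughly like this; the environment name ldm comes from the repository’s environment.yaml, so adjust it if yours differs:

conda env create -f environment.yaml && conda activate ldm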

My machine is an Ubuntu 22.04 system with Python 3.8.10. You’ll need to be working with at least Python 3.8.5 for this. So, first things first, clone the repository:

git clone https://github.com/dpoulopoulos/stable-diffusion.git -b feature-dimpo-local-deployment

Note that this is not the original Stable Diffusion repository. This is a fork I created to modify the code a bit. Those modifications allow you to run the project even if you don’t have access to a GPU. The project’s original code is located at https://github.com/CompVis/stable-diffusion.
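Before creating the environment, it’s worth double-checking that your Python version meets the 3.8.5 minimum:

python3 --version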

Go into the project’s directory and create a Python virtual environment:

cd stable-diffusion && python3 -m venv venv

Next, activate your virtual environment and install the version of pip required by the project:

source venv/bin/activate && pip3 install --upgrade pip==20.3

As we said, Stable Diffusion comes with a conda environment that you can use to install the dependencies. Since I’m not going down that route, I need to create a separate requirements.txt file instead.

If you’ve cloned the branch I provided in the first step, you can skip this step: a requirements.txt file is already present in the repository. If not, create one using the command below:

cat << EOF > requirements.txt
numpy==1.19.2
--extra-index-url https://download.pytorch.org/whl/cpu
torch==1.11.0+cpu
torchvision==0.12.0+cpu
albumentations==0.4.3
diffusers
opencv-python==4.1.2.30
pudb==2019.2
invisible-watermark
imageio==2.9.0
imageio-ffmpeg==0.4.2
pytorch-lightning==1.4.2
omegaconf==2.1.1
test-tube>=0.7.5
streamlit>=0.73.1
einops==0.3.0
torch-fidelity==0.3.0
transformers==4.19.2
torchmetrics==0.6.0
kornia==0.6
-e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
-e git+https://github.com/openai/CLIP.git@main#egg=clip
EOF

Now, I can install the project’s dependencies in my virtual environment:

pip3 install -r requirements.txt

Finally, I need to install the Stable Diffusion library itself:

pip3 install -e .
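If you want a quick sanity check that the installation worked, importing the package should succeed without errors. The module is named ldm in the upstream repository, so this assumes the fork keeps that name:

python3 -c "import ldm"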

Getting the Model

Now that I have set up my environment, I need to get the model weights. The model checkpoints are hosted on Hugging Face, and to download them, you need an account. So, go create one on their page. Don’t forget to verify your email afterward!

Your next step is to download the model. Go to this address: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original and agree to share your contact information.


Now download the model. You’ll see two checkpoint files there: sd-v1-4.ckpt and sd-v1-4-full-ema.ckpt. The ema suffix stands for Exponential Moving Average, and that checkpoint is meant for resuming training. For our purposes, we want to download the smaller sd-v1-4.ckpt for inference. There is a way to use the EMA checkpoint as well, but that’s not something to discuss here. Let’s keep things simple.
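If you prefer to script the download instead of clicking through the browser, something along these lines should work once you’ve accepted the terms on the model page. The URL follows Hugging Face’s usual resolve/main pattern, and <YOUR_HF_TOKEN> is a placeholder for an access token from your account settings:

wget --header="Authorization: Bearer <YOUR_HF_TOKEN>" https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt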


Depending on your internet connection, downloading the model will take ~5–10 minutes. It is ~4GB, after all. When the download is done, create a stable-diffusion-v1 folder under models/ldm in the project’s directory to place it in:

mkdir models/ldm/stable-diffusion-v1

Move the model inside the stable-diffusion-v1 folder and rename it to model.ckpt:

mv /path/to/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
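You can confirm that the checkpoint is in place, and that it is roughly the ~4GB we expect, with:

ls -lh models/ldm/stable-diffusion-v1/model.ckpt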

Run the Model

At this point, you have everything in place. To create your first stunning piece of art, run the following command:

python3 scripts/txt2img.py --prompt "An astronaut riding a horse, painted by Pablo Picasso." --plms --n_iter 5 --n_samples 1

This will generate images of “an astronaut riding a horse, painted by Pablo Picasso”. You’ll find the results in a folder called outputs at the end of the process. There, you’ll see five images, because --n_iter 5 instructs the script to run five sampling iterations of --n_samples 1 image each.
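The exact subfolder layout depends on the script’s defaults, so the simplest way to see everything the run produced is to list the outputs directory recursively:

ls -R outputs/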

Here’s the result I got:

An astronaut riding a horse, painted by Pablo Picasso — Image generated by Stable Diffusion

This is quite astonishing, don’t you agree?

No GPU? No problem!

If you don’t have a GPU, the above script won’t run; the original project requires a GPU device with CUDA set up.

However, the fun does not need to stop here. The repository and branch I provided in the first step can also run Stable Diffusion on the CPU. The price you pay is that generation will take considerably longer to complete.

To run it on the CPU, just add a --config flag to the command you executed before and point it to a different config file:

python3 scripts/txt2img.py --prompt "An astronaut riding a horse, painted by Pablo Picasso." --plms --n_iter 5 --n_samples 1 --config configs/stable-diffusion/v1-inference-cpu.yaml

Summary

Much like DALL·E 2, Stable Diffusion is an AI system that creates realistic images and art from a description in natural language, and unlike DALL·E 2, it is open source. It combines concepts, attributes, and styles to create unique images or derive variations of existing pictures.

This story demonstrated how you can create incredible art on your system in under 10 minutes, even if you do not have a GPU device. The best part is that this is software you control and can run whenever you want to generate an image you can put into words. Just be thoughtful about what you generate!
