Quick Take on Text to Image Conversion with AI — using Stable Diffusion



While text-to-image conversion tools have been around for years, Stable Diffusion makes it possible for almost anyone to create photorealistic art!

Courtesy: Stable Diffusion

What is a text-to-image conversion model?

Simply put, it’s a model that produces images matching a provided text description as closely as possible. It falls under the domain of generative AI and is one of the use cases for deep learning.

Generative AI

Artificial intelligence, although still in its nascent stage, has come a long way in shaping how we interact, engage, and express ourselves. Generative AI is one facet of this evolution: it lets algorithms turn words and voices into pictures and expressions. It is sometimes credited with producing results free of the biases that come from human thoughts and experiences, although a model can still inherit biases from the data it is trained on.

Generative AI refers to artificial intelligence models that can use existing content like text, audio files, or images to create new believable content.

Generative AI models are mostly based on techniques such as generative adversarial networks (GANs), transformers, and variational autoencoders.

AI in art

Although I do not understand much when it comes to art, I am definitely fascinated by the idea of an AI doing it for me!

Recently, there was a lot of buzz around an AI-generated artwork winning an art competition. Although such art will never surpass the legacy of the artists who have shaped history around the globe, I believe it will make art more accessible to the masses and carve out its own niche.

Jason Allen’s A.I.-generated work, “Théâtre D’opéra Spatial,” took first place in the digital category at the Colorado State Fair. Image Courtesy: NYTimes & Jason Allen

Read the complete article at NYTimes.

This artwork by Mr. Allen was created with Midjourney, another artificial intelligence program that turns text into hyper-realistic graphics.

What is Stable Diffusion, and how does it work?

Text-to-image converters have been out there for quite some time now, but tools released this year (2022) — like DALL-E 2, Imagen, Midjourney, and Stable Diffusion — make it possible for almost anyone to create photorealistic works just by typing in some text.

While there are multiple programs out there that support text-to-image conversion, in this article we explore Stable Diffusion as one of the models. There is no special reason for the choice; I simply found it easy for a first try! 😀

Build your own art by providing prompts at the public demonstration space for the Stable Diffusion model.

How does it work? From a user’s perspective, it’s pretty straightforward. You describe what you imagine in words, and the model churns out interesting art. Under the hood, it uses a process called “diffusion”: it starts from random noise and refines it step by step into an image that matches the text.

In the case of text-to-image conversion, the model learns the underlying patterns that link text and images in its training data and then uses them to generate close-fit images. It does not look up an existing image; rather, it reaches the closest outcome by recombining the visual patterns it has already learned.
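To make the idea of “diffusion” a little more concrete, here is a minimal toy sketch in plain Python. It is not how Stable Diffusion itself is implemented: the four-number “image”, the linear noise schedule, and the helper names (make_alpha_bars, add_noise, remove_noise) are all assumptions made for illustration. It only shows the arithmetic of blending a signal with noise and undoing that blend when the noise is known. In a real model, a trained neural network guided by the text prompt has to predict that noise.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule.

    The schedule values are illustrative assumptions, not taken from the article.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward step: blend the clean signal with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    xt = [signal_scale * x + noise_scale * n for x, n in zip(x0, noise)]
    return xt, noise


def remove_noise(xt, predicted_noise, alpha_bar):
    """Reverse estimate: subtract the predicted noise and rescale."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - noise_scale * n) / signal_scale
            for x, n in zip(xt, predicted_noise)]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, 0.3]          # a tiny stand-in for an image
alpha_bars = make_alpha_bars(1000)

# After the last step, almost none of the original signal is left.
noisy, true_noise = add_noise(pixels, alpha_bars[-1], rng)

# A trained model would have to predict the noise from `noisy` and the prompt.
# Here we cheat and reuse the true noise, just to show the arithmetic.
recovered = remove_noise(noisy, true_noise, alpha_bars[-1])
```

With the exact noise in hand, `recovered` matches `pixels` up to floating-point error. The hard part, and the part that training is for, is predicting that noise well enough from the noisy input and the text.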

Infinite possibilities

I tried giving a few prompts to the Stable Diffusion model, and this is how it stunned me —

“Cat wearing sunglasses in the bar.”

Created at: Stable Diffusion Public Space

“Colourful horizon in the Indian Ocean. A ship cruising beside a pack of dolphins.”

Created at: Stable Diffusion Public Space

“Carrot in a karate belt.”

Created at: Stable Diffusion Public Space

P.S.: The art only gets better with the expressiveness of your imagination in words. So, write better! 😁

Challenges — Mainstream Blockers

Most models are trained on images scraped from the web at scale, with little or no scrutiny of what goes into the training data. As of this writing, this can lead to potential misuse, unpredictable outcomes, and other ethical problems as the technology spreads.

Although we are not far from a stage where AI becomes capable of doing the majority of human chores, the challenge of modeling ethics into its core remains an unsolved puzzle.

Conclusion

Generative AI is one domain that is fast rising to the mainstream as we speak. With its ever-increasing use cases like text-to-image conversion, image-to-image conversion, image resolution enhancements, face aging, photos to emojis, audio synthesis, sentiment analysis, and trend evaluation, it’s a boon to us.

The advances are likely to increase, and generative design techniques are likely to empower the machines to do more than just manual labor and take up creative tasks.

Wrapping it up

Do share in the comments 💬 your thoughts about this super cool generative art model, its future, and how you would like to use this further.

Also do share with me the interesting art you generate with Stable Diffusion. 😃

  • 👏: send a few claps if this quick round-up helped you in any way
  • 🔗: do share this article with curious folks looking to explore
  • ➕: press follow to keep up with more such simplified stuff around cloud, technology, and science

Connect with me on LinkedIn.
