Commercialize DALL·E 2: How to Build and Sell Products Right Now




Design and deploy products today, and sell them on the market, based on what you create with DALL·E 2.

By Annie Spratt from Unsplash

Current interest in image generation, and its intersection with machine learning and natural language processing pipelines, points to an extraordinary future for the arts, one with little to no human-led design.

For those not yet aware of the milestones achieved in generating images from text, here are some of the early risers, emerging leaders, and trending tools:

Owned by the Author

This image is intended to represent the Queen of England speaking to thousands of people.

Parti: The Pathways Autoregressive Text-to-Image model generates “high-fidelity photorealistic” [3] images from textual descriptions. It is a neural network with a sequence-to-sequence (Seq2Seq) architecture (“akin to machine translation, with sequences of image tokens as the target outputs rather than text tokens in another language”) [1]. Moreover, it can be trained on different datasets and used in various applications to generate images for text descriptions that are not even part of the training data.

Imagen: Imagen is a text-to-image diffusion model developed by Google [2]. Rather than searching for existing images, it generates new ones from natural-language prompts, so you can describe a scene in plain text and receive an image that matches it.

Wombo [4]: Personally, I find this tool the most impressive. For me, it comes down to a cost-benefit analysis: the value it provides, the opportunity to generate different solutions, and the financial cost of the output. It is just a gut feeling (I do not know with certainty), but in my testing the website met my prompt requirements better than the app.

Owned by the Author

The intersection of AI and DALL·E 2: the technical side, simply explained

DALL·E 2 can be used to generate images of objects from textual descriptions, offering a potential tool for creation or visualization in fields such as engineering and architecture.

Given a description of a scene, DALL·E 2 can generate an image that captures the gist of that scene (e.g., “a crowded beach with lions sleeping”). This capability could be useful for creating illustrations, generating video previews, showing the unknown, and revealing the uncertain. I think of it as Wombo, but elevated to the nth degree.

Where do we go from here, now that all these options are coming online, entering the market, and shaping how we ideate and conceptualize novelties? Much of the creativity will come from understanding how to work within the constraints of these models.

Owned by the Author

Because DALL·E 2 can interpret natural language inputs, it has potentially promising applications alongside a chatbot or digital assistant. For example, one could ask it for “a picture of a house in Capri, Italy” and it would generate matching images rather than retrieve existing ones from the internet.

Think about text-to-image synthesis for data augmentation in machine learning applications. If there is insufficient training data available, for example for Recognizing Textual Entailment (RTE) models [5], DALL·E 2’s outputs could serve as new synthetic examples generated automatically.

This is the crux of image generation as it stands right now (as opposed to the future use cases I briefly touched on above): (1) provide an input to a model, like DALL·E 2 or Wombo, (2) use that input to describe an image the model has not seen before (“two chess players running through snow in Venice, Italy”), and (3) the model returns high-quality corresponding pictures.

By Simon Hurry from Unsplash

First, become aware of a few landing places on the internet regarding DALL·E 2:

— Reddit: https://www.reddit.com/r/dalle2/
— The actual research (because we always cite our sources): https://arxiv.org/abs/2204.06125
— The current landing page: https://openai.com/dall-e-2/

Then, the cost

Straight from OpenAI:

“Users can create with DALL·E using free credits that refill every month, and buy additional credits in 115-generation increments for $15” [6].

What is Free?

“Every DALL·E 2 user will receive 50 free credits during their first month of use and 15 free credits every subsequent month. Each credit can be used for one original DALL·E 2 prompt generation — returning four images — or an edit or variation prompt, which returns three images” [6].
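
To make the cost-benefit concrete, here is a minimal sketch of the arithmetic implied by the two quotes above. The numbers come straight from OpenAI's announcement [6]. The function names are my own, and the first-year estimate assumes that every free credit is spent on original generations (four images each), which is an illustration rather than a usage pattern OpenAI describes.

```python
# Figures quoted from OpenAI's beta announcement [6].
PAID_CREDITS = 115           # credits in one paid increment
PAID_PRICE_USD = 15.00       # price of one paid increment
IMAGES_PER_GENERATION = 4    # an original prompt returns four images
IMAGES_PER_EDIT = 3          # an edit or variation returns three images
FREE_FIRST_MONTH = 50        # free credits in the first month
FREE_LATER_MONTHS = 15       # free credits in each subsequent month


def cost_per_credit() -> float:
    """Price of a single paid credit in USD."""
    return PAID_PRICE_USD / PAID_CREDITS


def cost_per_image(images_per_credit: int) -> float:
    """Price of a single image, given how many images one credit returns."""
    return cost_per_credit() / images_per_credit


def free_images_in_first_year() -> int:
    """Assumption: all free credits go to original generations for 12 months."""
    credits = FREE_FIRST_MONTH + 11 * FREE_LATER_MONTHS
    return credits * IMAGES_PER_GENERATION


if __name__ == "__main__":
    print(f"Per credit:          ${cost_per_credit():.3f}")                       # $0.130
    print(f"Per generated image: ${cost_per_image(IMAGES_PER_GENERATION):.3f}")   # $0.033
    print(f"Per edited image:    ${cost_per_image(IMAGES_PER_EDIT):.3f}")         # $0.043
    print(f"Free images, year 1: {free_images_in_first_year()}")                  # 860
```

In other words, a paid credit costs roughly 13 cents, which works out to about 3 cents per generated image.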

Owned by the Author

What is going to send *shockwaves* across the world of the arts

“Starting today, users get full usage rights to commercialize the images they create with DALL·E, including the right to reprint, sell, and merchandise. This includes images they generated during the research preview.” Also,

“Users have told us that they are planning to use DALL·E images for commercial projects, like illustrations for children’s books, art for newsletters, concept art and characters for games, moodboards for design consulting, and storyboards for movies.”

Consider the cost of paying websites for access to their “premium” images by artists. Returning to the cost-benefit analysis, these last two statements will inevitably have an extraordinary impact on professions, on how creators develop images, and on technologies that can build on or supplement innovations like DALL·E 2.

Most of you will likely not be able to gain access to DALL·E 2 just yet. My guess is that tens of millions of people have joined the waitlist. In the meantime, you can evaluate some of the possibilities in concept using Wombo. Many of the use cases you have in mind can be implemented with Wombo until you gain access to DALL·E 2.

Owned by the Author

Commercialize it

You can now essentially sell what you generate. Consider the special occasions that people experience. What if someone wants a physical memory of a moment without necessarily having a photo of it? People want to remember weddings, places they have traveled, the animals they once had, and the people they met (and what they were doing during that time).

It is less about a “painting” and more about an image of your general ideas that can enable opportunities in the startup scene.

I was speaking to my wife about generating images of an exact location at a specified time, specific down to the weather conditions, based on our recollection of what was happening in that moment.

I discussed with her the place where we had a wedding, the incredible backdrop (the trees), and the chance to “relive” it through artistry, independent of any physical photos.
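
One way to turn a recollection like this into a text prompt is to collect the details (subject, place, time, weather) and join them into a single description. The sketch below is only an illustration: the `Memory` class, the `build_prompt` function, and the comma-separated prompt layout are all my own hypothetical choices, not a format documented by OpenAI or Wombo.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    """Hypothetical container for the details someone recalls about a moment."""
    subject: str                      # what was happening
    location: str                     # where it happened
    time: str                         # when it happened
    weather: str                      # the conditions that day
    style: str = "digital painting"   # assumed default; any art style works


def build_prompt(memory: Memory) -> str:
    """Join the non-empty details into one comma-separated text prompt."""
    fields = [memory.subject, memory.location, memory.time,
              memory.weather, memory.style]
    return ", ".join(f.strip() for f in fields if f and f.strip())


if __name__ == "__main__":
    wedding = Memory(
        subject="a wedding ceremony under tall trees",
        location="forest clearing",
        time="late afternoon",
        weather="light rain",
    )
    # The resulting string is what you would paste into the tool of your choice.
    print(build_prompt(wedding))
```

Keeping the details in separate fields makes it easy to vary one of them (say, the weather) and compare the resulting images.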

Speed to market is going to accelerate significantly. The Apple logo took several weeks to create: it required one person, who spent considerable time thinking through its final version, eventually resulting (again, several weeks later) in THE Apple logo. What if we could ideate and run large-scale visual explorations for branding in seconds to minutes?

Take into account the organizations that exist today: they receive your images and provide the end-to-end supply chain, such as hosting your work, printing it in the user’s desired format, shipping it, and giving you analytics on how you are doing. If you receive requests from users to create visual experiences, you can begin selling them right now through these channels.

Owned by the Author

Conclusion

I would not wait a second to start experimenting with these tools. They offer the opportunity to explore your thoughts and to receive visual results for your words, based on your own contextual reasoning. Think about the number of business ventures that can spring up to operationalize an entire supply chain based on images. Take Wombo, for instance: while it lets you purchase prints through its in-house service, you are not restricted to using only that service.

I am more excited about how people are going to scale products (which are the solutions they generate using these tools), from inception to go-to-market, and attempt to reach audiences for monetization.

I am *most* excited about how artists are going to exploit the mechanics of these tools in beneficial ways. Arguments about the benefits and disadvantages will no doubt flare up on all sides. Still, what stops artists from creating, designing, operationalizing, and deploying?

Consider sharing your thoughts with me if you have edits or revisions to recommend, or suggestions for expanding this topic further.


References.

1. Yu, J., Xu, Y., Koh, J. Y., Luong, T., Baid, G., Wang, Z., Vasudevan, V., Ku, A., Yang, Y., Ayan, B. K., Hutchinson, B., Han, W., Parekh, Z., Li, X., Zhang, H., Baldridge, J., & Wu, Y. (2022, June 22). Scaling autoregressive models for content-rich text-to-image generation. arXiv. https://arxiv.org/abs/2206.10789

2. Imagen: Text-to-Image diffusion models. (n.d.). Retrieved July 22, 2022, from https://imagen.research.google

3. google-research. (n.d.). GitHub — Google-research/parti. GitHub. Retrieved July 22, 2022, from https://github.com/google-research/parti

4. WOMBO dream. (n.d.). AI Powered Artwork Tool. Retrieved July 22, 2022, from https://www.wombo.art

5. Dagan, I. (2013, July 19). Recognizing textual entailment: Models and applications. Synthesis Lectures on Human Language Technologies. https://www.morganclaypool.com/doi/abs/10.2200/s00509ed1v01y201305hlt023

6. OpenAI. (2022, July 20). DALL·E now available in beta. OpenAI. https://openai.com/blog/dall-e-now-available-in-beta/
