Mid (user) journey: the UX of AI-generated art
Midjourney is an AI program that generates art from prompts that users write. It is a godsend for anyone who thinks they are creative but has no talent for drawing… to the point that it’s pathetic (me). Some people opt to be UX designers, others UX researchers, and drawing ability is one of the reasons why. However, in 2022, if I am inspired to make art, instead of painting a beautiful night on a canvas or opening Illustrator, I can give the prompt “Magical gorgeous starry night, glowing water”:
It takes seconds instead of hours to make art. In addition to handling landscapes, the 4th version of Midjourney can create portraits with good success:
As I was going through Midjourney for the first time, completely blown away, I decided to document my experience, which I’m going to share here. Instead of doing a mapping exercise, I have formatted this user journey as a fully written-out article.
When my brother introduced me to Midjourney via a text message, I didn’t really know what to expect. He’s pretty technical, so I wasn’t sure of the learning curve. From my perspective, “AI-generated art” sounded daunting. Despite this, he exclaimed that it was really easy to get started, and that there is a free trial. To prove its merit, he asked me to come up with a random idea for a piece of art. I told him that I wanted a “Giant medieval battle with orangutans versus flying walruses in Disneyland with an asteroid in the backdrop”:
The art he showed me seemed cooler than anything I could draw, and although it didn’t look close to what I imagined, it was intriguing.
The goals for my first time on the platform were roughly the following, in this order:
- Figure out where to navigate to in order to use Midjourney.
- Once I’m on the platform, determine which actions I need to take in order to generate art.
- “Pilot” some test art to see what the limitations of the program are.
To sum it up, I wanted to create some basic art, but wasn’t sure where to get started, or what the system’s constraints were.
The Two Tasks — Pain points — Points of Ease
Task 1 [Accept Discord Invite and Join Channel]: Luckily for me, I already had a Discord account. For readers who don’t know, Discord is a voice chat (VoIP) and messaging platform that also allows users to create channels. Discord has a lot of integrations and bots that can facilitate a multitude of functions, in this case AI art generation. I accepted the invite and was brought to the Midjourney Discord server.
I had no idea where to begin, so I messaged a more experienced user, and he told me to join one of the “newbie” channels. Once I joined, I was amazed at what I saw: again and again, art was being rapidly made by various people in this message channel. The ideas varied from portraits to landscapes, and everything in between. You could tell some people’s political views based on the satirical art they would generate from prompts.
Task 1 Pain Points: As soon as I accepted the invite to the channel, I was brought to a welcome page with a series of URLs/links that completely overwhelmed me. It seemed like a rabbit hole that I didn’t want to go down; I was excited by the art my brother had shown me, so I just wanted to get started.
- Without the help of a more experienced user, it might have taken me a bit to realize I needed to join the newbie channel since there are a lot of links to click through, and lines of text to parse.
Task 1 Points of Ease:
- The onboarding process was very convenient in that I didn’t need to create a new username and password on a separate site to get started, since I already had a Discord account.
- Having the AI art generator as just another Discord server on my list of other servers actually minimized context switching and maximized consolidation. It’s always preferable to have a single platform with similar navigational/interaction/experiential patterns, compared to multiple disparate platforms. There’s even research showing a “toggle tax” that results from drastic context switching.
- The terms of service (TOS) are very simple and generous. Regular individuals and small companies are allowed to use Midjourney as long as it’s not for inflammatory purposes and the art doesn’t contain gore or nudity. Furthermore, a user can utilize it to make money (even in the trial version) unless they are a business that makes over 1 million dollars a year (in which case they need an enterprise license).
Task 2 [Create First Art]: I created my first prompt by typing: “/imagine prompt: Epic surreal fantasy land meadow with orcs elves and wizard.” From there, a picture with 4 options was generated (like the Disneyland image with walruses above). I could choose to make a variation of any of the 4 pictures, or upscale a picture to a higher resolution with more detail. I picked U3 to upscale the 3rd choice. In seconds, this was what got generated:
Task 2 Pain Points:
- The regular base version of Midjourney (the base version as of 11/14/2022) might be completely off. As with my picture of orcs, elves, and a wizard, it can simply get the details wrong (granted, version 4 is better, but a new user wouldn’t know to specify that, and I didn’t my first time).
- I wasn’t entirely sure what U1-U4 or V1-V4 meant at first, along with the other clickable options, some of which were more experimental/beta. I learned through trial and error as well as by consulting a more experienced user, but someone without another person to ask would have to read documentation, rather than getting clues from the UI on what each option could do:
- It’s unclear when a prompt should be a full-on sentence with grammatical clauses vs a series of words and commas. For example, “A cat running into an open meadowland with flowers and rainbows” vs “A cat running, open meadowland, flowers, rainbows.” Furthermore, I have seen users do both in one prompt: a full-on sentence with clauses, followed by words separated by commas.
- Portraits of faces have a lot of details wrong, or are asymmetrical. I didn’t know at the time to use version 4, or the upscale beta process that handles faces, body parts, and finer details much more precisely. This is an example of the weaker base version for portraits:
- It’s easy to lose track of where my art is when there are so many people in one channel generating art (think of a group message where people are spamming). It’s easy enough to scroll up, but it’s still a pain point. Luckily, since I bought the 10-dollar basic version after this first-time user experience (FTUE), I can have my own direct message chat with a Midjourney bot, so other people’s art won’t be in the chat.
- As I was going through the trial, it would have been insightful to be notified at certain milestones of my usage (25%, 50%, 75%, etc.). Manually typing “/info” would have given me that information, but during my first time on the platform I didn’t know the command.
- Even with the latest version, it won’t get specific characters 100% correct. For example, when I made an image for “body builder Waluigi” (as a request to my friend), it got the face and the mustache incorrect:
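As a rough illustration of the usage-milestone notifications wished for in the pain points above, here is a minimal Python sketch. It does not describe how Midjourney or its bot works; the function names, the trial quota of 25 jobs, and the milestone list are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: decide which usage milestones a trial user has just crossed.
# Nothing here is a Midjourney API; all names and numbers are assumptions.

def crossed_milestones(used_before, used_after, quota,
                       milestones=(0.25, 0.50, 0.75, 1.00)):
    """Return the milestone fractions crossed when usage moves
    from used_before to used_after out of a total quota."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    before = used_before / quota
    after = used_after / quota
    # A milestone is crossed if it was not yet reached before this job
    # and is reached (or passed) after it.
    return [m for m in milestones if before < m <= after]


def milestone_message(fraction):
    """Format the notification a user might see for one milestone."""
    return f"You have used {fraction:.0%} of your trial."


if __name__ == "__main__":
    TRIAL_QUOTA = 25  # assumed number of trial jobs, for illustration only
    # Simulate a user generating images one job at a time.
    for job in range(1, TRIAL_QUOTA + 1):
        for m in crossed_milestones(job - 1, job, TRIAL_QUOTA):
            print(milestone_message(m))
```

The design choice is to compare usage before and after each job, so each milestone message fires once at the moment it is crossed instead of repeating on every later job.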
Task 2 Points of Ease: From a pure usability and learning curve perspective, it’s absolutely mind-blowing how much easier it is to create art this way than by actually drawing. Granted, understanding the limitations of the prompts takes some technical ability, but it’s still simpler than making art the old-fashioned way. There are also the following good points:
- Images are easy to download, and the pictures’ file names are the prompt that I gave it, so I can determine which prompt generated a particular piece of art.
- Midjourney can make interpretations of abstract concepts. I tried a prompt involving elation and happiness, and this was generated:
- Although the group channel message feed can sometimes be distracting and bury messages, it’s useful for learning and for gaining inspiration from other people’s work, like the picture below (animal people, I guess):
Midjourney and AI are the future when it comes to making art. It’s hard to imagine a world in 10 years where AI isn’t generating most art and then more granular changes are made by artists or even more specialized AI art software. Will AI fully replace artists? Maybe the technical side will be automated away; where drawing/painting/designing is meticulous, coming up with prompts and generating art is way faster. Creativity, however, will never be automated. The program still needs some sort of directive or prompt. Although I’m sure there’s an option to randomly generate pictures, the fun is melding two weird concepts together and seeing what gets created. For example, I wanted something that was beautiful, but also hellish. Scary but beautiful at the same time…
Here are some additional use cases that I think Midjourney can help fulfill today:
- Creating Dungeons and Dragons artifacts including landscapes, other environments for dungeon masters, and character portraits.
- Creating desktop wallpapers.
- Creating posters and other physical art with size specifications in the prompt.
I could also imagine AI-generated virtual worlds as another future use case if the technology expands into that realm. What is AI going to be capable of in the future? 10 years ago we didn’t have anything like this. It’ll be amazing to see the progress of the AI art journey.