A few people said they found my DALLE3 process notes useful, so I thought I'd share another tip here. This one has an interesting overlap with narrative writing.
It's essentially how to get DALLE to go from this:
To this:
In my experience, students tend to either under- or over-describe in stories; they'll provide either no detail or a wall of adjectives.
In narrative writing, we often discuss specific or compelling details—the little things that make a character, place, or action seem real.
The same idea applies when working with DALLE3:
- ChatGPT will write your DALLE prompts, but it tends to use abstractions and generalisations, which can produce cliched, cookie-cutter images.
- You can get better results by directing ChatGPT to use specific, compelling details.
"Conspiracy theorist" example
For example, when working on our Cartoonish images, I asked ChatGPT for a list of story beats based on cute aliens crash-landing near an observatory.
ChatGPT suggested we show some conspiracy theorists breaking into the observatory where the aliens are being sheltered by astronomers.
Great idea, ChatGPT!
Unfortunately, here was the first result:
The main problem is that the conspiracy theorists look like they've been clone-stamped.
How do we art-direct ChatGPT & DALLE to get a better result?
Diagnosing the problem
First, let's take a look at the prompt that ChatGPT wrote for DALLE:
To me, it's immediately obvious that the description "conspiracy theorists in dark outfits" is too general.
Getting more specific
So let's tell DALLE to be more specific:
This gets the following result:
Much better, but the wrong art style: these characters look too adult for our cartoonish theme.
Tuning the art style
We need a more naive style:
This leads us to:
These are looking pretty good! Each image shows a distinctive character clearly snooping around. (I've actually skipped a step where I suggested that DALLE give every character a hat and sunglasses even though it's night.)
However, it still feels to me like something is missing. To my mind, they look almost too good to be conspiracy theorists.
What detail would make them seem more unhinged?
"Anxious & underslept"
After a bit of fiddling around, I hit on the idea of describing conspiracy theorist characters as anxious and/or underslept. For example:
And that made all the difference:
Perfectly paranoid (and, in this instance, for good reason!).
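If you find yourself giving the same kind of art direction over and over, the steps above can be sketched as a small helper that layers specific details onto a generic subject. This is only an illustration: the function name `build_revision_request` and the wording of the request are assumptions for the sake of the example, not the actual prompts from this session.

```python
def build_revision_request(subject, style, details):
    """Compose a follow-up message asking ChatGPT to rewrite an image prompt.

    Hypothetical helper: the phrasing below is illustrative only.
    """
    if not details:
        # A request with no details would just reproduce the generic result.
        raise ValueError("give at least one specific detail")
    lines = [
        f"Rewrite the image prompt so the {subject} are described "
        "with specific, compelling details.",
        f"Art style: {style}.",
        "Each character should look distinct. Include these details:",
    ]
    lines.extend(f"- {detail}" for detail in details)
    return "\n".join(lines)


request = build_revision_request(
    subject="conspiracy theorists",
    style="naive, cartoonish",
    details=[
        "a hat and sunglasses, even though it is night",
        "anxious and underslept",
    ],
)
print(request)
```

The point of the structure is that the subject stays the same while the details list grows with each round of feedback.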
My standard instructions
Because ChatGPT will so often write in general terms, I wrote a set of standard "cartoonish image" instructions that I would use at the beginning of each image generation session:
Occasionally, you need to remind the model
Whenever I present at teacher conferences about generative AI tools, I always bang on about the need to learn some basic concepts so that you can intuit why a model might behave a particular way.
One such concept is the "context window", which is essentially a model's working memory. As the chat goes on, earlier messages slip out of the context window, so you need to remind the model of your priorities.
When making my cartoonish images, I would do that by periodically reposting my standard instructions.
(And if you already know ChatGPT well, then yes, this could be a persistent custom instruction, but a) I didn't want to do that and b) I find ChatGPT ignores custom instructions half the time anyway.)
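To see why reposting helps, here is a toy model of a context window, written as a rough sketch rather than a description of how ChatGPT actually works. It counts whole messages, whereas real context windows are measured in tokens, and the window size, the reminder interval and the placeholder instruction text are all made-up values for illustration.

```python
from collections import deque

# Placeholder only: substitute your own standard instructions.
STANDARD_INSTRUCTIONS = "<your standard instructions go here>"
WINDOW_SIZE = 6    # hypothetical: how many messages the model can "see"
REMIND_EVERY = 4   # hypothetical: repost instructions on every 4th request


def run_session(requests, window_size=WINDOW_SIZE, remind_every=REMIND_EVERY):
    """Return the messages still inside the toy window after a session."""
    # A deque with maxlen silently drops its oldest item when full,
    # which mimics early messages slipping out of the context window.
    window = deque(maxlen=window_size)
    window.append(STANDARD_INSTRUCTIONS)
    for count, request in enumerate(requests, start=1):
        if count % remind_every == 0:
            window.append(STANDARD_INSTRUCTIONS)  # periodic reminder
        window.append(request)
    return list(window)


requests = [f"image request {n}" for n in range(1, 9)]

# A reminder interval longer than the session means no reminders at all.
visible_without = run_session(requests, remind_every=len(requests) + 1)
visible_with = run_session(requests)

print(STANDARD_INSTRUCTIONS in visible_without)  # False
print(STANDARD_INSTRUCTIONS in visible_with)     # True
```

Without reminders, the instructions posted at the start have fallen out of the window by the end of the session; with them, a recent copy is always in view.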