How do AI image generators work?

All these AI image generators take a text prompt and then turn it—as best they can—into a matching image. This opens up some wild possibilities since your prompt can be anything from "an impressionist oil painting of a Canadian man riding a moose through a forest of maple trees" to "a painting in the style of Vermeer of a large fluffy Irish wolfhound enjoying a pint of beer in a traditional pub" or "a photograph of a donkey on the moon."

Seriously, the only real limits are your imagination, the AI image generator's ability to comprehend your prompt, and any content filters put in place to stop plagiarism, copyright infringement, and bad actors flooding the internet with AI-generated violence or other NSFW content. (That Vermeer prompt used to work reliably, but some image generators now block it because it uses a named artist.)

Most AI image generators work in a pretty similar way. A neural network (basically, a very fancy computer algorithm modelled loosely on the human brain) is trained on millions or billions of image-text pairs, each an image matched with a description of what it shows. By processing that near-countless stream of images, it learns what dogs, the colour red, Vermeers, and everything else look like. Once this is done, you have an AI that can interpret almost any prompt, though there is a skill in setting things up so it can do so accurately.
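To make the idea of learning from image-text pairs a little more concrete, here is a deliberately tiny Python sketch. It is not how any real image generator is trained: real systems adjust a neural network over many passes, whereas this toy just averages numbers. The two-number "images" (redness and furriness), the captions, and the `learn_concepts` function are all made up for illustration.

```python
def learn_concepts(pairs):
    """Average the features of every image whose caption mentions a word.

    A stand-in for training: real systems fit a neural network instead.
    """
    totals, counts = {}, {}
    for features, caption in pairs:
        for word in caption.lower().split():
            running = totals.setdefault(word, [0.0] * len(features))
            for i, value in enumerate(features):
                running[i] += value
            counts[word] = counts.get(word, 0) + 1
    # Turn the running sums into per-word averages.
    return {w: [v / counts[w] for v in totals[w]] for w in totals}


# Hypothetical training data: each "image" is reduced to [redness, furriness].
pairs = [
    ([0.9, 0.1], "red car"),
    ([0.8, 0.9], "red dog"),
    ([0.1, 0.8], "brown dog"),
]

concepts = learn_concepts(pairs)
print(concepts["red"])  # high redness, middling furriness
print(concepts["dog"])  # middling redness, high furriness
```

Even with three examples, "red" ends up associated with redness and "dog" with furriness, simply because of which captions appeared alongside which images. Scale that up to billions of pairs and a far more capable model, and you get the rough intuition behind the training step.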

The next step is to render the AI-generated image. The latest generation of AI image generators does that using a process called diffusion. In essence, they start with a random field of noise and then edit it in a series of steps to match their interpretation of the prompt. It's a bit like looking up at a cloudy sky, finding a cloud that vaguely resembles a dog, and then being able to snap your fingers to keep making it more and more dog-like.
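The noise-to-image loop can be sketched in a few lines of Python. Treat this as a toy under heavy assumptions, not a real diffusion model: a real generator uses a trained neural network to guess, from the prompt, which part of the current image is noise. Here the code cheats by already knowing the finished picture (a hypothetical five-pixel `target`), so all it shows is the shape of the process: start from randomness and clean it up a little at every step.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and nudge it toward `target` step by step."""
    rng = random.Random(seed)
    # Step 0: a random field of noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in target]
    history = [image]
    for step in range(steps):
        # ASSUMPTION: a real model *predicts* this noise with a trained
        # network; this toy computes it directly from the known target.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove a growing share of the remaining noise each step,
        # so the final step removes whatever is left.
        fraction = 1.0 / (steps - step)
        image = [x - fraction * n for x, n in zip(image, predicted_noise)]
        history.append(image)
    return history


def distance(a, b):
    """Straight-line distance between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


target = [0.0, 0.5, 1.0, 0.5, 0.0]  # hypothetical 5-pixel "dog"
history = toy_denoise(target)

for step in (0, 25, 50):
    print(step, distance(history[step], target))
```

The printed distances shrink as the steps go by: the cloud gets more and more dog-like with each snap of the fingers.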

