Imagine unlimited creativity at your fingertips

Picture Lee Unkrich, one of Pixar's most distinguished animators, as a seventh grader. He is staring at an image of a train engine on his school's first computer screen. Oh!, he thinks. Some of the magic wears off, though, when Lee discovers that the picture didn't appear simply by asking for "a picture of a train." Instead, it had to be painstakingly coded and rendered by hardworking humans.

Now imagine Lee 43 years later, encountering DALL-E, an artificial intelligence that generates original artwork from human-provided prompts that can literally be as simple as "a picture of a train." As he types words to create one image after another, the Oh! is back. Only this time he doesn't leave. "It feels like a miracle," he says. "When the results appeared, I gasped, and tears welled up in my eyes. It's so magical."

Our machines have crossed a threshold. All our lives, we were reassured that computers were incapable of being truly creative. Yet, suddenly, millions of people are using a new generation of AI to generate stunning, never-before-seen images. Most of these users are not, like Lee Unkrich, professional artists, and that's the point: they don't have to be. Not everyone can write, direct, and edit an Oscar winner like Toy Story 3 or Coco, but everyone can launch an AI image generator and type in an idea. What appears on the screen is astounding in its realism and depth of detail. Hence the universal response: Oh! On just four services, Midjourney, Stable Diffusion, Artbreeder, and DALL-E, humans working with AIs now co-create more than 20 million images every day. With a brush in hand, AI has become an engine of wow.

Because these surprise-generating AIs learned their craft from billions of images made by humans, their output hovers near our expectations of what an image should look like. But because they are an alien artificial intelligence, fundamentally mysterious even to their creators, they restructure new images in ways no human would think of, filling in details most of us wouldn't have the artistry to imagine, let alone the skills to execute. They can also be instructed to generate multiple variations of something we like, in whatever style we want, in seconds. This, ultimately, is their most powerful advantage: they can create new things that are recognizable and understandable but, at the same time, completely unexpected.

These new AI-generated images are so unexpected, in fact, that in the silent amazement immediately following the Oh!, another thought occurs to almost everyone who encounters them: human-made art must now be finished. Who can compete with the speed, economy, scale, and, yes, unbridled creativity of these machines? Is art yet another human pursuit we must cede to robots? And the next obvious question: if computers can be creative, what else can they do that we were told they couldn't?

I've spent the past six months using AI to create thousands of striking images, often losing a night's sleep in the endless quest to find just one more beauty hidden in the code. And after interviewing the creators, power users, and other early adopters of these generators, I can make a very clear prediction: generative AI will change the way we design just about everything. Oh, and no human artist will lose their job to this new technology.

It is no exaggeration to call images generated with the help of artificial intelligence co-creations. The sobering secret of this new power is that its best applications are the result not of typing a single prompt but of very long conversations between humans and machines. The progress toward each image comes from many, many iterations, back-and-forths, detours, and hours, sometimes days, of teamwork, all built on years of advances in machine learning.

AI image generators were born from the marriage of two separate technologies. One was a line of deep learning neural networks that could generate coherent, realistic images, and the other was a natural language model that could serve as an interface to the image engine. The two were combined into a language-driven image generator. Researchers scoured the internet for images with adjacent text, such as captions, and used billions of these examples to connect visual shapes to words and words to shapes. With this new combination, a human user could enter a string of words, the prompt, describing the image they were looking for, and the generator would produce an image based on those words.
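The two-part architecture described here can be sketched in miniature: a text encoder turns a prompt into a numeric vector, and an image model conditions on that vector to produce pixels. The sketch below is purely illustrative, a toy stand-in using hashed word counts and a deterministic pixel grid; it is not the API of DALL-E, Stable Diffusion, or any real library, where the encoder and generator are large learned networks trained on billions of captioned images.

```python
# Toy illustration of the prompt -> embedding -> image pipeline.
# All function names and shapes here are invented for this sketch.
import hashlib
from typing import List

def encode_text(prompt: str, dim: int = 8) -> List[float]:
    """Stand-in text encoder: hashes each token into a fixed-size vector.
    A real system learns this mapping from captioned images."""
    vec = [0.0] * dim
    for token in prompt.lower().split():
        h = int(hashlib.sha256(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

def generate_image(embedding: List[float], size: int = 4) -> List[List[float]]:
    """Stand-in image generator: deterministically expands the text
    embedding into a size x size grid of pixel intensities. A real
    system would run a diffusion model conditioned on the embedding."""
    n = len(embedding)
    return [[embedding[(row * size + col) % n] for col in range(size)]
            for row in range(size)]

# Same prompt, same embedding, same image: the words steer the pixels.
image = generate_image(encode_text("a picture of a train"))
```

The point of the sketch is the interface, not the math: the only thing the user supplies is the string, and everything downstream is conditioned on the vector the encoder makes from it.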
