In recent years, artificial intelligence (AI) engines have been leveraged to increase the efficiency and scalability of everything from fraud prevention to natural language processing (NLP). The typical use cases of AI have involved processing and managing existing data — but with the rise of automated art, AI is now being used to create something novel. Is AI image generation the future of creativity or a dangerous Pandora’s Box? Let’s explore.
The training of AI image models
Much as NLP engines are trained with sentence-completion exercises on large volumes of digital text, AI image generators require a large volume of image data for training.
AI image models are fed an expansive sampling of internet images (including any associated captions or meta descriptions) and asked to select the correct caption for an image from a set of options. After receiving enough image/text pairings, the AI image model can create new images based on increasingly complicated written prompts, even quite surreal descriptions:
"A small cactus wearing a straw hat and neon sunglasses in the Sahara desert."
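The caption-selection exercise described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes the image and each candidate caption have already been turned into numeric vectors, and that the "correct" caption is the one whose vector points in the most similar direction. The vectors, captions, and function names below are made up for the example and do not come from any particular model.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by angle: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pick_caption(image_vec, caption_vecs):
    """Return the candidate caption whose vector best matches the image vector."""
    return max(
        caption_vecs,
        key=lambda caption: cosine_similarity(image_vec, caption_vecs[caption]),
    )

# Toy, hand-written vectors (hypothetical). A real model learns these
# representations from a very large number of image/text pairs.
image_vec = [0.9, 0.1, 0.3]
caption_vecs = {
    "a cactus wearing a straw hat": [0.8, 0.2, 0.4],
    "a city skyline at night": [0.1, 0.9, 0.2],
    "a bowl of fruit": [0.2, 0.3, 0.9],
}

best = pick_caption(image_vec, caption_vecs)
print(best)
```

During training, a model would be nudged so that matching image/caption pairs score higher than mismatched ones; the sketch shows only the selection step, not the learning.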
The major players in AI image generation
It should come as no surprise that AI image generation has been pioneered by industry leaders like OpenAI and Google.
OpenAI announced its machine learning model, DALL-E, in 2021. DALL-E (a portmanteau of the movie “WALL-E” and surrealist artist Salvador Dalí, according to Wikipedia) produces digitally rendered images from natural language descriptions. OpenAI began beta-testing DALL-E 2 in July 2022, and by September TechCrunch reported that DALL-E 2 customers could upload and edit human faces. We’ll get to the risks and potential ramifications of this in a moment.
In a bid to unseat OpenAI’s hold on the AI image generation market, Google announced its AI-powered text-to-image generator, Imagen, in May 2022. However, The Verge reports that despite Google’s claim that “Imagen produces consistently better images than DALL-E 2,” the technology hasn’t been made public yet, and for good reason: “Although text-to-image models certainly have fantastic creative potential, they also have a range of troubling applications. Imagine a system that generates pretty much any image you like being used for fake news, hoaxes, or harassment, for example.
“As Google notes, these systems also encode social biases, and their output is often racist, sexist, or toxic in some other inventive fashion.”
Ethical issues, security risks, and other concerns of AI-generated images
Thus far, public access to AI image generators has been limited, with the exception of Craiyon (formerly DALL-E Mini), a lightweight model inspired by DALL-E and developed independently of OpenAI, which lets internet users enter text and receive custom AI art in two minutes or less. There are myriad reasons for limiting access to AI image generators, but here are some of the most prominent:
- Ethical concerns — AI image generation is rife with potential ethical abuses. Even Craiyon, with its significantly limited engine, is capable of generating some objectively awful content. When Futurism tested Craiyon with “a series of prompts ranging from antiquated racist terminology to single-word inputs,” they found the resulting images were often “stereotypical or outright racist.”
In order to avoid the creation of divisive, unethical content, AI image generators will need to implement strong guardrails (such as text filters) before the technology becomes more widely available.
- Security risks — AI’s ability to replicate human faces comes with all sorts of threats to personal privacy and security, including deepfakes and scams. MIT Technology Review warns unfettered access to AI image generators can “give malicious actors tools to generate harmful content at scale with minimal resources.”
- Legal challenges — In July 2022, OpenAI gave DALL-E 2 users the right to use their AI-generated images commercially. This, of course, opens the door to disputes over copyright infringement and intellectual property. To get ahead of potential legal issues, Getty Images went so far as to ban AI-generated images in September 2022. Speaking to The Verge about the decision, Getty Images CEO Craig Peters stated, “There are real concerns with respect to the copyright of outputs from these models and unaddressed rights issues with respect to the imagery, the image metadata and those individuals contained within the imagery.”
- Impact on creators — Artists and designers are rightfully worried that AI image generators might complicate compensation, and even put them out of a job. MIT Technology Review shares the story of fantasy artist Greg Rutkowski, whose name has become one of the most popular prompts in the AI art generator Stable Diffusion. Though Rutkowski’s name has been used as a prompt far more often than Picasso’s, he’s less than thrilled by this newfound popularity, believing AI image generators “could threaten his livelihood.” Moreover, Rutkowski says “he was never given the choice of whether to opt-in or out of having his work used this way,” reiterating concerns over artistic compensation and consent.
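The text-filter guardrail mentioned under ethical concerns can be illustrated with a very small sketch: check each prompt against a list of blocked terms before it ever reaches the image model. Everything here is an assumption made for illustration, including the function name and the placeholder terms. It is not how any named product filters prompts, and simple keyword matching like this is easy to evade with misspellings or paraphrase.

```python
import re

# Placeholder terms (hypothetical). A real deployment would rely on a
# reviewed, regularly updated list and likely additional checks.
BLOCKED_TERMS = {"blockedterm", "anotherblockedterm"}

def is_prompt_allowed(prompt, blocked_terms=BLOCKED_TERMS):
    """Return True if no word in the prompt appears in the blocked list."""
    # Lowercase the prompt and split it into words so matching ignores case
    # and punctuation.
    words = set(re.findall(r"[a-z0-9']+", prompt.lower()))
    return words.isdisjoint(blocked_terms)

print(is_prompt_allowed("A small cactus wearing a straw hat"))
print(is_prompt_allowed("A BlockedTerm in the Sahara desert"))
```

A filter of this kind would run before generation; prompts that fail the check would be rejected rather than passed to the model.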
A new disruptor on the horizon
While industry experts weigh the benefits and risks of AI art, the technology continues to evolve, sometimes in alarming directions. The Verge reports that Stable Diffusion, developed by Stability AI, is a new “open-source, unfiltered image generation, that’s free to use for anyone with a decent computer and a little technical know-how.” (An online library of examples created with Stable Diffusion can be searched here: https://lexica.art/)
In addition to being readily available to the public, Stable Diffusion is unique for its “hands-off approach to moderation. Unlike DALL-E, it’s easy to use the model to generate imagery that is violent or sexual; that depicts public figures and celebrities; or that mimics copyrighted imagery, from the work of small artists to the mascots of huge corporations.” Despite these concerns, Forbes reports Stability AI “previously raised at least $10 million in SAFE notes (a form of convertible security popular among early-stage startups) at a valuation of up to $100 million.”
With the advent of Stable Diffusion, AI’s Pandora’s Box isn’t just open, it’s open-sourced. Industry leaders will need to continue grappling with these complicated questions of consent, security, and copyright to determine if AI art is the future of creativity — or its greatest threat.