Some artists have begun waging a legal fight against the alleged theft of billions of copyrighted images used to train AI art generators and reproduce unique styles without compensating artists or asking for consent.
A group of artists represented by the Joseph Saveri Law Firm has filed a US federal class-action lawsuit in San Francisco against AI-art companies Stability AI, Midjourney, and DeviantArt for alleged violations of the Digital Millennium Copyright Act, violations of the right of publicity, and unlawful competition.
The artists taking action (Sarah Andersen, Kelly McKernan, and Karla Ortiz) “seek to end this blatant and enormous infringement of their rights before their professions are eliminated by a computer program powered entirely by their hard work,” according to the official text of the complaint filed with the court.
Using tools like Stability AI’s Stable Diffusion, Midjourney, or the DreamUp generator on DeviantArt, people can type phrases to create artwork similar to that of living artists. Since the mainstream emergence of AI image synthesis in the last year, AI-generated artwork has been highly controversial among artists, sparking protests and culture wars on social media.
One notable absence from the companies named in the complaint is OpenAI, creator of the DALL-E image synthesis model that arguably got the ball rolling on mainstream generative AI art in April 2022. Unlike Stability AI, OpenAI has not publicly disclosed the exact contents of its training dataset and has commercially licensed some of its training data from companies such as Shutterstock.
Despite the controversy over Stable Diffusion, the legality of how AI image generators work has not been tested in court, although the Joseph Saveri Law Firm is no stranger to legal action against generative AI. In November 2022, the same firm filed suit against GitHub over its Copilot AI programming tool for alleged copyright violations.
Tenuous arguments, ethical violations
Alex Champandard, an AI analyst who has advocated for artists’ rights without dismissing AI tech outright, criticized the new lawsuit in several threads on Twitter, writing, “I don’t trust the lawyers who submitted this complaint, based on content + how it’s written. The case could do more harm than good because of this.” Still, Champandard thinks that the lawsuit could be damaging to the potential defendants: “Anything the companies say to defend themselves will be used against them.”
To Champandard’s point, we’ve noticed that the complaint includes several statements that potentially misrepresent how AI image synthesis technology works. For example, the fourth paragraph of section I says, “When used to produce images from prompts by its users, Stable Diffusion uses the Training Images to produce seemingly new images through a mathematical software process. These ‘new’ images are based entirely on the Training Images and are derivative works of the particular images Stable Diffusion draws from when assembling a given output. Ultimately, it is merely a complex collage tool.”
In another section that attempts to describe how latent diffusion image synthesis works, the plaintiffs incorrectly compare the trained AI model with “having a directory on your computer of billions of JPEG image files,” claiming that “a trained diffusion model can produce a copy of any of its Training Images.”
During the training process, Stable Diffusion drew from a large library of millions of scraped images. Using this data, its neural network statistically “learned” how certain image styles appear without storing exact copies of the images it has seen. In rare cases where an image is overrepresented in the dataset (such as the Mona Lisa), however, a type of “overfitting” can occur that allows Stable Diffusion to spit out a close representation of the original image.
Ultimately, if trained properly, latent diffusion models always generate novel imagery and do not create collages or duplicate existing work. That technical reality potentially undermines the plaintiffs’ argument of copyright infringement, though whether AI image generators create “derivative works” is, to our knowledge, an open question without a clear legal precedent.
Some of the complaint’s other points, such as unlawful competition (by duplicating an artist’s style and using a machine to replicate it) and infringement on the right of publicity (by allowing people to request artwork “in the style” of existing artists without permission), are less technical and might have legs in court.
Despite its issues, the lawsuit comes after a wave of anger about the lack of consent from artists who feel threatened by AI art generators. By their own admission, the tech companies behind AI image synthesis have scooped up intellectual property to train their models without consent from artists. They’re already on trial in the court of public opinion, even if they’re eventually found compliant with established case law regarding overharvesting public data from the Internet.
“Companies building large models relying on Copyrighted data can get away with it if they do so privately,” tweeted Champandard, “but doing it openly *and* legally is very hard—or impossible.”
Should the lawsuit go to trial, the courts will have to sort out the differences between ethical and alleged legal breaches. The plaintiffs hope to prove that AI companies benefit commercially and profit richly from using copyrighted images; they’ve asked for substantial damages and permanent injunctive relief to stop allegedly infringing companies from further violations.
When reached for comment, Stability AI CEO Emad Mostaque replied that the company had not received any information on the lawsuit as of press time.