Sarah Silverman Sues Meta and OpenAI for Copyright Infringement

Comedian Sarah Silverman, who first rose to fame for her stint on “Saturday Night Live” in the early 1990s, released a memoir called “The Bedwetter” in 2010. Now, she’s accusing OpenAI and Meta of using text from her book to train language models without her consent. (Credit: Wikimedia Commons)

Comedian Sarah Silverman is suing Meta and OpenAI for copyright infringement, alleging the companies’ artificial intelligence language models have been “knowingly and secretly trained” using text from her 2010 memoir “The Bedwetter.”

She is joined in the suits by best-selling authors Richard Kadrey and Christopher Golden. All three claim AI programs used their copyrighted material without consent, credit or compensation.

Neither Meta nor OpenAI has publicly commented on the pair of suits, which were filed Friday in federal court in San Francisco. OpenAI is the creator of ChatGPT, the AI sensation that burst onto the scene last fall.

The suits claim Meta and OpenAI obtained the authors’ works illegally from “shadow libraries” like Bibliotik, Library Genesis and Z-Library, which are online databases containing content that is normally inaccessible due to paywalls or copyright controls.

When prompted to summarize Silverman’s book, ChatGPT offers an in-depth synopsis, without acknowledging any copyright information, according to the suit. That is also the case for Kadrey’s book “Sandman Slim” and Golden’s book “Ararat.”

Meanwhile, Meta scraped the books from shadow libraries and used the information to train LLaMA, its language model launched earlier this year, the authors allege in the suit.

These are not the first lawsuits of their kind. Last month, authors Mona Awad (“Bunny”) and Paul Tremblay (“The Cabin at the End of the World”) also sued, claiming OpenAI used their copyrighted works to train ChatGPT without authorization.

All of the authors are represented by attorneys Joseph Saveri and Matthew Butterick.

“Since the release of OpenAI’s ChatGPT system in March 2023, we’ve been hearing from writers, authors and publishers who are concerned about its uncanny ability to generate text similar to that found in copyrighted textual materials, including thousands of books,” Saveri and Butterick state on their website.

The lawyers added that such incidents are becoming more common, as AI programmers often prefer books as a source of training data. In a recent study from researchers at MIT and Cornell, books ranked in the top percentage of training data, as they contain the “longest, most readable” material with “meaningful, well-edited sentences.”

Silverman, Kadrey and Golden have not spoken publicly about the lawsuits, which seek statutory damages and restitution of profits.

The Story Exchange