Anthropic, the artificial intelligence company behind the Claude AI chatbot, has agreed to a $1.5 billion settlement in a class-action lawsuit brought by authors and publishers. The plaintiffs alleged that Anthropic used copyrighted books without permission to train its AI models. This landmark settlement, potentially the largest copyright recovery in U.S. history, highlights the growing legal battles between AI companies and copyright holders over the use of creative works in AI training.
The lawsuit, Bartz et al. v. Anthropic PBC, centered on Anthropic's use of copyrighted material obtained from "shadow libraries," including Library Genesis (LibGen) and Pirate Library Mirror (PiLiMi). These sites host pirated copies of books, and the plaintiffs argued that Anthropic's unauthorized downloading and use of those materials constituted copyright infringement. Judge William Alsup had previously ruled that while training AI models on legally acquired copyrighted works could qualify as fair use, using pirated materials was "inherently, irredeemably infringing."
Under the terms of the settlement, which still requires court approval, Anthropic will pay approximately $3,000 per book to authors and publishers whose works were used. The settlement is estimated to cover around 500,000 titles, which at roughly $3,000 each accounts for the $1.5 billion total. The company has also agreed to destroy any copies of books it obtained by torrenting or downloading from LibGen and PiLiMi.
The settlement covers past conduct only and does not grant Anthropic the right to use the covered works in future training; the company will need to acquire copyrighted works legally if it wants to keep using them to train its AI models.
The agreement has been hailed as a victory for authors and copyright holders. Justin Nelson, an attorney representing the plaintiffs, called the settlement the "largest copyright recovery ever" and "the first of its kind in the AI era." Maria A. Pallante, president and CEO of the Association of American Publishers (AAP), stated that the settlement sends a message to AI companies that "copying books from shadow libraries or other pirate sources to use as the building blocks for their businesses has serious consequences."
The Anthropic settlement is the first major resolution among a wave of lawsuits against AI companies, including OpenAI, Microsoft, and Meta, over the use of copyrighted material. These cases are being closely watched because they could set precedents for how copyright law applies to AI training. Meta recently won a ruling in a similar case, and The New York Times is pursuing a lawsuit against OpenAI alleging copyright infringement over the use of its articles in training ChatGPT.
This settlement underscores the importance of "data diligence" for AI companies: relying on unlicensed or questionable data sources can carry significant legal and financial repercussions, and developers who train models on creative works need to respect copyright law and secure proper licenses for the materials they use.
Separately, Anthropic has updated its data privacy policies: users can now choose whether their chats with Claude may be used to train future AI models, or opt out and keep their data private. Users who do not make a choice by September 28, 2025, will lose access to Claude. The company says user data is crucial to improving Claude's coding, reasoning, and safety capabilities, and that it applies automated filters to remove sensitive information before training.