Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal

January 10, 2025

(Wired) – Meta just lost a major fight in its ongoing legal battle with a group of authors suing the company for copyright infringement over how it trained its artificial intelligence models. Against the company’s wishes, a court unredacted information alleging that Meta used Library Genesis (LibGen), a notorious so-called shadow library of pirated books that originated in Russia, to help train its generative AI language models.

The case, Kadrey et al. v. Meta Platforms, was one of the earliest copyright lawsuits filed against a tech company over its AI training practices. Its outcome, along with those of dozens of similar cases working their way through courts in the United States, will determine whether technology companies can legally use creative works to train AI moving forward and could either entrench AI’s most powerful players or derail them.