The Unbelievable Scale of AI’s Pirated-Books Problem
March 24, 2025

(The Atlantic) – Meta pirated millions of books to train its AI. Search through them here.
When employees at Meta started developing their flagship AI model, Llama 3, they faced a simple ethical question. The program would need to be trained on a huge amount of high-quality writing to be competitive with products such as ChatGPT, and acquiring all of that text legally could take time. Should they just pirate it instead?