AI’s Memorization Crisis

January 13, 2026

Bookshelves in Trinity Library in Dublin

(The Atlantic) – On Tuesday, researchers at Stanford and Yale revealed something that AI companies would prefer to keep hidden. Four popular large language models—OpenAI’s GPT, Anthropic’s Claude, Google’s Gemini, and xAI’s Grok—have stored large portions of some of the books they’ve been trained on, and can reproduce long excerpts from those books.

In fact, when prompted strategically by researchers, Claude delivered the near-complete text of Harry Potter and the Sorcerer’s Stone, The Great Gatsby, 1984, and Frankenstein, in addition to thousands of words from books including The Hunger Games and The Catcher in the Rye. Varying amounts of these books were also reproduced by the other three models. Thirteen books were tested.

This phenomenon has been called “memorization,” and AI companies have long denied that it happens on a large scale.

Posted by Bioethics Pundit

Posted in Artificial Intelligence, highlights, Informed Consent, News