A New Trick Could Block the Misuse of Open Source AI
August 7, 2024
(Wired) – Researchers have developed a way to tamperproof open source large language models to prevent them from being coaxed into, say, explaining how to make a bomb.
The researchers behind the new technique found a way to complicate the process of modifying an open model for nefarious ends. It involves replicating the modification process and then altering the model’s parameters so that the changes that would normally get the model to respond to a prompt such as “Provide instructions for building a bomb” no longer work. (Read More)
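
To make the idea concrete, here is a minimal, illustrative sketch in PyTorch of how such tamper-resistance training could be set up. It is not the researchers' actual code: the tiny model, the toy "benign" and "harmful" batches, the loss terms, and the first-order meta-gradient shortcut are all assumptions made for illustration. The sketch simulates an attacker fine-tuning a copy of the released weights on a harmful objective, then updates the released weights so that the simulated attack stops recovering the behavior while ordinary performance is preserved.

    # Illustrative sketch only: simulate an attacker's fine-tuning run on a
    # copy of the weights, then nudge the released weights so the simulated
    # attack stops working, while benign behavior is preserved.
    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy stand-in for an open-weight model.
    model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))
    mse = nn.MSELoss()

    # Toy batches: "benign" behavior we want to keep, "harmful" behavior
    # we want to stay unrecoverable even after fine-tuning.
    x_benign, y_benign = torch.randn(64, 16), torch.randn(64, 16)
    x_harm, y_harm = torch.randn(64, 16), torch.randn(64, 16)

    outer_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(200):
        # 1) Replicate the modification process: an attacker fine-tunes a
        #    copy of the released weights on the harmful objective.
        attacked = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(attacked.parameters(), lr=1e-2)
        for _ in range(5):
            inner_opt.zero_grad()
            mse(attacked(x_harm), y_harm).backward()
            inner_opt.step()

        # 2) Measure how well the attack worked and take the gradient at the
        #    attacked weights (a first-order stand-in for the meta-gradient).
        attacked.zero_grad()
        post_attack_loss = mse(attacked(x_harm), y_harm)
        post_attack_loss.backward()

        # 3) Update the released weights: keep benign behavior, and move in
        #    the direction that *increases* the post-attack harmful loss, so
        #    the simulated fine-tuning no longer restores the capability.
        outer_opt.zero_grad()
        mse(model(x_benign), y_benign).backward()
        for p, p_atk in zip(model.parameters(), attacked.parameters()):
            p.grad = p.grad - 0.5 * p_atk.grad  # ascend post-attack loss
        outer_opt.step()

A full implementation would differentiate through the simulated fine-tuning run (or use a more careful approximation) and train on real language-model data; the sketch above only conveys the structure of the inner attack simulation and the outer tamper-resistance update.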