Apple, Microsoft Shrink AI Models to Improve Them
June 21, 2024
(IEEE Spectrum) – Tech companies have been caught up in a race to build the biggest large language models (LLMs). In April, for example, Meta announced the 400-billion-parameter Llama 3, which contains twice the number of parameters—or variables that determine how the model responds to queries—than OpenAI’s original ChatGPT model from 2022. Although not confirmed, GPT-4 is estimated to have about 1.8 trillion parameters.
In the last few months, however, some of the largest tech companies, including Apple and Microsoft, have introduced small language models (SLMs). These models are a fraction of the size of their LLM counterparts and yet, on many benchmarks, can match or even outperform them in text generation. (Read More)