What are Emergent Abilities in Large Language Models?
The future of LLMs looks bright for two reasons: growing model size, and unexpected emergence.
This is going to be a short post. I was reading the latest post by Nathan Lambert, who works at Hugging Face, and it got me thinking.
If 2022 was the year of large language models (LLMs) in A.I. history, one remarkably "built in public", perhaps we should know a bit more about their so-called "emergent abilities".
“Models can be stolen, datasets will be open source, but the first companies to unlock emergent behavior may gain insurmountable advantages.” - Nathan Lambert, Democratizing Automation Newsletter.
I like the analogy that, since Transformers (2017), working with LLMs has been more about us learning A.I.'s language than about it learning ours.
The Limits of LLMs
LLMs are neural networks that have been trained on hundreds of gigabytes of text gathered from the web.
During training, the network is fed with text excerpts that have been partially masked.
The neural network tries to guess the missing parts and compares its predictions with the actual text.
By doing this repeatedly and gradually adjusting its parameters, the neural network creates a mathematical model of how words appear next to each other and in sequences.
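The fill-in-the-blank training idea described above can be sketched in a few lines. This is a toy illustration only: it uses simple word-pair counts as a stand-in for a neural network's parameters, and the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; real LLMs train on hundreds of
# gigabytes of web text.
corpus = "the cat sat on the mat the dog sat on the rug"
tokens = corpus.split()

# "Training": count how often each word follows another. This crude
# co-occurrence table plays the role of the model's adjusted parameters.
following = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    following[prev][nxt] += 1

def guess_masked(prev_word):
    """Guess the masked word that most often follows `prev_word`."""
    counts = following.get(prev_word)
    return counts.most_common(1)[0][0] if counts else None

# Given "sat [MASK]", the model's best guess for the missing word:
print(guess_masked("sat"))  # "on" appears after "sat" in the corpus
```

A real model replaces the count table with billions of learned weights and compares its guesses against the actual text to compute a training loss, but the guess-and-check loop is the same idea.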
Fine-tuned with RLHF (reinforcement learning from human feedback), however, LLMs can break through some of the barriers of what we thought A.I. of the current era could do.