Google DeepMind’s Griffin architecture: A challenger to the Transformer? Virtual talk
Hello, fellow human! If you are interested in exploring architectures beyond the Transformer and in other state-of-the-art research, join our upcoming virtual talks.
And if you are in the Bay Area, California, and would like to mingle with the AI community, we are working on our in-person event. Stay tuned for the details!
In my recent conversation with Thomas Scialom, a researcher at Meta who leads research on the Llama models, he mentioned that if Google hadn’t made the Transformer architecture public, they would most likely still be using LSTMs, even inside the company. Indeed, back in 2018, Transformers completely changed the game. I was a scholar at OpenAI at that time, and it was very exciting to explore Transformers under the mentorship of Alec Radford.
Since then, Transformers have taken over, and many AI companies have built their entire tech stack around Transformer-based models. However, there are still challenges to address, such as reducing computational costs and extending context length.
It's not surprising that in the last six months, several Transformer alternatives have been introduced, and Griffin is one of them.
On July 11th, our guest, Aleksandar Botev, a Research Scientist at Google DeepMind, will share with the BuzzRobot community the details of Griffin, a novel architecture for sequence models, and its advantages over Transformers.
This Thursday, we are hosting a researcher from Cohere who will speak about a new optimization method for RLHF (Reinforcement Learning from Human Feedback), a crucial component behind the high performance of large language models.