Google is expected to announce the next generation of its Gemini family of AI models in early December, a year after it revealed Gemini 1, and the update should be a more significant change than the Gemini 1.5 versions released in May.
According to The Verge, despite being a big step up from Gemini 1, the new model isn't quite as powerful as Google may have hoped. That could be because Gemini 1.5 turned out better than expected, or because we are reaching a leveling-off point where features begin to matter more than raw performance and capability.
OpenAI has diverged with its models, creating a new o1 family that is good at reasoning but weaker at other tasks, alongside the more versatile GPT-4o (Omni) models. It is possible that with Gemini 2 Google will follow a similar path.
AI labs have a habit of making huge announcements in the run-up to the holiday season, and then sitting on them until the new year. This is likely to be the case with Gemini 2. I suspect Google will reveal new variants of Ultra and Pro but that they won't reach the Gemini app until 2025.
What can we expect from Gemini 2?
Each new generation of the model brings new capabilities, new training datasets and potentially even new ways to prompt it compared with previous versions. Based on the AI scaling laws, which roughly say that more compute, more data and more training time yield better models, each new generation should be more intelligent, more capable and better at reasoning.
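For the curious, published scaling-law work (for example the "Chinchilla" paper, Hoffmann et al. 2022) turns that rule of thumb into a rough formula, where a model's loss (lower is better) falls predictably as its parameter count and training data grow:

```latex
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Here N is the number of model parameters, D the number of training tokens, E the irreducible loss, and A, B, α and β are constants fitted from experiments. The exact values for Gemini are not public, so treat this purely as an illustration of why bigger models trained on more data have, so far, kept getting better.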
It is unclear what Gemini 2's new features will be. When Gemini 1 was released, we saw multimodal abilities, including the ability to understand images and video. Google will likely expand on this and potentially include spatial data, giving the model knowledge of the world and real-world physics. We saw hints of this with Project Astra (Gemini Live + Lens).
I think it's more likely that we will see broad improvements in reasoning and reliability. We may also see some of these "thinking" capabilities opened up across the wider model family. The biggest change is likely to come in the form of agents.
Agents are capabilities that allow the model to perform tasks on its own, without relying on human input beyond the initial prompt. For example, you could tell Gemini to book your flight to Paris within certain parameters, and it would go away, handle the booking and simply send you the tickets.
Powering agents would require the model to think through a problem before taking action, similar to OpenAI's o1, so that is likely to be another new capability. This kind of deliberate reasoning allows for more detailed responses as well as improved accuracy. I also suspect Google will improve search and live data access as it comes under increasing competition from OpenAI.
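To make the agent idea concrete, here is a minimal, hypothetical sketch of the plan-act-observe loop such a feature would need. Everything in it is invented for illustration: the fake_planner function, the search_flights and book_flight tools and the flight data are stand-ins, not Google's or OpenAI's actual API.

```python
# Hypothetical sketch of an agent loop: the "model" plans a step, a tool
# carries it out, the result feeds back into the next decision, and the
# human only supplies the original request.
from dataclasses import dataclass

@dataclass
class Step:
    tool: str    # which tool the "model" wants to call next
    args: dict   # arguments for that tool

def fake_planner(goal: str, history: list) -> Step | None:
    """Stand-in for the model's reasoning step: decide the next action."""
    if not history:
        return Step("search_flights", {"to": "Paris", "max_price": 300})
    if history[-1][0] == "search_flights":
        cheapest = min(history[-1][1], key=lambda f: f["price"])
        return Step("book_flight", {"flight_id": cheapest["id"]})
    return None  # nothing left to do

# Toy tools with hard-coded data, standing in for real flight APIs.
TOOLS = {
    "search_flights": lambda to, max_price: [
        f for f in [{"id": "AF123", "price": 250}, {"id": "BA456", "price": 290}]
        if f["price"] <= max_price
    ],
    "book_flight": lambda flight_id: f"Ticket confirmed for {flight_id}",
}

def run_agent(goal: str) -> str:
    history = []
    while (step := fake_planner(goal, history)) is not None:
        result = TOOLS[step.tool](**step.args)   # act autonomously, no human input
        history.append((step.tool, result))
    return history[-1][1]                        # e.g. the ticket confirmation

print(run_agent("Book me a flight to Paris under $300"))
```

The key design point is the loop itself: the model decides the next step, a tool executes it, and the observation shapes the following decision, which is why agents depend so heavily on the model being able to reason reliably before acting.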