Ever since I first saw the “Welcome home, sir” scene in Iron Man 2, I’ve wanted a smart setup with a Jarvis-like assistant. I hoped Alexa would provide that kind of functionality, but so far, the assistant is just too limited. That might change with the launch of Gemini 2.0 and Google’s Project Jarvis.
In a sense, this new project is Jarvis. The system works by taking stills of your screen and interpreting the information on it, including the text, images, and even sound. It can auto-fill forms or press buttons for you, too. This project was first hinted at during Google I/O 2024, and according to 9to5Google, it is designed to automate web-based tasks. “Jarvis” is an AI agent with a narrower focus than a large language model like ChatGPT — an AI that demonstrates human-like powers of reasoning, planning, and memory.
Imagine if you could ask Jarvis to research the cheapest flight for an upcoming trip or keep an eye out for online listings of a vintage game console you’d like to buy. This project has the potential to simplify everyday tasks and take the tedium out of many online chores.
The system is still in the early stages of development, though. The predominant theory is that Jarvis will be powered by Gemini 2.0, and it could release as early as December of this year for testing. Jarvis would act as an example of what Gemini is capable of, rather than a full feature. With talk that Jarvis will become available for early testers, a public launch doesn’t seem likely anytime soon.
That’s further reinforced by the news that Jarvis doesn’t always process information quickly, implying that it will need the cloud for some time yet before it’s able to operate at regular speed on-device. With a new or upgraded AI model every time we turn around, it’s easy to see why Google is stepping up its game. The company needs to be able to compete with OpenAI and GPT-5.