OpenAI CEO Sam Altman kicked off this year by saying in a blog post that 2025 would be big for AI agents, tools that can automate tasks and take actions on your behalf.
Now, we’re seeing OpenAI’s first real attempt at one.
OpenAI announced on Thursday that it is launching a research preview of Operator, a general-purpose AI agent that can take control of a web browser and independently perform certain actions.
Operator is coming first to U.S. users on ChatGPT’s $200-per-month Pro subscription plan. OpenAI says it plans to eventually roll the feature out to more users in its Plus, Team, and Enterprise tiers.
This initial research preview is available through operator.chatgpt.com, but soon, OpenAI says it wants to integrate Operator into ChatGPT.
The new Operator feature promises to automate tasks such as booking travel accommodations, making restaurant reservations, or shopping online, according to OpenAI. There are several task categories users can choose from within Operator – including shopping, delivery, dining, and travel – all of which enable different kinds of automation.
When ChatGPT users activate the Operator agent, a small window will pop up showing a dedicated web browser that the agent uses, along with text explaining the actions the agent is taking. Users can still take control of their screen while Operator is working.
OpenAI says that Operator is powered by a computer-using agent, or CUA, a model that combines the vision capabilities of the company’s GPT-4o model with the reasoning abilities of OpenAI’s more advanced models. The CUA is trained to interact with the front end of websites, meaning it doesn’t need to use developer-facing APIs to tap into different services.
In other words, the CUA can use buttons, navigate menus, and fill out forms on a webpage — much like a human would.
“The CUA model is trained to ask for user confirmation before finalizing tasks with external side effects, for example before submitting an order, sending an email, etc., so that the user can double-check the model’s work before it becomes permanent,” OpenAI writes in materials provided to TechCrunch. “[It] has already proven useful in a variety of cases, and we aim to extend that reliability across a wider range of tasks.”
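The confirmation step OpenAI describes can be pictured as a simple gate in front of any action with external side effects. The following is a minimal sketch of that idea only, not OpenAI's implementation: the `Action` type, the `SIDE_EFFECT_ACTIONS` set, and the `run_action` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action kinds that change something outside the browser session.
# These labels are assumptions for illustration, not OpenAI's taxonomy.
SIDE_EFFECT_ACTIONS = {"submit_order", "send_email"}


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "submit_order"
    description: str  # human-readable summary shown to the user


def run_action(
    action: Action,
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
) -> str:
    """Execute an action, asking the user first if it has external side effects."""
    if action.kind in SIDE_EFFECT_ACTIONS and not confirm(action):
        return "skipped"  # user declined, so nothing permanent happens
    execute(action)
    return "done"
```

In this sketch, routine steps such as clicking a menu run without interruption, while an order submission runs only if the `confirm` callback returns `True`.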
OpenAI says it’s collaborating with companies like DoorDash, Instacart, Priceline, StubHub, and Uber to ensure Operator respects the norms of these businesses.
But OpenAI warns that CUA isn’t perfect. The company says it “[doesn’t] expect CUA to perform reliably in all scenarios just yet.”
Out of an abundance of caution, OpenAI is also requiring supervision for some tasks, like banking transactions, that CUA and Operator might be able to perform entirely on their own.
“On particularly sensitive websites, such as email, Operator requires active user supervision, ensuring users can directly catch and address any potential mistakes the model might make,” OpenAI says in its materials.
Operator appears to be OpenAI’s boldest attempt yet at creating an AI agent. Last week, OpenAI released Tasks, giving ChatGPT simple automation features such as the ability to set reminders and schedule prompts to run at a set time every day. Tasks gave ChatGPT users some familiar, but necessary, features to make ChatGPT as practical to use as Siri or Alexa. However, Operator shows off capabilities that the previous generation of virtual assistants never had.
The rise of AI agents has been promised as the next “ChatGPT moment”: a new technology that will change how we use the internet. Instead of just delivering and processing information, agents are supposed to actually take actions on a user’s behalf. As OpenAI releases its first real version, we may be starting to see if that vision comes true.
Maxwell Zeff is a senior reporter at TechCrunch specializing in AI and emerging technologies. Previously with Gizmodo, Bloomberg, and MSNBC, Zeff has covered the rise of AI and the Silicon Valley Bank crisis. He is based in San Francisco. When not reporting, he can be found hiking, biking, and exploring the Bay Area’s food scene.