OpenAI has begun previewing a new tool called Operator that can navigate within a web browser. According to a blog post published Thursday, the software is powered by what the company calls a Computer-Using Agent. “CUA is trained to interact with graphical user interfaces (GUIs) — the buttons, menus, and text fields people see on a screen — just as humans do,” says OpenAI of the model. “This gives it the flexibility to perform digital tasks without using OS- or web-specific APIs.”
The current release of Operator builds on OpenAI’s GPT-4o model. It combines the vision capabilities of that algorithm with “advanced reasoning” trained through reinforcement learning. Operator has the ability to “break tasks into multi-step plans and adaptively self-correct when challenges arise.” According to OpenAI, that capability represents the next stage in AI development.
As with past research previews, OpenAI warns that Operator is “still early and has limitations,” and that it won’t “perform reliably in all scenarios just yet.” For instance, depending on the complexity of the task and interface involved, the agent greatly benefits from the user taking a few extra moments to write a more detailed prompt. Per The Verge, Operator will give the user control if it ever gets stuck on a task. It will also hand control over whenever a website asks for sensitive information, including login credentials. The company says it designed the tool to “refuse harmful requests and block disallowed content.”
OpenAI is making Operator available first to users of its $200-per-month ChatGPT Pro subscription. It is also partnering with companies like Instacart to offer the agent on their platforms, though you’ll still need a ChatGPT Pro subscription to test those integrations.
Operator joins a growing list of AI agents that can navigate either a web browser or an entire operating system. Anthropic was the first to offer the capability with the release of its Claude 3.5 Sonnet model in October, followed more recently by Google with its Gemini 2.0 model and Project Mariner.