What’s New in GPT‑5.5? Codex Gains Browser, Office, and Computer Automation

OpenAI released GPT‑5.5 at 2 a.m., boosting Codex with real browser control, higher‑quality Office/Drive document generation, stronger computer‑use abilities, improved token efficiency, and benchmark gains over GPT‑5.4 and Claude Opus, while detailing pricing and API access.

Node.js Tech Stack

What is GPT‑5.5?

OpenAI describes GPT‑5.5 as "a new class of intelligence for real work." Beyond the marketing language, there are two core changes: stronger autonomous task execution, and better efficiency, with per‑token latency unchanged while the model finishes tasks with fewer tokens.

Codex Upgrade 1: Real Browser Use

"We've expanded browser use so Codex can interact with web apps, and test flows, click through pages, capture screenshots, and iterate on what it sees until it completes the task."

Codex can now open a browser, click buttons, fill forms, capture screenshots, and iteratively refine its actions until a task is finished. The demo shows a "Home Setup" onboarding flow where Codex simulates a user, checks each step’s copy and interaction, and patches missing parts.

For front‑end teams this enables two new agent capabilities:

UI regression testing: automatically run key flows after each release and compare screenshots.

Data/form smoke testing: execute login, registration, ordering, and other click‑driven workflows.

Previously, these checks required hand‑written Playwright or Puppeteer scripts; Codex now drives the browser itself, acting on what it sees on the page.
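For comparison, the scripted approach might look like the sketch below. It is illustrative only: the URL, selectors, and file name are hypothetical, and the byte‑level hash comparison is a deliberately crude stand‑in for a real visual diff, which would normally tolerate small rendering differences. The Playwright import is deferred so the comparison helper can be used without Playwright installed.

```python
import hashlib
from pathlib import Path


def capture_signup_flow(url: str, out_dir: Path) -> Path:
    """Drive a hypothetical signup form and save a screenshot.

    Requires `pip install playwright` and `playwright install chromium`.
    The selectors below are placeholders, not taken from any real app.
    """
    from playwright.sync_api import sync_playwright  # deferred import

    out_dir.mkdir(parents=True, exist_ok=True)
    shot = out_dir / "after_submit.png"
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.fill("#email", "user@example.com")   # hypothetical selector
        page.click("button[type=submit]")         # hypothetical selector
        page.screenshot(path=str(shot))
        browser.close()
    return shot


def screenshot_changed(baseline: bytes, current: bytes) -> bool:
    """Return True when two screenshots differ at the byte level."""
    return hashlib.sha256(baseline).digest() != hashlib.sha256(current).digest()
```

Every selector in a script like this has to be written and maintained by hand, which is the maintenance cost an agent that reads the rendered page is meant to reduce.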

Codex Upgrade 2: Office/Drive Document Generation

"Codex now generates higher‑quality spreadsheets, slide decks, and documents with GPT‑5.5 in Microsoft Office and Google Drive. A new file viewer in the app makes it faster to review, revise, and iterate, so files are ready to share sooner."

The two main changes are:

First‑class Office and Google Drive support: Codex directly produces usable .xlsx, .pptx, and .docx files instead of rough drafts.

In‑app file viewer: users can preview, edit, and iterate on generated files without leaving the Codex interface.

The demo creates a waterfall financial model: given a template Excel and a list of investment terms, Codex outputs a complete, formula‑correct model.

Developers building AI‑generated reports, financial models, or audit documents will find this upgrade especially valuable.

Codex Upgrade 3: Stronger Computer Use

"With GPT‑5.5, Codex is stronger at using apps on your computer, from seeing what's on screen to clicking, typing, navigating, and moving context across tools."

On the OSWorld‑Verified benchmark, GPT‑5.5 scores 78.7%, up from 75.0% for GPT‑5.4 and roughly level with Claude Opus 4.7 (78.0%). The key benefit is cross‑tool context transfer: for example, opening Chrome and Notes, extracting release‑list updates, and recording them in Notes automatically.

In the short term, developers can chain everyday actions across Chrome, VS Code, terminals, and instant‑messaging apps with a single prompt.

Programming Benchmarks

Three coding‑related evaluations highlight GPT‑5.5’s strengths:

Terminal‑Bench 2.0 (tool coordination): 82.7% (GPT‑5.5) vs 75.1% (GPT‑5.4) vs 69.4% (Claude).

Expert‑SWE (20‑hour internal task): 73.1% (GPT‑5.5) vs 68.5% (GPT‑5.4); Claude not evaluated.

SWE‑Bench Pro (public): 58.6% (GPT‑5.5) vs 57.7% (GPT‑5.4) vs 64.3% (Claude), with a note that the GPT‑5.5 result may be inflated by memorization.

Claude still leads on SWE‑Bench Pro by about 6 points, but GPT‑5.5 outperforms Claude by about 13 points on Terminal‑Bench and gains 4.6 points over GPT‑5.4 on Expert‑SWE.
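The gaps can be checked directly from the reported scores. The short sketch below only restates the figures listed in this section; the dictionary keys are informal labels chosen for illustration, not official model identifiers.

```python
from typing import Optional

# Scores in percent, copied from the benchmark list in this section.
# None means the model was not evaluated on that benchmark.
SCORES = {
    "Terminal-Bench 2.0": {"gpt-5.5": 82.7, "gpt-5.4": 75.1, "claude": 69.4},
    "Expert-SWE": {"gpt-5.5": 73.1, "gpt-5.4": 68.5, "claude": None},
    "SWE-Bench Pro": {"gpt-5.5": 58.6, "gpt-5.4": 57.7, "claude": 64.3},
}


def gap(benchmark: str, other: str) -> Optional[float]:
    """Percentage-point lead of GPT-5.5 over `other` (negative = behind)."""
    row = SCORES[benchmark]
    if row[other] is None:
        return None
    return round(row["gpt-5.5"] - row[other], 1)


for name in SCORES:
    print(name, "vs GPT-5.4:", gap(name, "gpt-5.4"), "| vs Claude:", gap(name, "claude"))
```

A positive value means GPT‑5.5 leads; the SWE‑Bench Pro gap against Claude comes out negative.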

Early User Feedback

Cursor CEO Michael Truell: "GPT‑5.5 is noticeably smarter and more resilient, with stronger coding and more reliable tool use. Most importantly it persists on tasks longer without stopping early."

Every founder Dan Shipper: "It’s the first coding model with serious conceptual clarity. A senior engineer’s multi‑day bug‑fix was reproduced by GPT‑5.5 in a single attempt."

MagicPath CEO Pietro Schirano: "Two heavily modified branches with hundreds of commits were merged back to main in 20 minutes by GPT‑5.5."

Token Efficiency

OpenAI claims GPT‑5.5 completes the same Codex tasks with significantly fewer tokens, meaning monthly costs may not rise despite higher model capability.

A technical detail: Codex and GPT‑5.5 contributed to optimizing their own inference stack. By analyzing weeks of production traffic, they wrote a custom heuristic that speeds token generation by over 20%.
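To put the 20% figure in perspective: if it is read as a 20% increase in tokens generated per second (an assumption, since the claim is not defined more precisely here), the wall‑clock time for a fixed‑length output falls by about 17%, not 20%. A minimal sketch of that arithmetic:

```python
def time_saved_fraction(throughput_gain: float) -> float:
    """Fraction of generation time saved for a fixed number of tokens.

    `throughput_gain` is the relative increase in tokens per second,
    e.g. 0.20 for "20% faster". Time scales as 1 / throughput.
    """
    if throughput_gain <= -1.0:
        raise ValueError("throughput_gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + throughput_gain)


print(f"{time_saved_fraction(0.20):.1%}")
```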

API Pricing and Access

gpt‑5.5: $5 per 1M input tokens, $30 per 1M output tokens, 1M‑token context window.

gpt‑5.5‑pro: $30 per 1M input tokens, $180 per 1M output tokens.

Batch and Flex modes cost half the standard rate; Priority mode costs 2.5×.
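These rates translate into a simple cost estimate. The sketch below is a back‑of‑the‑envelope calculator, not an official billing formula: it assumes the mode multipliers apply uniformly to input and output tokens on both models, and the model keys are written as plain strings for illustration, since the API had not yet been released.

```python
# USD per 1M tokens as (input, output), taken from the rates listed above.
RATES = {
    "gpt-5.5": (5.0, 30.0),
    "gpt-5.5-pro": (30.0, 180.0),
}

# Multipliers relative to the standard rate (assumed to apply uniformly).
MODE_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.5}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  mode: str = "standard") -> float:
    """Estimate the cost in USD of one job under the listed rates."""
    in_rate, out_rate = RATES[model]
    base = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    return base * MODE_MULTIPLIER[mode]


for mode in MODE_MULTIPLIER:
    print(mode, estimate_cost("gpt-5.5", 200_000, 50_000, mode))
```

Under these assumptions, a job with 200,000 input tokens and 50,000 output tokens on gpt‑5.5 comes to $2.50 at the standard rate, $1.25 in Batch or Flex mode, and $6.25 in Priority mode.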

Within Codex, the context window is 400K tokens. Fast mode generates 1.5× faster at 2.5× the cost. All subscription tiers (Plus, Pro, Business, Enterprise, Edu, Go) can use the model. GPT‑5.5 is already available in ChatGPT and Codex; OpenAI says API access will follow "very soon."

Overall Assessment

GPT‑5.5 pushes the limits of agents, long‑range coding, computer interaction, and advanced mathematics, while Claude Opus 4.7 retains strengths in PR fixes, MCP toolchains, and long‑text understanding. For developers the two models now form a complementary toolbox rather than a strict choice.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
