GPT-5.5 Launch: A New Agentic AI for Real‑World Work
OpenAI’s GPT‑5.5, now available via API, claims agentic capabilities that let it autonomously plan, execute, and verify complex programming, knowledge‑work, and scientific tasks. OpenAI says it matches GPT‑5.4’s latency while delivering higher benchmark scores, stronger security controls, and a new tiered pricing model.
Preface
Updated 2026‑04‑24. GPT‑5.5 and GPT‑5.5 Pro are now available via the API with additional safety controls.
The model improves goal understanding, autonomous task planning, tool use, result verification, and uncertainty handling, reducing the need for step‑by‑step prompting.
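The plan‑execute‑verify loop described above can be sketched as a simple control flow. This is an illustrative sketch only, not OpenAI’s implementation: the `plan`, `execute`, and `verify` functions are hypothetical stand‑ins for model‑driven task decomposition, tool use, and result checking.

```python
# Illustrative plan-execute-verify agent loop. All function names here
# are hypothetical stand-ins, not part of any OpenAI API.

def plan(goal: str) -> list[str]:
    # A real agent would ask the model to decompose the goal;
    # here we return a fixed plan for demonstration.
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]

def execute(step: str) -> str:
    # Stand-in for tool use (shell commands, browsing, code edits, ...).
    return f"done: {step}"

def verify(result: str) -> bool:
    # Stand-in for verification; a real agent might rerun tests
    # or ask the model to critique its own output.
    return result.startswith("done:")

def run_agent(goal: str, max_retries: int = 2) -> list[str]:
    results = []
    for step in plan(goal):
        for _attempt in range(max_retries + 1):
            result = execute(step)
            if verify(result):
                results.append(result)
                break
        else:
            raise RuntimeError(f"step failed after retries: {step}")
    return results
```

The point of the loop is that verification gates each step, so failures trigger a bounded retry instead of requiring the user to re‑prompt step by step.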
OpenAI evaluated GPT‑5.5 with internal and external red‑team testing and feedback from roughly 200 trusted early users.
Model capabilities
Agentic programming
GPT‑5.5 achieves state‑of‑the‑art results on three programming benchmarks while using fewer tokens than GPT‑5.4.
Terminal‑Bench 2.0: 82.7% (GPT‑5.5) vs. 75.1% (GPT‑5.4)
SWE‑Bench Pro: 58.6% (GPT‑5.5), the highest reported score
Expert‑SWE (internal): 73.1% (GPT‑5.5) vs. 68.5% (GPT‑5.4)
Across these tests GPT‑5.5 uses fewer tokens and requires fewer retries than GPT‑5.4.
In Codex, the model can perform end‑to‑end engineering tasks such as implementation, refactoring, debugging, testing, and verification. Early testers observed stronger system‑structure understanding, better error diagnosis, and the ability to propagate changes across large codebases.
Knowledge work
GPT‑5.5’s programming strengths extend to office workflows. It more reliably interprets intent, retrieves information, extracts key points, invokes tools, verifies results, and produces final deliverables.
In Codex it outperforms GPT‑5.4 on document, spreadsheet, and presentation generation, and excels at converting chaotic inputs into structured plans. Integrated computer‑operation abilities enable screen content recognition, clicking, typing, and switching between tools.
Internal adoption data: over 85 % of OpenAI staff use Codex weekly, cutting tax‑document processing time by two weeks and saving 5‑10 hours per week on report generation.
Scientific research
GPT‑5.5 leads on GeneBench and BixBench and has contributed to new mathematical proofs (Ramsey numbers) verified with Lean.
Researchers use the model as a “research partner” for paper review, design analysis, and multi‑turn reasoning, converting expert ideas into tools and results.
Inference efficiency
To retain GPT‑5.4’s token latency while boosting performance, OpenAI redesigned the inference stack and worked closely with NVIDIA on hardware optimization. Codex and GPT‑5.5 themselves participate in traffic analysis and load‑balancing improvements, raising generation speed by more than 20%.
Security enhancements
GPT‑5.5 adds stricter safety controls, extensive red‑team testing of advanced cyber capabilities, and a “trusted‑access” mechanism for legitimate defensive use. The model is classified at the “high‑capability” risk level but remains below the “critical” threshold.
Availability and pricing
GPT‑5.5 is available in ChatGPT and Codex for Plus, Pro, Business, and Enterprise tiers. GPT‑5.5 Pro targets Pro, Business, and Enterprise users.
API pricing:
Input: $5 per million tokens
Output: $30 per million tokens
GPT‑5.5 Pro pricing:
Input: $30 per million tokens
Output: $180 per million tokens
Despite higher per‑token rates, the efficiency gains make overall cost favorable.
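The per‑million‑token rates above make a back‑of‑envelope cost comparison straightforward. The model identifiers in this sketch are illustrative labels, not confirmed API model names.

```python
# Cost comparison using the per-million-token rates listed above.
# The model keys are illustrative labels, not official API identifiers.
RATES = {
    "gpt-5.5":     {"input": 5.0,  "output": 30.0},   # USD per 1M tokens
    "gpt-5.5-pro": {"input": 30.0, "output": 180.0},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# Example: a large agentic task with 200k input and 50k output tokens.
base = request_cost("gpt-5.5", 200_000, 50_000)      # $2.50
pro  = request_cost("gpt-5.5-pro", 200_000, 50_000)  # $15.00
```

At these rates GPT‑5.5 Pro is 6x the cost of GPT‑5.5 for the same token mix, so whether Pro is economical depends on whether its accuracy gains reduce total tokens and retries enough to offset the multiplier.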