GPT-5.5 Arrives: Faster, Stronger, Costlier—Nvidia Engineer Says Losing Access Feels Like Amputation
Co‑designed with Nvidia hardware, GPT‑5.5 breaks the traditional scaling‑law trade‑off: it delivers higher intelligence while keeping per‑token latency similar to its predecessor, and it generates tokens more than 20 % faster. The model outperforms competitors across coding, knowledge‑work, and math benchmarks, and even produced new Ramsey‑number results that were verified in Lean.
Model positioning
GPT‑5.5 is released as a new kind of intelligence for real‑world work and autonomous agents.
Joint design with Nvidia
OpenAI and Nvidia co‑designed the model and the GB200/GB300 NVL72 accelerator systems, integrating hardware and software from training through deployment so that the model and accelerator evolve together. An email exchange between OpenAI leadership and Nvidia’s CEO confirmed the partnership.
Benchmark results
Artificial Analysis Intelligence Index shows GPT‑5.5 achieves the same score as Claude Opus 4.7 while consuming fewer tokens; with an equal token budget it completes more tasks.
Terminal‑Bench 2.0: GPT‑5.5 scores 82.7 % (GPT‑5.4 75.1 %, Claude Opus 4.7 69.4 %).
GDPval (knowledge‑work benchmark): GPT‑5.5 reaches 84.9 %, a 4.6‑point lead over Claude Opus 4.7.
FrontierMath Tier 4 (hard math benchmark): GPT‑5.5 Pro attains 39.6 % versus Claude Opus 4.7’s 22.9 %.
Ramsey‑number experiment: GPT‑5.5 discovered a new proof path that was later validated by the Lean theorem‑prover, representing an original contribution in pure mathematics.
Programming capabilities
GPT‑5.5 can autonomously decompose a coding request, execute the generated code, and verify the result, leaving the user only to review the output. By contrast, GPT‑5.4 required users to break tasks into explicit steps.
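The decompose‑execute‑verify loop described above can be sketched as follows. This is an illustrative sketch only; the `Step` structure, retry policy, and function names are assumptions, not OpenAI's actual agent interface.

```python
# Hypothetical sketch of a decompose-execute-verify loop: each step runs
# generated code and re-runs it if verification fails, so the user only
# reviews the final output.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    description: str
    run: Callable[[], object]        # executes the (model-generated) code
    check: Callable[[object], bool]  # verifies the result

def run_task(steps: list[Step], max_retries: int = 2) -> list[object]:
    """Execute each decomposed step, retrying any that fail verification."""
    results = []
    for step in steps:
        for _attempt in range(max_retries + 1):
            out = step.run()
            if step.check(out):
                results.append(out)
                break
        else:
            raise RuntimeError(f"step failed verification: {step.description}")
    return results

# Two trivial steps standing in for model-generated code.
steps = [
    Step("compute sum", lambda: sum(range(10)), lambda r: r == 45),
    Step("format report", lambda: "total=45", lambda r: r.startswith("total=")),
]
print(run_task(steps))
```

The point of the sketch is the division of labor: with GPT‑5.4, the user supplied the `steps` decomposition by hand; with GPT‑5.5, the model produces and checks them itself.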
OpenAI demonstrated a 3‑D action game generated under Codex with GPT‑5.5, built in TypeScript and Three.js and featuring combat systems, enemy encounters, HUD feedback, and AI‑generated textures.
Early tester Dan Shipper (a startup CEO) supplied a buggy code segment; GPT‑5.5 produced a fix on par with a top‑tier engineer's, something GPT‑5.4 could not do.
Knowledge‑work and scientific workflows
Beyond code, GPT‑5.5 (via Codex) generates documents, manipulates spreadsheets, and creates PowerPoint presentations, with reported higher intent understanding.
OpenAI reports that over 85 % of its employees use Codex daily for various tasks.
On GDPval, GPT‑5.5 scores 84.9 % versus 80.3 % for Claude Opus 4.7.
Immunology professor Derya Unutmaz used GPT‑5.5 Pro to analyze a 62‑sample, ~28 000‑gene expression dataset and produce a full research report, work that would have taken a research team months.
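For a sense of what a first pass over such a dataset involves, the toy sketch below computes per‑gene log2 fold change between two sample groups using only the standard library. Real expression pipelines are far more involved (normalization, dispersion modeling, multiple‑testing correction); the gene names and values here are made up for illustration.

```python
# Toy sketch: per-gene log2 fold change between two groups of samples.
# A pseudocount of 1.0 avoids taking the log of zero.
import math
from statistics import mean

def log2_fold_changes(expr: dict[str, list[float]],
                      group_a: list[int],
                      group_b: list[int]) -> dict[str, float]:
    """expr maps gene name -> per-sample expression values;
    group_a/group_b are sample indices for the two conditions."""
    out = {}
    for gene, values in expr.items():
        a = mean(values[i] for i in group_a) + 1.0
        b = mean(values[i] for i in group_b) + 1.0
        out[gene] = math.log2(a / b)
    return out

# Hypothetical data: one strongly shifted gene, one stable housekeeping gene.
expr = {"IL2": [5.0, 6.0, 50.0, 60.0], "ACTB": [100.0, 90.0, 95.0, 105.0]}
fc = log2_fold_changes(expr, group_a=[0, 1], group_b=[2, 3])
print(fc)
```

At 62 samples and ~28 000 genes this loop runs in well under a second; the months of human effort lie in interpreting the results, which is the part the article credits the model with drafting.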
Researchers employed GPT‑5.5 to iteratively edit papers, identify logical gaps, and propose new analysis plans, with the model retaining the entire research context across dialogue turns.
Self‑optimizing inference
OpenAI rebuilt the entire inference stack, allowing the model to rewrite its own runtime infrastructure. Token‑generation speed increased by more than 20 % while per‑token latency remained comparable to GPT‑5.4.
Codex analyzed weeks of production traffic and authored an adaptive partitioning algorithm that dynamically adjusts chunk sizes based on real‑time load, improving resource utilization.
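OpenAI has not published the partitioning algorithm, but a load‑responsive chunk sizer of the kind described might look like the minimal sketch below. The thresholds and halve/double update rule are assumptions for illustration.

```python
# Illustrative sketch of adaptive chunk sizing: shrink chunks under high
# load (favoring latency), grow them when the system is idle (favoring
# throughput), clamped to fixed bounds.
def adapt_chunk_size(current: int, load: float,
                     lo: int = 64, hi: int = 4096) -> int:
    """Return the next chunk size given current size and load in [0, 1]."""
    if load > 0.8:
        return max(lo, current // 2)
    if load < 0.3:
        return min(hi, current * 2)
    return current

size = 1024
for load in [0.9, 0.9, 0.2, 0.5]:
    size = adapt_chunk_size(size, load)
    print(load, size)
```

The production version would presumably derive its thresholds from the weeks of traffic analysis the article mentions rather than from fixed constants.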
Ramsey number contribution
In a Ramsey‑number experiment, GPT‑5.5 found a new proof path for an off‑diagonal Ramsey number problem, which was subsequently verified by the Lean theorem‑prover, marking a rare original contribution in combinatorial mathematics.
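The article does not say which off‑diagonal number was attacked, but work in this area typically builds on the classical Erdős–Szekeres recurrence, the kind of step a proof assistant such as Lean can check mechanically:

```latex
R(s,t) \;\le\; R(s-1,t) + R(s,t-1), \qquad s, t \ge 2,
```

with strict inequality when both terms on the right-hand side are even.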
Outlook
Chief Scientist Jakub Pachocki indicated that the rapid progress with GPT‑5.5 suggests a significant short‑term acceleration in model releases, contrasting with what he described as surprisingly slow advances in previous years.
