GPT-5.5 Arrives: Faster, Stronger, Costlier – Nvidia Engineer Says Losing It Feels Like Amputation

OpenAI’s GPT‑5.5, co‑designed with Nvidia’s GB200/GB300 hardware, matches GPT‑5.4’s per‑token latency while using fewer tokens, beats Claude Opus 4.7 across coding, knowledge‑work, and math benchmarks, and even autonomously optimizes its own inference infrastructure for a 20 % speed gain.

DataFunTalk

GPT‑5.5 has just been released, officially described as a new type of intelligence aimed at real‑world work and autonomous agents.

The model was jointly designed with Nvidia’s GB200 and GB300 NVL72 systems, meaning training and deployment were coordinated between model and hardware from the start.

Compared with the previous GPT‑5.4, GPT‑5.5 shows clear gains in three domains: code generation, knowledge‑work tasks, and scientific research. In token‑efficiency tests it completes the same tasks with fewer tokens than GPT‑5.4 at the same per‑token latency, and with a smaller token budget than Claude Opus 4.7.

On the Terminal‑Bench 2.0 suite, GPT‑5.5 scores 82.7 %, surpassing GPT‑5.4’s 75.1 % and Claude Opus 4.7’s 69.4 %.

Programming capability has leapt forward: Codex powered by GPT‑5.5 can generate a complete 3D action game in the browser using TypeScript and Three.js, decomposing, executing, and checking the work itself without step‑by‑step prompting from the user.

Early tester Dan Shipper, a startup CEO and AI product developer, fed a buggy code segment to GPT‑5.5; the model produced the same fix a senior engineer had crafted, demonstrating what he calls “concept clarity.”

In knowledge‑work benchmarks, GPT‑5.5 attains 84.9 % on the GDPval test, a 4.6‑point lead over Claude Opus 4.7. OpenAI reports that over 85 % of its staff now use Codex daily.

Scientific evaluation shows GPT‑5.5 achieving 39.6 % on the FrontierMath Tier 4 benchmark, nearly double Claude Opus 4.7’s 22.9 %. A Polish assistant professor of mathematics used it to build an algebraic‑geometry visualization in 11 minutes, and an immunology professor generated a full gene‑expression analysis report for 62 samples (≈28 000 genes), work that would otherwise have taken a team months.
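For a sense of what such a gene‑expression report involves, here is a minimal sketch of a differential‑expression pass on a genes × samples matrix. The data is synthetic and the scale is reduced; SciPy’s Welch t‑test stands in for the full pipeline (normalization, multiple‑testing correction, annotation), which the article does not detail.

```python
# Toy differential-expression sketch: log2 fold change plus Welch t-test
# between two sample groups. Synthetic data; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes, n_a, n_b = 1000, 31, 31   # toy scale; the article cites ~28,000 genes, 62 samples
base = rng.lognormal(mean=5.0, sigma=1.0, size=(n_genes, 1))
group_a = base * rng.lognormal(0.0, 0.1, size=(n_genes, n_a))
group_b = base * rng.lognormal(0.0, 0.1, size=(n_genes, n_b))
group_b[:50] *= 4.0                # spike 50 genes so they are "differentially expressed"

# Per-gene effect size and significance across the two conditions.
log2_fc = np.log2(group_b.mean(axis=1) / group_a.mean(axis=1))
t, p = stats.ttest_ind(group_b, group_a, axis=1, equal_var=False)

hits = np.flatnonzero((np.abs(log2_fc) > 1.0) & (p < 0.01))
print(f"{hits.size} candidate genes, e.g. indices {hits[:5].tolist()}")
```

The spiked genes (indices below 50) dominate the hit list; in a real analysis a correction such as Benjamini–Hochberg would be applied to the p‑values before reporting.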

In pure mathematics, GPT‑5.5 discovered a new proof path for a Ramsey number problem, which was subsequently verified by the Lean formal‑verification system—an unprecedented AI contribution to core mathematics.
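Lean verification means the argument is reduced to steps a proof checker accepts mechanically. The Ramsey proof itself is not reproduced here; as a toy illustration, Lean 4 will accept a statement like the following only if every step type‑checks:

```lean
-- Toy Lean 4 example (not the Ramsey result): a pigeonhole-style fact,
-- discharged by the linear-arithmetic tactic `omega`.
theorem two_color_pigeonhole (a b : Nat) (h : a + b ≥ 2) :
    a ≥ 1 ∨ b ≥ 1 := by
  omega
```

A machine‑found proof path would be verified the same way: translated into Lean terms and rejected outright if any inference fails to check.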

Beyond model improvements, OpenAI rebuilt the entire inference stack, allowing the model to rewrite its own runtime code. This self‑optimization yielded more than a 20 % increase in token‑generation speed and produced an adaptive load‑balancing partition algorithm derived from weeks of production traffic data.
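OpenAI has not published the adaptive partition algorithm, so the following is only a stand‑in sketch of the underlying idea: assign work shards to replicas by projected load, with per‑shard costs playing the role of the traffic statistics mentioned above. The greedy longest‑processing‑time heuristic used here is an assumption, not the production method.

```python
# Hypothetical load-balancing partition sketch: heaviest shards first,
# each placed on the currently least-loaded replica (LPT heuristic).
import heapq

def partition(shard_costs: list[float], n_replicas: int) -> list[list[int]]:
    heap = [(0.0, r) for r in range(n_replicas)]        # (current load, replica id)
    heapq.heapify(heap)
    assignment: list[list[int]] = [[] for _ in range(n_replicas)]
    # Sort heaviest-first so large shards don't land on an already-full replica.
    for shard in sorted(range(len(shard_costs)), key=lambda i: -shard_costs[i]):
        load, r = heapq.heappop(heap)
        assignment[r].append(shard)
        heapq.heappush(heap, (load + shard_costs[shard], r))
    return assignment

costs = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]        # observed per-shard cost (illustrative)
groups = partition(costs, 3)
loads = [sum(costs[i] for i in g) for g in groups]
print(groups, loads)
```

An adaptive production system would re‑run such a partition as observed costs drift, which is consistent with the article’s description of an algorithm derived from weeks of traffic data.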

Chief scientist Jakub Pachocki, speaking on a press call, noted the rapid short‑term progress and predicted even faster model releases ahead, emphasizing that the system‑level redesign marks a step toward a new way of having computers do work.

Tags: large language models, OpenAI, NVIDIA, Mathematics, Codex, AI benchmarks, GPT-5.5
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
