GPT‑5.5 Instant Goes Free: Hallucinations Cut 52%, Math Scores Jump to 81%, and Personalized Memory Arrives

OpenAI has rolled out GPT‑5.5 Instant as the new default ChatGPT model, delivering 52.5% fewer hallucinations, a rise in math benchmark scores from 65% to 81%, 30% shorter replies, and a memory system that surfaces past context for personalized answers, all available for free to every user.

Top Architect

OpenAI announced the launch of GPT‑5.5 Instant, which immediately replaces GPT‑5.3 Instant as the default model in ChatGPT and is free for all users. The upgrade focuses on three improvements: more concise answers, stronger memory, and deeper personalization.

Performance gains show up on three benchmarks. On the AIME 2025 math test, accuracy rose from 65.4% to 81.2%; on the GPQA scientific reasoning benchmark, from 78.5% to 85.6%; and on the multimodal MMMU‑Pro test, from 69.2% to 76.0%.
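The headline figures are absolute scores, so it can help to separate the gain in percentage points from the relative gain. The sketch below does only that arithmetic on the numbers quoted above; the `gains` helper and the `scores` table are illustrative names, not part of any OpenAI tooling.

```python
# Benchmark accuracies quoted in the article, as (before, after) in percent.
scores = {
    "AIME 2025": (65.4, 81.2),
    "GPQA": (78.5, 85.6),
    "MMMU-Pro": (69.2, 76.0),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


if __name__ == "__main__":
    for name, (before, after) in scores.items():
        points, relative = gains(before, after)
        print(f"{name}: +{points} points, {relative}% relative improvement")
```

Read this way, the AIME 2025 jump is 15.8 percentage points, or roughly a 24% relative improvement over the earlier score.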

The most striking change is a 52.5% reduction in the hallucination rate in high‑risk domains such as medicine, law, and finance, compared with the 20% improvement delivered by the previous generation. In difficult dialogues, user‑reported “fact‑error” instances dropped by 37.3%. In one illustrative example, GPT‑5.5 Instant initially echoes a wrong answer, then detects the inconsistency, corrects the substitution error, and reaches the correct solution with the quadratic formula, whereas GPT‑5.3 Instant stopped after declaring “no real solution”.
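The article does not give the equation from that example, so the sketch below uses hypothetical coefficients to show how a single substitution error can turn a solvable quadratic into an apparent "no real solution", and how substituting the roots back catches mistakes. The function names are illustrative.

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> list[float]:
    """Return the real roots of a*x**2 + b*x + c = 0, sorted ascending."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # negative discriminant: no real roots
    root = math.sqrt(disc)
    # A set collapses the repeated root when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


def roots_check_out(a: float, b: float, c: float, roots: list[float],
                    tol: float = 1e-9) -> bool:
    """Substitute each root back into the equation as a self-check."""
    return all(abs(a * x * x + b * x + c) <= tol for x in roots)


if __name__ == "__main__":
    # Hypothetical equation: x^2 - 4x - 5 = 0.
    print(solve_quadratic(1, -4, -5))
    # Same equation with the sign of c flipped by mistake: the
    # discriminant becomes negative and the solver reports no real roots.
    print(solve_quadratic(1, -4, 5))
```

The point of the second call is that "no real solution" can be an artifact of the substitution rather than a property of the problem, which is why re-checking the inputs matters.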

In addition to accuracy, response length shrank dramatically: official data report a 30.2% reduction in token count and a 29.2% reduction in line count. For routine queries (e.g., how to tell a colleague to stop nagging) the new model provides a single, targeted suggestion instead of a list of five strategies and multiple cautions.

Personalization is enabled through the new “Memory Sources” feature, which can pull relevant past chats, uploaded files, or linked Gmail content to tailor replies. Users can view which memories were used and delete or edit outdated entries, and shared conversations do not expose private memory sources. This design follows a three‑step loop: the model remembers you, shows you what it remembered, and lets you decide what to forget.
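As a rough mental model of that remember, show, forget loop, here is a toy in-memory store. It is purely an assumption for illustration: the `Memory` and `MemoryStore` classes, the keyword-overlap matching, and the source labels are hypothetical and say nothing about how ChatGPT implements the feature.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: int
    source: str  # hypothetical labels such as "chat", "file", or "gmail"
    text: str


@dataclass
class MemoryStore:
    items: dict[int, Memory] = field(default_factory=dict)
    next_id: int = 1

    def remember(self, source: str, text: str) -> Memory:
        """Step 1: store a piece of context about the user."""
        memory = Memory(self.next_id, source, text)
        self.items[memory.id] = memory
        self.next_id += 1
        return memory

    def relevant(self, query: str) -> list[Memory]:
        """Step 2: show which memories would be used (naive keyword overlap)."""
        words = set(query.lower().split())
        return [m for m in self.items.values()
                if words & set(m.text.lower().split())]

    def forget(self, memory_id: int) -> bool:
        """Step 3: let the user delete a memory; True if something was removed."""
        return self.items.pop(memory_id, None) is not None


if __name__ == "__main__":
    store = MemoryStore()
    note = store.remember("chat", "prefers metric units")
    print([m.text for m in store.relevant("which units should I use")])
    store.forget(note.id)
    print(store.relevant("which units should I use"))
```

A real system would use far better retrieval than word overlap; the sketch only shows that each step in the loop maps to a distinct, user-visible operation.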

The upgrade also introduces a model identifier for the API: chat‑latest. OpenAI warns that the memory‑source view may not capture every factor influencing a response. It also recommends that developers begin regression testing early, because past transitions (e.g., the GPT‑4o retirement) caused production failures.
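One way to start that regression testing is a small harness that replays fixed prompts and checks each reply against a predicate. The sketch below is client-agnostic on purpose: `ask` is a hypothetical callable you would wire to your own API client, `fake_ask` is a stand-in so the example runs offline, and the identifier is written with an ASCII hyphen as `chat-latest`, which is an assumption about its exact spelling.

```python
from typing import Callable, NamedTuple


class Case(NamedTuple):
    prompt: str
    check: Callable[[str], bool]  # returns True when the reply is acceptable


def run_regression(ask: Callable[[str, str], str], model: str,
                   cases: list[Case]) -> list[str]:
    """Return the prompts whose replies fail their check for the given model."""
    failures = []
    for case in cases:
        reply = ask(model, case.prompt)
        if not case.check(reply):
            failures.append(case.prompt)
    return failures


def fake_ask(model: str, prompt: str) -> str:
    """Offline stand-in for a real API call; replace with your own client."""
    canned = {
        "What is 2 + 2?": "4",
        "Reply with the single word OK.": "OK",
    }
    return canned.get(prompt, "")


CASES = [
    Case("What is 2 + 2?", lambda reply: "4" in reply),
    Case("Reply with the single word OK.", lambda reply: reply.strip() == "OK"),
]

if __name__ == "__main__":
    failed = run_regression(fake_ask, "chat-latest", CASES)
    print("failures:", failed)
```

Because the new model is described as producing shorter replies, checks that depend on length or on a fixed number of list items are the ones most likely to need revisiting.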

Deployment is staged: free users receive GPT‑5.5 Instant immediately, while paid users can manually stay on GPT‑5.3 Instant for three months before it is retired. Enhanced personalization (memory, file, and Gmail integration) is currently limited to Plus and Pro web users, with a mobile rollout to follow. The change affects billions of ChatGPT users, turning incremental model improvements into a fundamental shift in everyday AI interaction.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
