How Ring-1T Achieves Trillion-Scale Deep Thinking and Competitive Benchmarks
Ring-1T is a trillion-parameter thinking model released fully open source. It combines large-scale reinforcement learning, extensive benchmark evaluation, and a custom training stack to deliver balanced performance across math, code, reasoning, and creative tasks; this post covers the release, results, training methods, current limitations, and future plans.
Ring-1T Model Release
Today we officially release Ring-1T, our trillion-parameter thinking model, open source at launch. Developers can download the weights from Hugging Face or ModelScope, and try the model via chat on Ling Chat or through the ZenMux API.
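For reference, here is a minimal sketch of loading the weights with Hugging Face transformers. The repo id "inclusionAI/Ring-1T" and the generation settings are illustrative assumptions; check the official model card for exact usage.

```python
# Minimal sketch: loading Ring-1T with Hugging Face transformers.
# The repo id and all settings below are assumptions for illustration,
# not official instructions; see the model card for the canonical recipe.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ring-1T"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # let the checkpoint decide the dtype
    device_map="auto",       # shard the 1T-parameter MoE across available GPUs
    trust_remote_code=True,  # MoE architectures often ship custom modeling code
)

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```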
Building on the preview version, we scaled up reinforcement learning with verifiable rewards (RLVR) and applied RLHF to strengthen general capabilities, yielding more balanced performance across tasks.
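The "verifiable" part of RLVR means rewards come from programmatic checkers rather than a learned reward model. Below is a minimal sketch of the idea; the function name and reward scheme are our own illustration, not Ring-1T's actual reward pipeline.

```python
import re

def math_reward(completion: str, reference_answer: str) -> float:
    """Binary verifiable reward: 1.0 if the final boxed answer matches, else 0.0.

    Illustrative checker only: production RLVR verifiers typically normalize
    expressions symbolically and cover code tasks via unit-test execution.
    """
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0

# Usage: score sampled rollouts, then feed the rewards to the RL optimizer.
print(math_reward(r"... hence the answer is \boxed{42}.", "42"))  # 1.0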
Ring-1T follows the Ling 2.0 architecture and is trained from the Ling-1T-base foundation model, with 1T total parameters, 50B activated per token, and a context window of up to 128K tokens. Using our self-developed stable RL training method icepop and high-efficiency RL system ASystem (whose AReaL framework is open source), we achieved stable RL scaling of MoE models from the hundred-billion to the trillion-parameter level, substantially strengthening deep reasoning.
Evaluation on High‑Difficulty Benchmarks
We compared Ring-1T with open-source models (Ring-1T-preview, DeepSeek-V3.1-Terminus-Thinking, Qwen3-235B-A22B-Thinking-2507) and closed-source APIs (Gemini-2.5-Pro, GPT-5-Thinking). Ring-1T leads on math competitions (AIME 25, HMMT 25), code generation (LiveCodeBench, Codeforces), and logical reasoning (ARC-AGI-1), and shows strong results on comprehensive tasks such as Arena-Hard v2.0, HealthBench, and Creative Writing v3.
Tested on IMO 2025 via the multi-agent framework AWorld, Ring-1T solved four problems on its first attempt, a silver-medal-level result, and achieved nearly full marks on a geometry problem, missing only the hardest problem. At the ICPC World Finals 2025, Ring-1T solved five of six problems, outperforming GPT-5-Thinking and Gemini-2.5-Pro.
icepop: Stabilizing Long‑Cycle RL Training
In RL training of MoE models, the discrepancy between training-engine and inference-engine probabilities grows as sequences get longer, destabilizing long-horizon training. icepop introduces a masked bidirectional truncation technique that aligns the two distributions, narrowing the training-inference gap and keeping long-cycle RL stable.
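Conceptually, the correction can be sketched as a per-token mask on the probability ratio between the training and inference engines: tokens whose ratio drifts outside a two-sided band are dropped from the gradient. The sketch below is our reading of "masked bidirectional truncation" with assumed bounds, not the released implementation.

```python
import torch

def icepop_mask(
    train_logprobs: torch.Tensor,  # per-token log-probs from the training engine
    infer_logprobs: torch.Tensor,  # per-token log-probs recorded at rollout time
    low: float = 0.5,              # assumed lower truncation bound on the ratio
    high: float = 2.0,             # assumed upper truncation bound on the ratio
) -> torch.Tensor:
    """Masked bidirectional truncation (illustrative sketch).

    Tokens where the train/inference probability ratio leaves the
    [low, high] band are masked out of the policy-gradient loss, so the
    training-inference discrepancy cannot compound over long sequences.
    """
    ratio = torch.exp(train_logprobs - infer_logprobs)
    return (ratio >= low) & (ratio <= high)

def masked_pg_loss(train_logprobs, infer_logprobs, advantages):
    # Zero out contributions from masked tokens, then average over kept ones.
    mask = icepop_mask(train_logprobs, infer_logprobs).float()
    per_token = -train_logprobs * advantages * mask
    return per_token.sum() / mask.sum().clamp(min=1.0)
```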
ASystem: Self‑Developed RL Framework for Trillion‑Scale Training
ASystem adopts a SingleController + SPMD architecture. It combines a unified memory pool with transparent offloading, zero-redundancy weight exchange between training and inference, and a serverless sandbox that starts in milliseconds and sustains over 10K requests per second. Its AReaL component is open source.
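As a rough illustration of the SingleController + SPMD pattern (our own sketch, not ASystem code): one controller process owns the training schedule and broadcasts the same command to a group of identical workers, each executing it on its own shard.

```python
from dataclasses import dataclass

@dataclass
class Command:
    op: str    # e.g. "rollout", "train_step", "sync_weights"
    step: int

class SPMDWorker:
    """One of N identical workers; each runs the same program on its own shard."""
    def __init__(self, rank: int):
        self.rank = rank

    def execute(self, cmd: Command) -> str:
        # In a real system this would launch rollouts, run an optimizer step,
        # or swap in fresh policy weights via zero-redundancy exchange.
        return f"rank {self.rank}: {cmd.op} @ step {cmd.step} done"

class SingleController:
    """Central coordinator: decides *what* happens; workers decide *how*."""
    def __init__(self, workers):
        self.workers = workers

    def dispatch(self, cmd: Command):
        # Broadcast one command; all ranks run it in lockstep (SPMD).
        return [w.execute(cmd) for w in self.workers]

controller = SingleController([SPMDWorker(r) for r in range(4)])
for line in controller.dispatch(Command(op="sync_weights", step=1)):
    print(line)
```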
Handcrafted Demo Cases
Ring-1T also performs well on visual and front-end generation tasks, with handcrafted demos including ball-motion physics, a solar-system simulation, fireworks, a 3D building-demolition scene, a memory-match game, and the classic farmer-wolf-goat-cabbage puzzle.
Limitations and Future Plans
Known issues include occasional identity-cognition bias (the model may misstate what it is), language mixing, and repetitive generation, as well as suboptimal inference efficiency on long contexts because Ling 2.0 retains a GQA attention scheme. Continued training will address these issues, and we welcome community feedback.
Visit our open‑source repositories and demo pages to download and try Ring‑1T.