How GLM-5 Advances AI with Bigger Scale, Sparse Attention, and Agent Capabilities

GLM-5, a new large language model with 744 B total parameters trained on 28.5 T tokens, introduces DeepSeek Sparse Attention and an asynchronous RL system called slime. It delivers strong benchmark gains on complex system engineering and long-horizon agent tasks, surpassing many open-source competitors.

Open Source Tech Hub

Model Overview

GLM‑5 is a large language model designed for complex system‑engineering and long‑horizon agent tasks. It expands the active parameter count from 32 B (GLM‑4.5) to 40 B and the total parameter count from 355 B to 744 B. The pre‑training corpus grows from 23 T to 28.5 T tokens. GLM‑5 incorporates DeepSeek Sparse Attention (DSA), which preserves long‑context capability while reducing memory and compute costs during deployment.
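The core idea behind sparse attention is that each query attends only to a small subset of keys instead of the full context, cutting memory and compute roughly in proportion to the sparsity. The sketch below illustrates this with simple top-k key selection for a single query; it is a hypothetical simplification for intuition only, not the actual DeepSeek Sparse Attention design.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=4):
    """Single-query attention restricted to the top-k keys by score.

    A minimal illustration of the sparse-attention idea: score all keys,
    keep only the k highest-scoring ones, and softmax over that subset.
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)        # raw attention scores, shape (n,)
    idx = np.argsort(scores)[-k:]      # indices of the k largest scores
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()                       # softmax over the selected keys only
    return w @ V[idx]                  # weighted sum of the selected values

rng = np.random.default_rng(0)
K = rng.standard_normal((16, 8))       # 16 keys of dimension 8
V = rng.standard_normal((16, 8))
q = rng.standard_normal(8)
out = topk_sparse_attention(q, K, V, k=4)
print(out.shape)  # (8,)
```

With k equal to the full sequence length this reduces to ordinary softmax attention; the savings come from choosing k much smaller than the context length.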

Asynchronous Reinforcement‑Learning Infrastructure (slime)

To close the gap between pre‑training ability and downstream performance, GLM‑5 uses slime, an asynchronous RL platform that increases training throughput and efficiency. slime enables finer‑grained post‑training iterations, allowing the model to achieve substantial gains over GLM‑4.7 on a wide range of academic benchmarks and to lead open‑source models in inference, coding, and agent tasks.
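The throughput benefit of an asynchronous setup comes from decoupling rollout generation from training: the trainer consumes trajectories from a buffer while workers keep generating, so neither side idles waiting for the other. The toy sketch below shows this producer/consumer pattern with a bounded queue; all names and the reward logic are hypothetical illustrations, not the slime API.

```python
import queue
import random
import threading

def rollout_worker(buf, n_episodes):
    """Producer: generates trajectories and hands them to the trainer."""
    rng = random.Random(0)
    for i in range(n_episodes):
        traj = {"id": i, "reward": rng.random()}  # stand-in for a sampled rollout
        buf.put(traj)                             # blocks only if the buffer is full
    buf.put(None)                                 # sentinel: generation finished

def trainer(buf):
    """Consumer: processes trajectories as they arrive."""
    seen, total = 0, 0.0
    while True:
        traj = buf.get()
        if traj is None:
            break
        seen += 1
        total += traj["reward"]                   # stand-in for a gradient update
    return seen, total / seen

buf = queue.Queue(maxsize=8)                      # bounded buffer decouples the two sides
worker = threading.Thread(target=rollout_worker, args=(buf, 32))
worker.start()
episodes, mean_reward = trainer(buf)
worker.join()
print(episodes)  # 32
```

The bounded buffer is the key design choice: it provides backpressure so generation cannot run arbitrarily far ahead of training, which in real async RL systems bounds how stale the training data can get.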

Benchmark Highlights

Evaluations on the internal CC‑Bench‑V2 suite show consistent improvements over GLM‑4.7 and narrow the gap with leading proprietary models such as Claude Opus 4.5. Selected scores, GLM‑5 vs GLM‑4.7 (higher is better):

Humanity’s Last Exam: 30.5 vs 24.8
Humanity’s Last Exam with Tools: 50.4 vs 42.8
AIME 2026 I: 92.7 vs 92.9
IMOAnswerBench: 82.5 vs 82.0
HMMT Nov 2025: 96.9 vs 93.5
SWE‑bench Verified: 77.8 vs 73.8
CyberGym: 43.2 vs 23.5

Long‑Term Planning Test – Vending Bench 2

The Vending Bench 2 benchmark simulates a year‑long vending‑machine business. GLM‑5 finishes with a final account balance of $4,432.12, closely matching Claude Opus 4.5 and demonstrating strong long‑term planning and resource‑management capabilities.

Open‑Source Release and Access

GLM‑5 is released under an MIT license on the following platforms:

Hugging Face: https://huggingface.co/zai-org/GLM-5
ModelScope: https://modelscope.cn/models/ZhipuAI/GLM-5

The model can also be accessed through the developer APIs at https://api.z.ai and https://bigmodel.cn. The repository for the slime infrastructure is available at https://github.com/THUDM/slime, and the Vending Bench 2 benchmark details are at https://andonlabs.com/evals/vending-bench-2.

Written by

Open Source Tech Hub

Sharing cutting-edge internet technologies and practical AI resources.
