Hy3 Preview: First Post‑Rebuild Model with Dramatically Boosted Agent Capabilities

Tencent releases and open‑sources Hy3 preview, a 295‑billion‑parameter mixture‑of‑experts LLM supporting 256K context, built on rebuilt pre‑training and RL infrastructure and guided by three principles—systematic capability, authentic evaluation, and cost efficiency. The model delivers strong gains in complex reasoning, context learning, code, and agent tasks, and is already deployed across multiple Tencent products.

Tencent Cloud Developer

Model Overview

Hy3 preview is a mixture‑of‑experts (MoE) large language model with 295 B total parameters, 21 B activated parameters, and a maximum context length of 256 K tokens. It is the first model released after a complete reconstruction of the Hy series infrastructure.
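As a back‑of‑the‑envelope illustration of the sparsity these figures imply (a sketch, not an official specification), only about 7% of the parameters are active for any given token:

```python
# Sparsity implied by the published MoE figures:
# 295B total parameters, of which 21B are activated per token.
TOTAL_PARAMS = 295e9
ACTIVE_PARAMS = 21e9

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_fraction:.1%}")  # -> Active per token: 7.1%
```

This is the usual MoE trade-off: per‑token compute scales with the 21 B activated parameters, while total memory footprint scales with the full 295 B.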

Infrastructure Reconstruction and Design Principles

In February the pre‑training and reinforcement‑learning pipelines were rebuilt. The development team defined three guiding principles for practical usefulness:

Ability systematization: avoid over‑specialization by ensuring deep collaboration among reasoning, long‑text handling, instruction following, dialogue, code, and tool use.

Authentic evaluation: supplement public leaderboards with internal test sets, recent exams, human assessments, and product‑level crowdsourced testing.

Cost‑performance focus: co‑design the model architecture and inference framework to dramatically lower per‑task cost.

Benchmark Performance

Complex Reasoning

Hy3 preview achieves strong results on high‑difficulty scientific benchmarks, including FrontierScience Olympiad, IMO Answer Bench, Tsinghua University's spring math exam (2026), and the national high‑school biology competition (CHSBO 2025), demonstrating generalized reasoning strength.

Complex reasoning benchmark results

Context Learning and Instruction Following

The newly introduced CL‑bench and CL‑bench‑Life benchmarks evaluate the model's ability to handle noisy, long contexts and to obey complex, changing rules. Hy3 preview shows significant gains over previous generations.

Context learning benchmark

Code Generation and Agent Capabilities

The rebuilt training framework and larger RL task scale yield competitive results on major code‑agent benchmarks such as SWE‑Bench Verified and Terminal‑Bench 2.0, as well as the search‑agent benchmarks BrowseComp and WideSearch. Comprehensive agent evaluations (ClawEval, WildClawBench) confirm practical utility in complex agent workflows.

Code and agent benchmark results

Internal Evaluation Suites

Additional internal benchmarks – Hy‑Backend, Hy‑Vibe Bench, and the high‑difficulty software‑engineering suite Hy‑SWE Max – show strong competitiveness across backend engineering, user interaction, and challenging software‑development tasks.

Internal benchmark comparison

Performance Metrics in Production

Latency and success‑rate measurements on internal products report:

First‑token latency reduced by 54% and end‑to‑end latency reduced by 47%.

Success rate exceeding 99.99%.

Stable execution of agent workflows up to 495 steps.
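To put these figures in perspective, here is some illustrative arithmetic. It rests on assumptions the article does not state: that 99.99% is a per‑step success rate, that step failures are independent, and that the latency reduction applies to a fixed workload.

```python
# Illustrative math on the published production figures.
# Assumptions (not stated in the article): 99.99% is a per-step
# success rate, and steps fail independently.
per_step = 0.9999          # reported success rate, read as per-step
steps = 495                # longest reported agent workflow

workflow_success = per_step ** steps
print(f"{steps}-step end-to-end success: {workflow_success:.1%}")  # ~95.2%

# A 54% first-token latency reduction corresponds to roughly a 2.2x speedup.
speedup = 1 / (1 - 0.54)
print(f"First-token speedup factor: {speedup:.2f}x")
```

The point of the sketch: even a 99.99% per‑step rate compounds noticeably over hundreds of steps, which is why long agent workflows are a demanding reliability test.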

Open‑Source Release and Inference Support

Model weights and code are publicly released on the following platforms:

GitHub: https://github.com/Tencent-Hunyuan/Hy3-preview
Hugging Face: https://huggingface.co/tencent/Hy3-preview
ModelScope: https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview
GitCode: https://ai.gitcode.com/tencent_hunyuan/Hy3-preview

The model is compatible with major inference engines such as vLLM and SGLang. Architecture, operator, and quantization optimizations reduce inference cost compared with the previous generation.
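A minimal serving sketch, assuming the Hugging Face repo id listed above works directly with vLLM's standard `vllm serve` entry point; exact flags, parallelism settings, and the supported context length for the official release may differ, so consult the repo's README:

```shell
# Sketch: serve the open weights via vLLM's OpenAI-compatible server.
# (Repo id taken from the links above; hardware settings are examples.)
vllm serve tencent/Hy3-preview \
  --tensor-parallel-size 8 \
  --max-model-len 262144    # 256K context
```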

API Pricing

Tencent Cloud offers competitive API pricing and a customizable Token Plan; the personal tier starts at ¥28 per month.

Known Issues and Future Work

The team acknowledges remaining issues and invites community feedback to guide the upcoming official release. Ongoing efforts focus on scaling pre‑training and reinforcement‑learning data to further improve the model’s intelligence ceiling.

Tags: open-source, Large Language Model, benchmark, Tencent AI, agent capabilities, Hy3-preview
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
