Tagged articles

code reasoning

3 articles · Page 1 of 1

Aug 19, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO

Klear-Reasoner, built on Qwen3‑8B‑Base, introduces the Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm to overcome the limitations of conventional clipping. It achieves state‑of‑the‑art performance on AIME2024/2025 and LiveCodeBench, and the article provides detailed experimental analysis and data‑quality insights.

GPPO · Gradient Clipping · Large Language Models

0 likes · 11 min read

Kuaishou Tech

Aug 18, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO Optimization

The Klear‑Reasoner model, built on Qwen3‑8B‑Base and powered by the novel Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm, surpasses same‑size open‑source baselines on challenging math (AIME) and code (LiveCodeBench) benchmarks, while revealing key insights on data quality, reward design, and clipping strategies for large‑language‑model reasoning.

GPPO · LLM · code reasoning

0 likes · 11 min read

DataFunTalk

Feb 18, 2025 · Artificial Intelligence

CODEI/O: Leveraging Code to Train Large Language Models for Enhanced Reasoning

The DeepSeek team introduced CODEI/O, a large dataset that converts code into natural‑language reasoning chains. Using a two‑stage training strategy, they demonstrated that training large language models on this data markedly improves performance on diverse reasoning tasks, including non‑code domains.

CODEI/O · Large Language Models · code reasoning

0 likes · 8 min read
