Enabling Search Agents to Think While Waiting: Diffusion LLMs Deliver 15% Faster Inference Without Accuracy Loss

The paper introduces DLLM‑Searcher, which pairs diffusion large language models with a two‑stage training pipeline and a P‑ReAct inference scheme. The model can issue a tool call and keep reasoning while it waits for the result, yielding a 14‑22% end‑to‑end speedup while matching or surpassing traditional autoregressive agents on multi‑hop QA benchmarks.

Multi-hop QA · P-ReAct · Parallel Reasoning

10 min read

Kuaishou Tech

Oct 11, 2025 · Artificial Intelligence

How KAT-Dev-72B-Exp Sets a New Record in Large‑Scale RL for Code Generation

KAT‑Dev‑72B‑Exp, an experimental reinforcement‑learning version of KAT‑Coder, reaches 74.6% accuracy on the SWE‑Bench Verified benchmark. The article introduces Trie Packing and entropy‑aware advantage scaling, and describes a decoupled training architecture that speeds up large‑scale agentic RL training.

AI · Code Generation · Reinforcement Learning

9 min read
