Tagged articles

Self‑Distillation

14 articles
Machine Learning Algorithms & Natural Language Processing
Jun 21, 2026 · Artificial Intelligence

xOPD Evolution: Mapping Recent OPD Improvements – Rephrased Same Problems vs. New Modules

This article surveys the latest on‑policy distillation (OPD) research, categorizing each work as either a reinterpretation of an existing problem or a modification of a different module, and highlights the experimental findings, design choices, and trade‑offs reported across the papers.

LLM · Model Efficiency · OPD
31 min read
Machine Learning Algorithms & Natural Language Processing
Jun 18, 2026 · Artificial Intelligence

From Imitation to Optimization: Recent Advances in On-Policy Distillation

This article surveys the latest research on On-Policy Distillation for large language models, covering methods that improve training stability, self‑distillation frameworks, and detailed analyses of when and why OPD succeeds or fails, with concrete experimental results and practical insights.

Entropy-Aware · Large Language Models · On‑Policy Distillation
19 min read
Machine Heart
May 18, 2026 · Artificial Intelligence

ICML 2026: From Single‑Threaded Thinking to Native Parallel Reasoning in Agents

The paper introduces Native Parallel Reasoner (NPR), a framework that lets language agents generate and maintain multiple reasoning paths using a three‑stage self‑distillation and parallel reinforcement‑learning training paradigm, achieving up to 4.6× speedup and significant accuracy gains across eight reasoning benchmarks.

AI reasoning · Large Language Models · Native Parallel Reasoner
18 min read
Kuaishou Tech
May 13, 2026 · Artificial Intelligence

OneSearch‑V2 Launches: Self‑Distilled Generative Search That Truly Understands Users

OneSearch‑V2 introduces a latent‑reasoning enhanced self‑distillation framework that augments query understanding with thought‑augmented CoT, aligns preferences via direct user behavior feedback, and achieves up to 4 % CTR lift and significant order growth without adding inference cost or latency.

LLM · Self‑Distillation · behavioral feedback
26 min read
Machine Heart
May 13, 2026 · Artificial Intelligence

Zero‑Cost Upgrade: OneSearch‑V2 Launches Generative Search, Boosting Buyers and Orders

OneSearch‑V2 introduces a zero‑cost generative search upgrade that leverages latent‑reasoning‑enhanced self‑distillation, thought‑augmented query understanding, and behavior‑feedback preference alignment, delivering offline HitRate gains of up to 2.68 % and online CTR, buyer and order increases of roughly 4 %, 2 % and 2 % respectively.

AI ranking · Self‑Distillation · behavioral feedback
24 min read
Machine Learning Algorithms & Natural Language Processing
Apr 29, 2026 · Artificial Intelligence

Dual Engine for Training and Inference: How Princeton’s SD‑ZERO and AggAgent Redefine Complex Reasoning

The article reviews two recent Princeton papers—SD‑ZERO, which introduces self‑revision training and on‑policy self‑distillation to turn a model’s own error traces into dense supervision, and AggAgent, which actively aggregates parallel long‑horizon trajectories—showing how internal trajectory mining can cut compute costs and boost accuracy on challenging math and code benchmarks.

AggAgent · Complex Reasoning · Large Language Models
10 min read
Machine Learning Algorithms & Natural Language Processing
Feb 22, 2026 · Artificial Intelligence

What Is On-Policy Distillation? A Deep Dive into On-Policy and Self-Distillation

The article explains On-Policy Distillation, derives its forward and reverse KL gradients, introduces Self‑Distillation where the policy serves as its own teacher, discusses practical implementation tricks such as extra‑knowledge injection, EMA or trust‑region teacher stabilization, and highlights benefits like reduced catastrophic forgetting, fewer Aha moments, and a narrower train‑test gap, especially for larger models.

Catastrophic Forgetting · EMA · KL divergence
6 min read
HyperAI Super Neural
Feb 6, 2026 · Artificial Intelligence

Latest Advances in AI Agents: PaperBanana, SDPO, Lumine, Idea2Story, and Insight Agents

This weekly roundup highlights five recent AI agent papers—PaperBanana for automated academic illustration, SDPO's self‑distillation reinforcement learning, Lumine's open‑world generalist agent, Idea2Story's pipeline for turning research ideas into narratives, and Insight Agents' fast e‑commerce insights—showcasing diverse breakthroughs in multi‑agent frameworks, self‑feedback learning, and real‑world deployment.

AI agents · Multi-Agent Systems · Self‑Distillation
8 min read
Alibaba Cloud Big Data AI Platform
Mar 19, 2024 · Artificial Intelligence

M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning

This paper introduces M2SD, a dual‑branch multiple‑mixing self‑distillation framework that expands feature space, mitigates overfitting and catastrophic forgetting, and achieves state‑of‑the‑art results on CIFAR‑100, CUB‑200 and miniImageNet for few‑shot class‑incremental learning.

M2SD · Self‑Distillation · class incremental learning
17 min read
Meituan Technology Team
Sep 15, 2022 · Artificial Intelligence

YOLOv6 2.0: Enhanced Object Detection Models and Quantization Solutions

The new YOLOv6 2.0 release upgrades lightweight and medium‑large models with a CSPStackRep backbone, self‑distillation, and a custom quantization pipeline, delivering up to 869 FPS for the quantized YOLOv6‑S and achieving 49.5%/52.5% AP on COCO while halving training time.

COCO benchmark · CSPStackRep · Quantization
6 min read
AntTech
Sep 9, 2022 · Artificial Intelligence

Ant Security Lab Wins Two Golds and One Silver at KDD Cup 2022 with Advanced Keyword Extraction and Self‑Distillation for Product Search

Ant Security Lab's algorithm engineer Lin Jinzheng secured two gold medals and one silver at the KDD Cup 2022, ranking first globally, by applying innovative keyword‑extraction and self‑distillation techniques to improve product search relevance and interactive risk‑control systems.

.ai · KDD Cup · Product Search
4 min read