Tagged articles

Self‑Distillation

14 articles · Page 1 of 1

Machine Learning Algorithms & Natural Language Processing

Jun 21, 2026 · Artificial Intelligence

xOPD Evolution: Mapping Recent OPD Improvements – Rephrased Same Problems vs. New Modules

This article surveys the latest on‑policy distillation (OPD) research, categorizing each work as either a reinterpretation of an existing problem or a modification of a different module, and highlights the experimental findings, design choices, and trade‑offs reported across the papers.

LLM · Model Efficiency · OPD

0 likes · 31 min read

Machine Learning Algorithms & Natural Language Processing

Jun 18, 2026 · Artificial Intelligence

From Imitation to Optimization: Recent Advances in On-Policy Distillation

This article surveys the latest research on On-Policy Distillation for large language models, covering methods that improve training stability, self‑distillation frameworks, and detailed analyses of when and why OPD succeeds or fails, with concrete experimental results and practical insights.

Entropy-Aware · Large Language Models · On‑Policy Distillation

0 likes · 19 min read

Machine Heart

May 18, 2026 · Artificial Intelligence

ICML 2026: From Single‑Threaded Thinking to Native Parallel Reasoning in Agents

The paper introduces Native Parallel Reasoner (NPR), a framework that lets language agents generate and maintain multiple reasoning paths using a three‑stage self‑distillation and parallel reinforcement‑learning training paradigm, achieving up to 4.6× speedup and significant accuracy gains across eight reasoning benchmarks.

AI reasoning · Large Language Models · Native Parallel Reasoner

0 likes · 18 min read

Machine Heart

May 15, 2026 · Artificial Intelligence

D-OPSD: On‑Policy Self‑Distillation Lets Few‑Step Diffusion Models Learn While Running

D-OPSD presents the first online self‑distillation framework for step‑distilled diffusion models, allowing them to continuously fine‑tune with only image‑text pairs, retain their fast few‑step sampling, and acquire new concepts, styles, or domain preferences without reward models.

Diffusion Models · LoRA · Self‑Distillation

0 likes · 10 min read

Kuaishou Tech

May 13, 2026 · Artificial Intelligence

OneSearch‑V2 Launches: Self‑Distilled Generative Search That Truly Understands Users

OneSearch‑V2 introduces a latent‑reasoning enhanced self‑distillation framework that augments query understanding with thought‑augmented CoT, aligns preferences via direct user behavior feedback, and achieves up to 4 % CTR lift and significant order growth without adding inference cost or latency.

LLM · Self‑Distillation · behavioral feedback

0 likes · 26 min read

Machine Heart

May 13, 2026 · Artificial Intelligence

Zero‑Cost Upgrade: OneSearch‑V2 Launches Generative Search, Boosting Buyers and Orders

OneSearch‑V2 introduces a zero‑cost generative search upgrade that leverages latent‑reasoning‑enhanced self‑distillation, thought‑augmented query understanding, and behavior‑feedback preference alignment, delivering offline HitRate gains of up to 2.68 % and online CTR, buyer and order increases of roughly 4 %, 2 % and 2 % respectively.

AI ranking · Self‑Distillation · behavioral feedback

0 likes · 24 min read

Machine Learning Algorithms & Natural Language Processing

Apr 29, 2026 · Artificial Intelligence

Dual Engine for Training and Inference: How Princeton’s SD‑ZERO and AggAgent Redefine Complex Reasoning

The article reviews two recent Princeton papers—SD‑ZERO, which introduces self‑revision training and on‑policy self‑distillation to turn a model’s own error traces into dense supervision, and AggAgent, which actively aggregates parallel long‑horizon trajectories—showing how internal trajectory mining can cut compute costs and boost accuracy on challenging math and code benchmarks.

AggAgent · Complex Reasoning · Large Language Models

0 likes · 10 min read

Machine Learning Algorithms & Natural Language Processing

Feb 22, 2026 · Artificial Intelligence

What Is On-Policy Distillation? A Deep Dive into On-Policy and Self-Distillation

The article explains On-Policy Distillation, derives its forward and reverse KL gradients, introduces Self‑Distillation where the policy serves as its own teacher, discusses practical implementation tricks such as extra‑knowledge injection, EMA or trust‑region teacher stabilization, and highlights benefits like reduced catastrophic forgetting, fewer Aha moments, and a narrower train‑test gap, especially for larger models.

Catastrophic Forgetting · EMA · KL divergence

0 likes · 6 min read

Machine Learning Algorithms & Natural Language Processing

Feb 10, 2026 · Artificial Intelligence

Why Self‑Distillation Is the 2026 Keyword for Continual Learning in Large Models

At the start of 2026, self‑distillation dominates the most cited LLM papers, offering a teacher‑free way for large models to continually acquire new knowledge while preserving existing capabilities.

Continual Learning · Large Language Models · Self‑Distillation

0 likes · 9 min read

HyperAI Super Neural

Feb 6, 2026 · Artificial Intelligence

Latest Advances in AI Agents: PaperBanana, SDPO, Lumine, Idea2Story, and Insight Agents

This weekly roundup highlights five recent AI agent papers—PaperBanana for automated academic illustration, SDPO's self‑distillation reinforcement learning, Lumine's open‑world generalist agent, Idea2Story's pipeline for turning research ideas into narratives, and Insight Agents' fast e‑commerce insights—showcasing diverse breakthroughs in multi‑agent frameworks, self‑feedback learning, and real‑world deployment.

AI agents · Multi-Agent Systems · Self‑Distillation

0 likes · 8 min read

AIWalker

Feb 23, 2025 · Artificial Intelligence

D-FINE Redefines Bounding-Box Regression to Reach State-of-the-Art Real-Time Detection

D-FINE introduces Fine-grained Distribution Refinement and Global Optimal Localization Self-Distillation to overhaul DETR's bounding-box regression, achieving 54‑59% AP on COCO and Objects365 at 78‑124 FPS while surpassing YOLO and RT-DETR in both accuracy and speed.

DETR · Real-time · Self‑Distillation

0 likes · 25 min read

Alibaba Cloud Big Data AI Platform

Mar 19, 2024 · Artificial Intelligence

M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning

This paper introduces M2SD, a dual‑branch multiple‑mixing self‑distillation framework that expands feature space, mitigates overfitting and catastrophic forgetting, and achieves state‑of‑the‑art results on CIFAR‑100, CUB‑200 and miniImageNet for few‑shot class‑incremental learning.

M2SD · Self‑Distillation · class incremental learning

0 likes · 17 min read

Meituan Technology Team

Sep 15, 2022 · Artificial Intelligence

YOLOv6 2.0: Enhanced Object Detection Models and Quantization Solutions

The new YOLOv6 2.0 release upgrades lightweight and medium‑large models with a CSPStackRep backbone, self‑distillation, and a custom quantization pipeline, delivering up to 869 FPS for the quantized YOLOv6‑S and achieving 49.5%/52.5% AP on COCO while halving training time.

COCO benchmark · CSPStackRep · Quantization

0 likes · 6 min read

AntTech

Sep 9, 2022 · Artificial Intelligence

Ant Security Lab Wins Two Golds and One Silver at KDD Cup 2022 with Advanced Keyword Extraction and Self‑Distillation for Product Search

Ant Security Lab's algorithm engineer Lin Jinzheng secured two gold medals and one silver at the KDD Cup 2022, ranking first globally, by applying innovative keyword‑extraction and self‑distillation techniques to improve product search relevance and interactive risk‑control systems.

.ai · KDD Cup · Product Search

0 likes · 4 min read