Tagged articles

SAME

1 article · Page 1 of 1
Machine Learning Algorithms & Natural Language Processing
Jul 1, 2026 · Artificial Intelligence

SAME: Stabilizing MoE to Reduce Dual Forgetting in Multimodal Continual Instruction Tuning

The paper identifies routing drift and expert drift as the two main causes of forgetting in multimodal continual instruction tuning (MCIT). It proposes SAME, which combines spectral-aware routing, curvature-aware scaling, and adaptive expert activation to keep mixture-of-experts (MoE) models stable and efficient, with less forgetting across long task sequences.

Continual Learning · ICML 2026 · Instruction Tuning
19 min read