Model Ability Gets Squeezed Out in Multi‑Task Learning—How ESM Preserves It (CVPR 2026)
The paper reveals that multi‑task models suffer performance drops because tasks compete for the same internal subspace, and introduces Essential Subspace Merging (ESM) which separates critical directions and uses Polarized Scaling to keep multiple abilities stable, achieving significantly lower degradation than traditional baselines.
