Understanding the One-Epoch Overfitting Phenomenon in Deep Click-Through Rate Models
The study reveals that industrial deep click-through-rate (CTR) models often overfit dramatically after the first training epoch, a “one-epoch phenomenon.” It attributes the effect to the combination of an embedding-plus-MLP architecture, fast optimizers, and highly sparse features. Performance drops sharply unless feature sparsity is reduced or training is limited to a single pass.
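
To make "reducing sparsity" concrete, here is a minimal sketch of one possible approach: hashing raw feature IDs into a smaller number of shared buckets, so that each embedding row is seen by more training samples. The summary above does not say which sparsity-reduction method the study uses, so the hashing scheme, the helper names `hash_bucket` and `mean_occurrences`, the bucket count, and the toy data are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def hash_bucket(feature_id: str, num_buckets: int) -> int:
    """Map a raw ID to one of num_buckets shared embedding rows (deterministic)."""
    return zlib.crc32(feature_id.encode("utf-8")) % num_buckets


def mean_occurrences(ids: list) -> float:
    """Average number of samples per distinct ID; higher means less sparse."""
    return len(ids) / len(Counter(ids))


# Hypothetical toy log: 1,000 impressions, each carrying a unique item ID.
raw_ids = [f"item_{i}" for i in range(1000)]

# Assumed bucket count of 50; a real system would tune this value.
hashed_ids = [hash_bucket(x, num_buckets=50) for x in raw_ids]

raw_density = mean_occurrences(raw_ids)        # every raw ID appears once
hashed_density = mean_occurrences(hashed_ids)  # IDs now share embedding rows

print(f"samples per raw ID:    {raw_density:.1f}")
print(f"samples per hashed ID: {hashed_density:.1f}")
```

The trade-off is that unrelated IDs collide in the same bucket and share one embedding, so this kind of sparsity reduction exchanges some per-ID resolution for denser updates. The other option named above, limiting training to a single pass, needs no change to the features at all.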