PaperAgent
Feb 22, 2026 · Artificial Intelligence

How Skipping 50% of Gradient Updates Supercharges LLM Training (SkipUpdate & Magma)

A recent Google‑Northwestern study finds that randomly discarding half of all parameter updates during training, a strategy the authors call SkipUpdate, consistently outperforms dense optimizers across Llama models. Its extension, Magma, adds a momentum‑gradient alignment criterion for further gains, yielding a zero‑overhead, geometry‑aware regularizer for large‑scale LLM training.

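The blurb gives no code, so here is a minimal sketch of what the two strategies could look like for a list of PyTorch parameters. The per‑element Bernoulli mask, the sign‑agreement reading of "momentum‑gradient alignment", and all hyperparameter values are assumptions for illustration, not the paper's actual recipe.

```python
import torch

def skipupdate_step(params, lr=1e-2, skip_prob=0.5):
    """SkipUpdate sketch: randomly discard a fraction of updates.

    Assumption: skipping is applied per element via a Bernoulli mask;
    the paper may instead skip per tensor or per step.
    """
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            # Keep each element's update with probability 1 - skip_prob.
            keep = (torch.rand_like(p.grad) >= skip_prob).to(p.grad.dtype)
            p.add_(p.grad * keep, alpha=-lr)

def magma_step(params, momenta, lr=1e-2, beta=0.9, skip_prob=0.5):
    """Magma sketch: SkipUpdate plus a momentum-gradient alignment gate.

    Assumption: "alignment" is read as sign agreement between the
    momentum buffer and the fresh gradient; the paper's criterion
    may differ.
    """
    with torch.no_grad():
        for p, m in zip(params, momenta):
            if p.grad is None:
                continue
            m.mul_(beta).add_(p.grad, alpha=1 - beta)  # EMA momentum
            aligned = (m.sign() == p.grad.sign()).to(p.grad.dtype)
            keep = (torch.rand_like(p.grad) >= skip_prob).to(p.grad.dtype)
            p.add_(m * aligned * keep, alpha=-lr)

# Toy usage: one gradient step on a random parameter tensor.
w = torch.nn.Parameter(torch.randn(8, 8))
m = torch.zeros_like(w)
(w ** 2).sum().backward()
magma_step([w], [m])
```
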
Magma · Optimization · SkipUpdate
AIWalker
Aug 3, 2025 · Artificial Intelligence

Tree-Guided CNN Boosts Image Super-Resolution in Joint University Study

A collaborative team from five universities proposes a tree-structured convolutional neural network that leverages binary‑tree guidance, cosine cross‑domain extraction, and an adaptive Nesterov momentum optimizer to markedly improve image super‑resolution performance.

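For reference, the Nesterov momentum update at the core of such an optimizer looks like the sketch below. The "adaptive" part (how the study tunes the momentum coefficient or step size during training) is not described in the blurb, so only the standard fixed‑coefficient update is shown.

```python
import torch

def nesterov_step(params, velocities, lr=1e-2, mu=0.9):
    """Standard Nesterov momentum update (PyTorch-style formulation).

    The study's optimizer additionally adapts its momentum/step size
    during training; that rule is not given here, so mu and lr stay fixed.
    """
    with torch.no_grad():
        for p, v in zip(params, velocities):
            if p.grad is None:
                continue
            v.mul_(mu).add_(p.grad)       # v <- mu * v + g
            # Look-ahead step: move along g + mu * v rather than v alone.
            p.add_(p.grad, alpha=-lr)
            p.add_(v, alpha=-lr * mu)
```
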
adaptive optimizer · computer vision · deep learning