Weight Decay — 2 Technical Articles

Aug 15, 2023 · Artificial Intelligence

Why Do Neural Networks Suddenly ‘Grok’ After Long Training? Insights from Google

Google’s recent research shows that small neural networks trained for extended periods on tasks such as modular addition can abruptly shift from memorizing the training data to genuinely generalizing. This sudden “grokking” phenomenon is driven by weight decay and the emergence of periodic weight structures.

AI research · Generalization · MLP

9 min read


Code DAO

Jun 3, 2022 · Artificial Intelligence

Decomposing PointGAN: Teaching a Machine to Generate a Single Point

This article walks through building and analyzing PointGAN, a minimal GAN that learns to output the single value 1. It covers the linear generator, a two‑layer discriminator, training loops, loss visualizations, instability diagnostics, and practical fixes such as loss easing, weighted examples, weight decay, and noisy generator parameters.

Discriminator · GAN · Generator

24 min read
