Machine Heart
Apr 3, 2026 · Artificial Intelligence
Beyond Token Entropy: ReLaX Uses Latent Dynamics to Rethink Exploration‑Exploitation in LLM RL
The paper introduces ReLaX, a framework that shifts the focus from token-level entropy to the latent-space dynamics of large models. It uses Koopman operators and a Dynamic Spectral Divergence metric to quantitatively guide the exploration-exploitation balance, and reports state-of-the-art performance on both pure-text and multimodal RL benchmarks.
Koopman operator · ReLaX · dynamic spectral divergence
