Tagged articles

TMN-Reweight

1 article
Machine Heart
Jun 19, 2026 · Artificial Intelligence

GoLongRL Open‑Source: 23K Samples, 9 Task Types, and the End of the Long‑Context RL Desert

GoLongRL is a fully open‑source long‑context reinforcement‑learning pipeline. It combines a 23K‑sample RLVR dataset covering nine capability‑oriented tasks with a TMN‑Reweight optimizer for heterogeneous multitask training, and reports SOTA performance on 4B and 30B models, surpassing leading baselines.

GoLongRL · SOTA evaluation · TMN-Reweight
13 min read