Tagged articles
2 articles
Machine Learning Algorithms & Natural Language Processing
Feb 24, 2026 · Artificial Intelligence

From Traditional RL to LLM‑RL: Theory Derivation and Engineering Improvements

The article walks through the fundamentals of traditional policy‑gradient reinforcement learning, derives the REINFORCE objective, maps its concepts to large‑language‑model RL, and then discusses practical engineering solutions for industrial‑scale training, such as GRPO, async rollout, importance‑sampling corrections, and token‑flow management.

Async Rollout · GRPO · Importance Sampling
10 min read
Lobster Programming
Feb 6, 2025 · Mobile Development

How Does One‑Click Mobile Number Login Work? A Deep Dive into the Process

One‑click mobile number login streamlines user authentication by leveraging carrier‑provided phone number masks and tokens, eliminating passwords and verification codes; this article explains the underlying PPP‑based network principles, the multi‑stage token exchange flow, and integration considerations across China’s three major operators.

Mobile Development · Token Flow · carrier integration
7 min read