Tag: pass@k


DataFunTalk
Apr 25, 2025 · Artificial Intelligence

Does Reinforcement Learning Really Expand Reasoning Capacity in Large Language Models? Insights from Recent Empirical Study

Recent empirical research by Tsinghua's LeapLab and Shanghai Jiao Tong University reveals that reinforcement learning with verifiable rewards (RLVR) improves sampling efficiency but does not extend the reasoning capacity of large language models beyond that of their base models, as measured across mathematics, code, and visual reasoning benchmarks.
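The study's headline pattern — RL-tuned models win at small sampling budgets but are caught or passed by their base models at large k — can be sketched with the standard unbiased pass@k estimator. All per-problem sample counts below are hypothetical, chosen only to illustrate the effect, not taken from the paper:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k)
    # n: samples drawn per problem, c: samples that were correct
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

n = 64  # samples drawn per problem
# Hypothetical correct-sample counts over five problems: the RL-tuned
# model concentrates probability mass on problems it already solves,
# while the base model occasionally solves every problem.
base_counts = [2, 1, 1, 1, 1]
rl_counts = [30, 20, 0, 0, 0]

for k in (1, 8, 64):
    avg = lambda counts: sum(pass_at_k(n, c, k) for c in counts) / len(counts)
    print(f"pass@{k}: base={avg(base_counts):.3f}  rl={avg(rl_counts):.3f}")
```

With these toy numbers the RL model dominates at pass@1 but the base model reaches 1.0 at pass@64 while the RL model plateaus at 0.4, mirroring the crossover the study reports.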

AI research · RLVR · large language models
Rare Earth Juejin Tech Community
Jul 30, 2023 · Artificial Intelligence

Understanding Codex: Training Framework, Evaluation Methodology, and Model Performance in ChatGPT’s Code Generation Ability

This article explains how Codex, built on the GPT‑3 architecture, is trained and fine‑tuned to give ChatGPT the ability to generate code, detailing the data collection, the supervised fine‑tuning procedure, and evaluation on HumanEval with the pass@k metric, and presenting performance comparisons against GPT‑3 and Codex‑S.
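The pass@k metric named above is conventionally computed with the unbiased estimator introduced in the Codex paper (Chen et al., 2021), rather than by naively raising the per-sample success rate to a power. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n: total samples drawn per problem
    c: number of samples that passed the unit tests
    k: evaluation budget (samples the user is allowed to try)
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples drawn, 10 of them correct.
print(f"pass@1   = {pass_at_k(200, 10, 1):.3f}")
print(f"pass@100 = {pass_at_k(200, 10, 100):.3f}")
```

For k = 1 the estimator reduces to c/n, the plain per-sample success rate; for larger k it gives the exact probability that at least one of k samples drawn without replacement is correct.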

AI model training · ChatGPT · Codex