Why Does ChatGPT Suddenly ‘Think Step‑by‑Step’? Unveiling the Chain‑of‑Thought Emergence
This article explains how ChatGPT’s step‑by‑step reasoning, known as Chain of Thought, emerged. It links that ability to model scaling, to Kahneman’s System 1 / System 2 theory, and to the role of code in the training data of large language models.
Since its release, ChatGPT has remained highly popular. Its impact comes not only from better dialogue performance but also from the “emergent” capabilities it exhibits from a technical perspective.
The most representative emergent ability is the “Chain of Thought” (CoT), which enables step‑by‑step reasoning. In one example, GPT‑3 fails a simple math problem when asked directly, but when the prompt includes “let’s think step by step”, it produces the correct answer along with a full reasoning trace.
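The difference between the two prompts can be sketched in a few lines of Python. This is only an illustration of the prompt text: the `build_prompt` helper, the Q/A layout, and the sample question are assumptions made for this sketch, and no model is called.

```python
# Trigger phrase quoted in the article.
COT_TRIGGER = "Let's think step by step."


def build_prompt(question: str, use_cot: bool = False) -> str:
    """Format a question as a Q/A prompt, optionally ending with the CoT trigger.

    The "Q: ... A:" layout is an assumed convention for this sketch.
    """
    prompt = f"Q: {question}\nA:"
    if use_cot:
        prompt += f" {COT_TRIGGER}"
    return prompt


# Hypothetical word problem, used only to show the two prompt variants.
question = (
    "A shop has 23 apples. It sells 9 and then receives 2 boxes "
    "of 6 apples each. How many apples does it have now?"
)

print(build_prompt(question))                # direct prompt
print(build_prompt(question, use_cot=True))  # prompt with the CoT trigger
```

Either string would then be sent to a language model. The only change is the trailing phrase, which is what invites the model to write out intermediate steps before the answer.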
These abilities are not explicitly trained; they appear when researchers add prompts that trigger multi‑step reasoning, suggesting a sudden emergence.
Scaling studies across many datasets reveal a common pattern: model accuracy stays low until a certain size threshold, after which performance on tasks requiring multi‑step reasoning (e.g., math word problems) jumps sharply, while tasks solvable by fast intuition (System 1) improve smoothly.
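As a rough illustration of what “jumps sharply” versus “improves smoothly” means, the sketch below finds the largest accuracy gain between consecutive model sizes. The `largest_jump` helper is hypothetical, and the numbers are made up purely to show the two curve shapes; they are not measurements from any study.

```python
def largest_jump(sizes, accuracies):
    """Return (size_before, size_after, gain) for the biggest accuracy
    gain between two consecutive model sizes."""
    steps = [
        (sizes[i], sizes[i + 1], accuracies[i + 1] - accuracies[i])
        for i in range(len(sizes) - 1)
    ]
    return max(steps, key=lambda step: step[2])


# Illustrative values only (NOT real results).
# Sizes are in billions of parameters, accuracies in percent.
sizes_b = [1, 10, 100, 500]
intuitive_acc = [40, 50, 60, 70]  # smooth, System 1 style curve
multistep_acc = [2, 3, 5, 55]     # flat, then a sharp jump

for name, acc in [("intuitive", intuitive_acc), ("multi-step", multistep_acc)]:
    before, after, gain = largest_jump(sizes_b, acc)
    total_gain = acc[-1] - acc[0]
    print(f"{name}: biggest step {before}B -> {after}B, "
          f"{gain} of {total_gain} points gained overall")
```

In the smooth curve the gain is spread evenly across steps, while in the emergent-style curve almost all of it lands in a single step past the size threshold.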
The phenomenon aligns with Kahneman’s System 1 / System 2 theory: emergent abilities correspond to System 2‑type tasks that need deliberate, sequential reasoning.
One hypothesis links the emergence to the inclusion of code in training data, because code often contains explicit multi‑step logic, teaching models to perform step‑by‑step inference.
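To make “explicit multi‑step logic” concrete, here is a small sketch of how even trivial code spells out each intermediate step in order. The function name and the word problem are invented for illustration, and nothing here is drawn from any actual training corpus.

```python
def apples_left(start: int, sold: int, boxes: int, per_box: int) -> int:
    """Solve a simple word problem one named step at a time."""
    after_sale = start - sold       # step 1: remove the apples sold
    delivered = boxes * per_box     # step 2: count the delivered apples
    return after_sale + delivered   # step 3: combine the two results


# 23 apples, 9 sold, then 2 boxes of 6 delivered.
print(apples_left(23, 9, 2, 6))  # prints 26
```

Each line depends on a value computed by an earlier line, which is the same sequential structure a Chain‑of‑Thought answer writes out in natural language.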
Although this remains a hypothesis, ongoing research aims to clarify whether and why code data triggers the Chain‑of‑Thought capability and how that capability relates to human System 2 reasoning.
