Can OpenAI’s New o1 Model Reach Human‑Level Reasoning?

OpenAI’s newly released o1 series introduces a reinforcement‑learning‑trained LLM that generates long chain‑of‑thought reasoning before answering, reaching the 49th percentile at the 2024 IOI, ranking highly on Codeforces and the AIME, and dramatically outperforming GPT‑4o on scientific and mathematical tasks.

MaGe Linux Operations

OpenAI has unexpectedly launched the o1 series, a new family of large language models designed for general complex reasoning by generating extensive internal chains of thought before answering.

The o1 models dramatically outperform GPT‑4o on challenging benchmarks: on PhD‑level physics problems GPT‑4o scores 59.5, while o1 reaches 92.8. At the 2024 IOI, a fine‑tuned o1 version scored 213 points, placing at the 49th percentile among human contestants, and could clear the gold‑medal threshold when allowed many submissions per problem.

Beyond contests, o1 reaches the 89th percentile on Codeforces competitive‑programming problems and performs at the level of the top 500 U.S. students on the AIME qualifying exam.

Model Variants

Three versions are released:

o1: the full, most powerful model (currently limited in public release).

o1‑preview: an early version available to ChatGPT Plus and API users.

o1‑mini: a faster, more cost‑effective model for tasks that require reasoning but little world knowledge.

All variants are trained with reinforcement learning to produce a long “thought chain” before output, a process that improves with longer reasoning time and yields a new scaling law for LLM performance.
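OpenAI has not published o1’s training or decoding details, so as a hedged toy sketch of why spending more inference‑time compute can raise accuracy, the snippet below simulates sampling several independent reasoning chains (each reaching the right answer with some probability) and taking a majority vote over their final answers. All names and numbers here are illustrative assumptions, not o1’s actual mechanism.

```python
import random

def majority_vote_accuracy(p_correct, n_chains, trials=20000, seed=0):
    """Estimate the accuracy of majority-voting over n_chains independent
    reasoning chains. Each chain lands on the correct answer with
    probability p_correct; wrong chains scatter over 10 distractors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        votes = {}
        for _ in range(n_chains):
            ans = "right" if rng.random() < p_correct else f"wrong{rng.randrange(10)}"
            votes[ans] = votes.get(ans, 0) + 1
        # The most-voted answer wins the trial.
        if max(votes, key=votes.get) == "right":
            wins += 1
    return wins / trials

for n in (1, 5, 25):
    print(n, round(majority_vote_accuracy(0.4, n), 3))
```

Even with each chain only 40% reliable, accuracy climbs steadily as more chains are sampled — a simple illustration of how extra reasoning compute at inference time can translate into better answers.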

The models excel in scientific domains: they can annotate cell‑sequencing data, generate complex quantum‑optics formulas, and automate multi‑step workflows for developers.

Human evaluations show o1‑preview outperforms GPT‑4o on reasoning‑intensive categories such as data analysis, coding, and mathematics, though it may not be optimal for some pure language tasks.

OpenAI plans to increase the inference time from seconds to minutes, hours, or even days, aiming for breakthroughs comparable to new drug discovery or solving the Riemann hypothesis.

Access

ChatGPT Plus and Team users will receive early access, with weekly usage limits (30 messages for o1‑preview, 50 for o1‑mini). API access starts with Tier 5 users (those who have spent over $1,000 on the OpenAI API).

For these long‑running models, OpenAI is prepared to accept much higher inference costs in exchange for potentially transformative scientific advances.
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: artificial intelligence, large language model, OpenAI, chain of thought, AI reasoning, o1 model
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
