What the 2026 AI Index Reveals About the Global AI Landscape

The 2026 AI Index report shows a dramatic shift toward industry‑driven AI breakthroughs, widening US‑China gaps, soaring carbon footprints of large models, narrowing performance gaps among top systems, booming AI investment, and growing societal concerns about responsible AI and its impact on jobs, education, and public perception.


Research & Development Landscape

In 2025, 91.6% of "landmark" AI models (high‑impact models) were released by the private sector, up from roughly 60% in 2023. The United States produced 50 landmark models, China 31, South Korea 5, and Canada, France, and the UK 1 each (Chart 1: 2025 landmark model count by country). At the organization level, OpenAI released 19 models, Google 12, Alibaba 11, and Anthropic 7; the Chinese firms Alibaba, DeepSeek, and ByteDance also appear in the top list (Chart 2: Landmark models by organization).

Research output tells a different story: China leads globally in AI paper count, citation share (20.6% of global AI citations in 2024 vs. 12.6% for the US), and patent grants. The US and China each produced 41 of the top‑100 most‑cited papers in 2024, indicating a rapid catch‑up by China.

Top‑tier models are becoming less transparent; OpenAI, Anthropic and Google no longer disclose parameters, training data size, or compute budget.

AI talent inflow to the US fell 89% since 2017 and dropped 80% in the last year, the lowest net inflow in a decade (Chart 3: Net AI talent flow by country).

Technical Frontiers: Performance Gaps and Carbon Costs

Training emissions have exploded. AlexNet (2012) emitted ~0.01 t CO₂, while Grok‑4 (2025) emitted 72,816 t CO₂, roughly 1,000× the lifetime emissions of an average car (63 t). DeepSeek‑v3 emitted only ~597 t CO₂, demonstrating higher training efficiency (Chart 4: Model training carbon emissions).
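As a quick sanity check, the ratios implied by the quoted figures can be reproduced directly; a minimal sketch using only the numbers in the paragraph above (the exact car multiple works out to roughly 1,150×, consistent with the report's order‑of‑magnitude ~1,000×):

```python
# Back-of-envelope check of the training-emissions figures quoted above.
# All values (tonnes of CO2) come from the report summary in the text.
GROK4_T_CO2 = 72_816      # Grok-4 (2025) training emissions
ALEXNET_T_CO2 = 0.01      # AlexNet (2012) training emissions
DEEPSEEK_T_CO2 = 597      # DeepSeek-v3 training emissions
CAR_LIFETIME_T_CO2 = 63   # average car lifetime emissions, as cited

print(f"Grok-4 vs. one car:  {GROK4_T_CO2 / CAR_LIFETIME_T_CO2:,.0f}x")  # 1,156x
print(f"Grok-4 vs. AlexNet:  {GROK4_T_CO2 / ALEXNET_T_CO2:,.0f}x")       # 7,281,600x
print(f"Grok-4 vs. DeepSeek: {GROK4_T_CO2 / DEEPSEEK_T_CO2:,.0f}x")      # 122x
```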

Capability gaps are "jagged": Gemini Deep Think won a gold medal at the 2025 International Math Olympiad (35‑point score) but achieved only 50.1% accuracy on ClockBench, far below the ~90% typical adult performance.

Arena Leaderboard Elo scores (March 2026) show the top four models (Anthropic 1503, xAI 1495, Google 1494, OpenAI 1481) within a 25‑point range, indicating convergence (Chart 5: Arena Elo trends). The US‑China gap narrowed to a 2.7% lead for the US, fluctuating within single‑digit percentages.

Closed‑source models lead open‑source by 3.3% in Arena scores; six of the top ten are closed‑source (Chart 6: Closed vs. open source score gap).

Benchmark reliability is itself in question: 42% of GSM8K questions are flawed, and the MMLU series contains 2–26% invalid items. Arena results may reflect adaptation to the platform rather than true general intelligence.

Responsible AI: Incidents and Trade‑offs

AI‑related incidents rose to 362 in 2025 (up 55% from 2024). Transparency scores fell from 58 (2024) to 40 (2025); developers disclosed less about training data and compute.

Improving one responsible dimension (e.g., safety) systematically degrades another (e.g., accuracy), indicating inherent trade‑offs.

Organizationally, AI‑governance roles grew 17%, and firms without any responsible‑AI policy dropped from 24% to 11%. Main obstacles remain knowledge gaps (59%), budget limits (48%) and regulatory uncertainty (41%).

Economic Impact and Workforce Shifts

Global AI investment reached $5.817 trillion in 2025 (+130% YoY). Private capital was $3.447 trillion (+127.5% YoY); generative AI accounted for $1.709 trillion (≈50% of private AI spend) with >200% YoY growth. The US invested $2.859 trillion, 23× China’s $124 billion.
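The headline ratios in this paragraph follow directly from the quoted amounts; a minimal check (amounts in billions of USD, taken verbatim from the figures above):

```python
# Consistency check of the 2025 investment figures quoted above.
# All amounts in billions of US dollars, as given in the text.
total_global = 5_817   # global AI investment
private = 3_447        # private capital
gen_ai = 1_709         # generative AI share of private capital
us = 2_859             # US investment
china = 124            # China investment

print(f"GenAI share of private AI spend: {gen_ai / private:.1%}")  # 49.6%
print(f"US-to-China investment ratio:    {us / china:.0f}x")       # 23x
```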

Consumer value from generative AI tools in the US hit $172 billion in early 2026 (+54% YoY). Employment effects are uneven: US software developers aged 22‑25 saw a ~20% headcount drop, while older developers grew. AI‑augmented efficiency gains of 14‑50% in support, development and marketing are offset by reduced headcount in those roles.

Scientific Advances: Small Models Outperform Giants

On the ProteinGym benchmark, MSAPairformer (1.11 B parameters) surpassed larger predecessors; GPN‑Star (2 B) beat a 400 B‑parameter model, a ~200× reduction in size with comparable performance.
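The "~200× reduction" follows directly from the parameter counts quoted above; a one‑line check:

```python
# Parameter counts as quoted in the text (GPN-Star vs. the 400 B-parameter model).
gpn_star, baseline = 2e9, 400e9
print(f"Size reduction: {baseline / gpn_star:.0f}x")  # prints "Size reduction: 200x"
```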

New "virtual cell" models (e.g., Evo 2, AlphaGenome) aim to predict drug and gene perturbation effects without wet‑lab experiments.

AI‑related scientific papers rose 26% to ~80 k in 2025, representing 5.8‑8.8% of all research output (vs. <1% in 2010). ReplicationBench scores for frontier models were <20% on astrophysics and 33% on Earth observation, echoing the "jagged frontier" pattern.

Medical Applications: Efficiency Gains with Limited Evidence

AI‑generated clinical notes reduced documentation time by up to 83% in several hospital systems, easing clinician burnout and delivering up to 112% ROI.

A Microsoft multi‑agent system combined with OpenAI’s o3 achieved 85.5% diagnostic accuracy on complex cases, versus 20% for unaided physicians; multi‑agent setups improved accuracy by 7‑40% over single‑agent baselines.

However, a review of >500 clinical AI studies found half relied on synthetic exam‑style data; only 5% used real patient data. Of 258 AI medical devices cleared by the FDA in 2025, merely 2.4% were backed by randomized controlled trials. Moreover, 84‑92% of health‑related Google searches now display AI‑generated summaries, but systematic quality assessments are lacking.

Education: Widespread AI Use, Policy Gaps

Over 80% of US high‑school and college students use AI for research, editing and brainstorming, yet only half of secondary schools have AI policies and merely 6% of teachers consider those policies clear.

US CS undergraduate enrollment fell 11% (2024‑25) while AI‑focused graduate programs grew. AI PhD graduates in the US and Canada (2022‑24) rose 22% but all entered academia, reversing the previous industry‑absorption trend.

Globally, >90% of countries teach basic computer science in K‑12, but AI education lags except in China and the UAE, which mandated AI curricula for 2025‑26.

Policy Landscape: AI Sovereignty and Divergent Regulation

"AI sovereignty"—national control over AI capabilities—became the dominant theme of 2025 policies. Over half of new AI strategies in 2024 originated from developing nations that previously lacked AI policies.

Europe expanded its AI supercomputing clusters from 3 to 44 (2018–2025); South Asia, Latin America, and MENA added only ~10 combined.

Regulatory divergence emerged: the EU AI Act’s first bans (predictive policing, emotion recognition) took effect, while the US issued an executive order favoring deregulation and AI leadership. Japan, South Korea and Italy passed national AI laws.

US congressional AI witnesses grew from 5 (2017) to 102 (2025); industry witnesses rose from 13% to 37%, while academia fell to 15%.

Public trust in US government AI regulation is the lowest globally at 31% (global avg 54%).

Public Opinion: Optimism Meets Anxiety

Globally, respondents who view AI as net positive rose from 55% (2024) to 59% (2025), while 52% reported increased anxiety. Experts are far more optimistic: 73% of experts see AI’s employment impact as positive vs. 23% of the public; 69% vs. 21% for economic impact; 84% vs. 44% for healthcare.

In the US, 64% expect AI to cut jobs over the next 20 years, versus only 5% who expect job creation. Experts predict AI‑augmented work will occupy 80% of US labor hours by 2030, while the public estimates just 10%.

Regional differences are stark: in China, Malaysia, Thailand, Indonesia, and Singapore, more than 80% of respondents believe AI will transform daily life within 3–5 years, while India's anxiety rose 14 percentage points. AI usage at work is highest in emerging markets (India, China, Nigeria, the UAE, Saudi Arabia), with >80% employee adoption, compared with lower rates in developed economies.

Report source: https://hai.stanford.edu/ai-index/2026-ai-index-report

An up‑to‑date summary of the 2026 AI Index Report, released by Stanford's HAI institute, co‑founded by Professor Fei‑Fei Li.
Written by Data Party THU, the official platform of the Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.