NVIDIA GTC 2025 Keynote Unpacked: 13 Major Announcements & $1 Trillion AI Demand Forecast
In a two‑hour keynote, Jensen Huang reviewed CUDA’s 20‑year flywheel, introduced DLSS 5 neural rendering, forecast at least $1 trillion in AI demand by 2027, unveiled the 3.6 EFLOPS Vera Rubin platform, integrated Groq LPX for decoupled inference, and announced a suite of AI hardware, software, and ecosystem initiatives.
CUDA 20‑Year Flywheel
Jensen Huang highlighted the flywheel effect of CUDA: a large install base attracts developers, which leads to algorithm breakthroughs, which open new markets, which in turn expand the install base further. He cited rising cloud prices for Ampere‑generation GPUs as evidence of the platform’s vitality.
"We spent 20 years deploying hundreds of millions of GPUs and compute systems worldwide. CUDA’s install base is the core of our flywheel," Huang said.
He summed up NVIDIA’s strategy in one line: it is an algorithm company.
DLSS 5: Neural Rendering
DLSS 5 was presented as neural rendering that fuses structured 3D graphics data with generative AI, delivering controllable yet photorealistic output.
Core concept: Structured data + generative AI
Two foundational libraries were announced:
cuDF – accelerates data‑frame processing on structured data.
cuVS – accelerates vector search over unstructured data.
IBM used cuDF to accelerate watsonx.data, enabling a 5× speedup in supply‑chain data processing and an 83 % cost reduction for Nestlé.
Three AI Inflection Points
ChatGPT (late 2022) – generative AI shifted computation from retrieval‑based to creation‑based.
o1 (2024) – reasoning AI introduced trustworthy, traceable, evidence‑based behavior.
Claude Code – agent AI enables code writing, debugging, pull‑request submission, and merging.
"Claude Code completely transforms software engineering. 100 % of NVIDIA employees use Claude Code, Codex, and Cursor," Huang said.
AI compute demand was said to have grown roughly one million‑fold over the past two years, marking a shift to an inference‑centric era.
Token Economics: Data Centers as Token Factories
2025 was declared NVIDIA’s inference year. The Grace Blackwell NVLink 72 architecture delivers a 35–50× per‑watt performance improvement (analyst Dylan Patel notes the gain may be 50×).
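To make the per‑watt framing concrete, here is a minimal Python sketch of how tokens per watt translates into energy cost per token. The `Factory` class, the 1 MW power budget, the 50,000 tokens‑per‑second baseline, and the $0.10/kWh electricity price are all illustrative assumptions, not figures from the keynote; only the 35× multiplier comes from the claim above.

```python
from dataclasses import dataclass


@dataclass
class Factory:
    name: str
    power_watts: float        # fixed facility power budget (hypothetical)
    tokens_per_second: float  # sustained output at that power (hypothetical)

    def tokens_per_joule(self) -> float:
        # "Tokens per watt" is tokens/s per watt; a watt is a joule per second,
        # so the ratio is simply tokens per joule of energy.
        return self.tokens_per_second / self.power_watts


def energy_cost_per_million_tokens(factory: Factory, usd_per_kwh: float) -> float:
    joules_needed = 1_000_000 / factory.tokens_per_joule()
    kwh_needed = joules_needed / 3_600_000  # 1 kWh = 3.6e6 J
    return kwh_needed * usd_per_kwh


# Same power budget, 35x more tokens out (the lower bound quoted above).
baseline = Factory("baseline", power_watts=1e6, tokens_per_second=50_000)
improved = Factory("improved", power_watts=1e6, tokens_per_second=50_000 * 35)

for f in (baseline, improved):
    cost = energy_cost_per_million_tokens(f, usd_per_kwh=0.10)
    print(f"{f.name}: {f.tokens_per_joule():.3f} tokens/J, ${cost:.4f} per 1M tokens")
```

Because the power budget is the fixed constraint in this model, a 35× gain in tokens per joule lowers the energy cost per token by the same factor.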
"Every CEO will study tokens per watt; without the right architecture, even free compute is not cheap enough." "I see at least $1 trillion of demand by 2027" (previous forecast: $500 billion for 2026).
Vera Rubin Platform
The Vera Rubin platform provides:
3.6 EFLOPS of compute.
260 TB/s full‑mesh NVLink bandwidth.
Seven chips across five racks (a rack‑scale computer).
100 % liquid cooling with 45 °C water.
Installation time reduced to two hours (previously two days).
Projected 40 million‑fold compute growth over ten years.
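As a rough sanity check on the growth multiples quoted in the keynote, the sketch below converts a total multiple over N years into the equivalent compound annual factor. The helper name `annual_factor` is hypothetical, and the calculation assumes smooth exponential growth, which real hardware roadmaps do not follow exactly.

```python
def annual_factor(total_growth: float, years: float) -> float:
    """Return the per-year multiplier g such that g ** years == total_growth."""
    if total_growth <= 0 or years <= 0:
        raise ValueError("total_growth and years must be positive")
    return total_growth ** (1.0 / years)


# Figures quoted in the keynote recap:
compute_decade = annual_factor(40e6, 10)  # 40 million-fold over ten years
demand_two_years = annual_factor(1e6, 2)  # one million-fold over two years

print(f"40,000,000x over 10 years is about {compute_decade:.2f}x per year")
print(f"1,000,000x over 2 years is about {demand_two_years:.0f}x per year")
```

Under that assumption, 40 million‑fold over ten years works out to a little under 6× per year, while one million‑fold over two years implies 1,000× per year.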
The Vera Rubin Ultra variant adds a Kyber rack that packs 144 GPUs into a single NVLink domain. Microsoft Azure has launched the first Vera Rubin rack.
The Vera CPU, which uses LPDDR5 memory, was said to offer double the per‑watt performance of any other CPU and was positioned as a potential multi‑billion‑dollar business.
Groq Integration: Decoupled Inference
Groq LPX, a deterministic data‑flow processor with large SRAM, was integrated into Vera Rubin to separate pre‑fill (handled by Vera Rubin) from token generation (handled by Groq).
Highest‑value token generation layer: 35× throughput boost.
Token generation rate: 2 M → 700 M tokens per second, a 350× increase over two years.
Revenue scaling: Blackwell 5× Hopper, Vera Rubin 5× Blackwell, further increased with Groq.
NVIDIA introduced Dynamo, an AI‑factory operating system that schedules inference tasks between Vera Rubin and Groq.
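The split between pre‑fill and token generation can be illustrated with a toy two‑pool scheduler. This is a simplified sketch and not Dynamo's actual API: the `Request` and `DisaggregatedScheduler` names, the per‑tick token budget, and the slot count are all hypothetical. It assumes pre‑fill is limited by how many prompt tokens can be processed per tick, while decode is limited by how many requests can be served at once.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int
    prompt_tokens: int    # tokens to process during pre-fill
    max_new_tokens: int   # tokens to emit during decode
    prefilled: int = 0    # prompt tokens processed so far
    generated: int = 0    # output tokens emitted so far


class DisaggregatedScheduler:
    """Toy two-pool scheduler: a pre-fill pool feeds a separate decode pool."""

    def __init__(self, prefill_budget: int, decode_slots: int):
        if prefill_budget <= 0 or decode_slots <= 0:
            raise ValueError("budget and slots must be positive")
        self.prefill_budget = prefill_budget  # prompt tokens handled per tick
        self.decode_slots = decode_slots      # requests decoded per tick
        self.prefill_queue = deque()
        self.decode_queue = deque()
        self.finished = []                    # request ids in completion order

    def submit(self, req: Request) -> None:
        self.prefill_queue.append(req)

    def tick(self) -> None:
        # Decode pool: up to decode_slots requests each emit one token.
        n = min(self.decode_slots, len(self.decode_queue))
        for req in [self.decode_queue.popleft() for _ in range(n)]:
            req.generated += 1
            if req.generated >= req.max_new_tokens:
                self.finished.append(req.req_id)
            else:
                self.decode_queue.append(req)  # round-robin back into the pool

        # Pre-fill pool: spend the token budget on queued prompts, in order.
        budget = self.prefill_budget
        while self.prefill_queue and budget > 0:
            head = self.prefill_queue[0]
            step = min(budget, head.prompt_tokens - head.prefilled)
            head.prefilled += step
            budget -= step
            if head.prefilled == head.prompt_tokens:
                # Prompt fully processed: hand the request to the decode pool.
                self.decode_queue.append(self.prefill_queue.popleft())

    def run(self) -> int:
        """Tick until both pools are empty; return the number of ticks used."""
        ticks = 0
        while self.prefill_queue or self.decode_queue:
            self.tick()
            ticks += 1
        return ticks


sched = DisaggregatedScheduler(prefill_budget=100, decode_slots=2)
sched.submit(Request(req_id=1, prompt_tokens=150, max_new_tokens=2))
sched.submit(Request(req_id=2, prompt_tokens=50, max_new_tokens=1))
ticks = sched.run()
print(ticks, sched.finished)  # prints: 4 [2, 1]
```

The point of the design is that each pool can be sized for its own bottleneck: a long prompt occupies the pre‑fill pool for several ticks without stalling requests that are already generating tokens.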
Roadmap: Blackwell → Rubin → Rubin Ultra → Feynman
Future architecture roadmap:
Rubin: optical NVLink 576, Oberon architecture.
Rubin Ultra: Kyber rack, LP35 LPU (supports NVFP4).
Feynman: new GPU, LP40 LPU, Rosa CPU, BlueField 5, CX‑10, Spectrum 6.
OpenClaw: Operating System for Agent‑Based Computing
OpenClaw, created by Peter Steinberger, was described as the operating system for agent computers, comparable in importance to HTML and Linux.
"OpenClaw open‑sources the operating system for agent computers. This is as important as HTML and Linux. Every company will need an OpenClaw strategy." "Enterprise security is a critical challenge for agents."
NVIDIA released NemoClaw, an enterprise‑grade security reference design that includes OpenShell, a policy engine, network guardrails, and privacy routing.
Foundation Models
NVIDIA announced Nemotron 3 Ultra, a foundation model for sovereign AI and fine‑tuning, and the Nemotron Coalition partnership with Mistral, Perplexity, Cursor, and LangChain.
Physical AI and Robotics
Three classes of computers for robotics were defined: training, simulation & synthetic data, and onboard inference.
Four new robotaxi partners were added (BYD, Hyundai, Nissan, and Geely), bringing the total coverage to 18 million vehicles per year, with Uber handling multi‑city deployments.
"The ChatGPT moment for autonomous driving has arrived."
Alpamayo was presented as the world’s first thinking and reasoning autonomous‑driving AI.
The Disney Olaf robot, built with the NVIDIA Newton physics engine and trained with deep reinforcement learning in Omniverse, demonstrated free walking, gestures, and interaction on stage.
Cloud Ecosystem Integration
Deep collaborations with major cloud providers were highlighted:
Google Cloud – Vertex AI, BigQuery acceleration (an 80 % cost reduction for Snapchat).
AWS – SageMaker, Bedrock (OpenAI on AWS).
Microsoft Azure – AI Foundry (OpenAI, Anthropic, Synopsys).
Oracle Cloud – Cohere, Fireworks, OpenAI.
CoreWeave – AI‑native cloud.
Palantir + Dell – on‑premise/air‑gapped AI deployments.
Nokia + T‑Mobile – Aerial AI‑RAN.
Approximately 100 new CUDA‑X libraries, 70 library updates, and 40 new models were announced.
Additional Announcements
Vera Rubin Space 1 – a space‑based data center concept.
NVIDIA DSX – an AI‑factory digital twin platform based on Omniverse, claimed to double efficiency.
OpenClaw and NemoClaw – agent operating system and security reference design.
Six open model families: Nemotron (language & reasoning), Cosmos (physics/world models), Alpamayo (autonomous driving), GR00T (general robotics), BioNeMo (biology/chemistry), Earth‑2 (weather/climate).