The Road to Billions of AI Agents: Key Takeaways from Matt Garman’s re:Invent 2025 Keynote

At AWS re:Invent 2025, CEO Matt Garman outlined four essential pillars for building AI agents, unveiled three frontier agents, introduced the Amazon Nova 2 model series and 25 major cloud service innovations, and argued that billions of agents will soon deliver ten‑fold efficiency gains across enterprises.

Amazon Cloud Developers

Keynote Overview

On the second day of re:Invent 2025, Amazon Web Services CEO Matt Garman delivered a keynote titled “How AWS Is Redefining the Future of Cloud Technology,” in which he presented a strategic view of agentic AI and announced a suite of new hardware, models, agents, and cloud services.

Four Core Elements for AI Agents

AI Infrastructure – AWS introduced the Amazon Trainium 3 UltraServers, its first servers built on a 3 nm AI chip, delivering 4.4× the compute performance, 3.9× the memory bandwidth, and 5× the tokens processed per megawatt of Trainium 2. The largest configuration packs 144 chips for 362 PFLOPS (FP8) and, running the GPT‑OSS‑120B model, achieves more than 5× the tokens per megawatt. AWS also previewed the upcoming Trainium 4, promising six‑fold FP4 performance, four‑fold memory bandwidth, and double the memory capacity.

Inference System – Amazon Bedrock's customer count has doubled year over year, and more than 50 customers have each processed over 1 trillion tokens. Bedrock offers a broad model catalog, custom‑model fine‑tuning, integrated data tools, safety guardrails, and deep integration with other AWS services.

Data – The keynote stressed that proprietary enterprise data becomes a competitive advantage only when it stays on premises or in private clouds, noting that third‑party models cannot be deeply adapted to specific business datasets.

Building Tools – AWS unveiled AgentCore (a modular system for building, deploying, and operating agents), the open‑training framework Amazon Nova Forge, and supporting components such as AgentCore Policy (real‑time deterministic control via the Cedar language) and AgentCore Evaluations (13 preview evaluators for continuous quality assurance).

New Hardware Highlights

In addition to Trainium 3, AWS announced the upcoming Trainium 4 chip, which will deliver six‑fold FP4 compute, four‑fold memory bandwidth, and double the memory capacity of Trainium 3. Over 1 million Trainium 2 chips have already been deployed to power Amazon Bedrock workloads, including Claude’s latest generation.
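As a rough sanity check on the UltraServer figures quoted above, the 362 PFLOPS (FP8) total can be divided across the 144 chips of the largest configuration. This is only a back-of-the-envelope sketch: it assumes the aggregate figure is a simple sum of per-chip peak compute, which the article does not state, and the function name is purely illustrative.

```python
# Figures quoted in the article for the largest Trainium 3 UltraServer.
CHIPS_PER_ULTRASERVER = 144
TOTAL_FP8_PFLOPS = 362


def per_chip_pflops(total_pflops: float, chips: int) -> float:
    """Average peak compute per chip, assuming the total is an even sum.

    Assumption (not from the article): no interconnect or scaling
    overhead is baked into the aggregate number.
    """
    return total_pflops / chips


per_chip = per_chip_pflops(TOTAL_FP8_PFLOPS, CHIPS_PER_ULTRASERVER)
print(f"{per_chip:.2f} PFLOPS (FP8) per chip")  # prints "2.51 PFLOPS (FP8) per chip"
```

Under that assumption, each chip contributes roughly 2.5 PFLOPS of FP8 compute.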

Amazon Nova 2 Model Series

Nova 2 Lite – Fast, cost‑effective reasoning model that matches Claude Haiku 4.5, GPT‑5 Mini, and Gemini 2.5 Flash on instruction following, tool use, code generation, and document extraction.

Nova 2 Pro – Handles highly complex workloads and outperforms GPT‑5.1 and Gemini 3 Pro on instruction following and intelligent tool usage.

Nova 2 Sonic – Next‑generation speech‑to‑speech model with industry‑leading latency and expanded language support.

Nova 2 Omni – First truly unified multimodal model supporting text, image, video, and audio inputs and generating both text and images, capable of summarizing entire meetings.

With Amazon Nova Forge, developers start from a Nova 2 Lite checkpoint that is 80% pre‑trained, blend proprietary enterprise data with AWS‑generated training data, and continue pre‑training with a provided recipe. They can then apply remote reward functions and reinforcement‑learning fine‑tuning before deploying the model on Bedrock.

Three Frontier Agents

Kiro Autonomous Agent – Automates software development tasks such as feature delivery, defect classification, and code‑coverage improvement. It integrates with Jira, GitHub, and Slack and learns team patterns. In an internal AWS pilot, a project that would normally require 30 engineers for 18 months was completed by six engineers in 76 days.

Amazon Security Agent – Embeds security checks early in the design phase, automatically reviews design documents, predicts risks before coding, identifies code vulnerabilities, and provides real‑time feedback via GitHub pull‑request integration, eliminating costly re‑writes and reducing reliance on external consultants.

Amazon DevOps Agent – Acts like a senior DevOps engineer, analyzing incidents, identifying optimization points across resources, telemetry, codebases, and CI/CD pipelines, and proactively preventing future incidents.
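The Kiro pilot numbers above can be turned into a rough effort comparison. The sketch below assumes the 76 days are calendar days and uses an average month of 30.4375 days; neither detail is given in the article, so the ratios are indicative only.

```python
# Assumption: average calendar month length (365.25 / 12); the article does
# not say whether "76 days" means calendar days or working days.
DAYS_PER_MONTH = 365.25 / 12

# Figures quoted in the article for the internal AWS pilot.
baseline_engineers, baseline_months = 30, 18
pilot_engineers, pilot_days = 6, 76

baseline_days = baseline_months * DAYS_PER_MONTH
baseline_effort = baseline_engineers * baseline_days  # engineer-days
pilot_effort = pilot_engineers * pilot_days           # engineer-days

effort_ratio = baseline_effort / pilot_effort
schedule_ratio = baseline_days / pilot_days

print(f"Effort reduction:   {effort_ratio:.1f}x")    # prints "Effort reduction:   36.0x"
print(f"Schedule reduction: {schedule_ratio:.1f}x")  # prints "Schedule reduction: 7.2x"
```

Read this way, the pilot used about one thirty-sixth of the engineering effort and about one seventh of the calendar time of the baseline estimate.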

Customer Case Studies

Vialet (Biotech) – Trained a “scientific multi‑task” AI on AWS infrastructure, processing tens of trillions of scientific reasoning tokens and accelerating drug‑discovery experiments, with an expected 100× growth in token volume.

Greatdeal (Content Marketing) – Orchestrated over 20 steps and multiple specialist roles using Amazon Bedrock and Nova agents, dramatically speeding up content production.

Sony Data Ocean – Processes 760 TB daily from more than 500 sources, serving 57,000 Bedrock users and more than 150,000 inference requests per day; uses Bedrock AgentCore to centralize AI capabilities and achieve a 100× efficiency boost in compliance reviews.

Reddit – Integrated proprietary community data into Nova Forge to create a custom model that meets both accuracy and cost targets for content moderation, combining general language understanding with Reddit‑specific knowledge.
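To put the Sony Data Ocean figures above on a per-second scale, the daily totals can be converted into average rates. This is a minimal sketch that assumes decimal terabytes and perfectly even load across the day; since the article gives the request count only as a lower bound, the request rates are lower bounds too.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # assumption: decimal terabytes
BYTES_PER_GB = 10**9

# Figures quoted in the article.
DAILY_VOLUME_TB = 760
DAILY_REQUESTS = 150_000  # stated as "more than", so treat as a floor
BEDROCK_USERS = 57_000

ingest_gb_per_s = DAILY_VOLUME_TB * BYTES_PER_TB / SECONDS_PER_DAY / BYTES_PER_GB
requests_per_s = DAILY_REQUESTS / SECONDS_PER_DAY
requests_per_user = DAILY_REQUESTS / BEDROCK_USERS

print(f"Average ingest:        {ingest_gb_per_s:.2f} GB/s")       # 8.80 GB/s
print(f"Average request rate:  {requests_per_s:.2f} req/s")       # 1.74 req/s
print(f"Requests per user/day: {requests_per_user:.2f}")          # 2.63
```

The averages suggest the platform is dominated by data volume rather than request rate: close to 9 GB/s of sustained ingest against fewer than two inference requests per second.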

25 Core Service Innovations

Compute – New X‑series large‑memory instances (50 % more memory), C8a (30 % performance uplift on AMD EPYC), C8ine (2.5× packet‑processing per vCPU), M8azn (highest CPU clock frequency), EC2 M3 Ultra Mac and M4 Max Mac (Apple silicon), and Lambda durable functions for stateful, long‑running workloads.

Storage – Amazon S3 object size limit raised from 5 TB to 50 TB, batch operations 10× faster, intelligent tiering saves up to 80 % of storage costs, cross‑region table replication, FSx for NetApp ONTAP support, and native vector storage with 90 % cost reduction.

Database – RDS for SQL Server/Oracle capacity increased to 256 TB (4× IOPS/bandwidth), vCPU‑based licensing for SQL Server, support for SQL Server Developer edition, and Database Savings Plans delivering up to 35 % savings.

Other Services – Amazon EMR Serverless removes the need for local storage, GuardDuty now protects ECS containers, Security Hub adds near‑real‑time risk analysis, and CloudWatch unifies operational, security, and compliance data into a single S3‑backed store.

Vision for the Agentic AI Era

Garman concluded that AI agents are transitioning from “technical miracles” to core production tools that can multiply human impact tenfold. By delivering a full stack of infrastructure, models, data tools, and agent‑building platforms, AWS aims to enable billions of agents to run across industries, delivering tenfold efficiency gains and reshaping enterprise operations.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
