
OpenAI Unveils Its Own AI Inference Chip: What It Means for the Industry

OpenAI has partnered with Broadcom to launch Jalapeño, a purpose‑built AI inference ASIC designed in nine months, promising superior performance‑per‑watt, integrated networking, and a full‑stack AI hardware‑software optimization cycle that could lower inference costs and reshape future data‑center deployments.

Old Zhang's AI Learning

Jun 24, 2026

OpenAI announced its first self‑designed inference ASIC, Jalapeño, co‑developed with Broadcom.

What the chip is

Jalapeño is an inference-only chip built from scratch for large-model inference. Early tests are claimed to show performance per watt far beyond the current state of the art, though exact numbers have not yet been released.

Design to tape-out in nine months, claimed to be the fastest ASIC development cycle in the history of high-performance semiconductors.

Engineering samples run GPT‑5.3‑Codex‑Spark workloads at target frequency and power.

Broadcom's Tomahawk networking chip supports deployment in massive clusters.

Design philosophy

The chip aims to reduce data movement and to balance compute, memory, and network resources so that actual utilization approaches the theoretical peak. By comparison, most GPUs are said to achieve less than 50% utilization on large-model inference.
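To see why utilization matters as much as peak performance, here is a minimal Python sketch. All figures in it are hypothetical: the peak throughput and the 80% utilization are illustrative assumptions, not published Jalapeño or GPU specifications. The only number taken from the article is the 50% utilization figure cited for GPUs.

```python
def effective_throughput(peak_tflops: float, utilization: float) -> float:
    """Useful throughput delivered at a given fraction of theoretical peak."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return peak_tflops * utilization


def relative_cost_per_useful_flop(utilization: float) -> float:
    """Cost of useful work relative to the same chip running at 100% of peak."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


# Hypothetical chip: the peak value is an assumption for illustration only.
PEAK_TFLOPS = 1000.0

at_half = effective_throughput(PEAK_TFLOPS, 0.5)    # the article's GPU figure
at_eighty = effective_throughput(PEAK_TFLOPS, 0.8)  # assumed better-balanced design

print(f"Useful throughput at 50% utilization: {at_half:.0f} TFLOPS")
print(f"Useful throughput at 80% utilization: {at_eighty:.0f} TFLOPS")
print(f"Gain from utilization alone: {at_eighty / at_half:.2f}x")
print(f"Cost multiplier at 50% utilization: {relative_cost_per_useful_flop(0.5):.1f}x")
```

Under these assumptions, a chip stuck at 50% utilization effectively pays twice for every unit of useful work, so raising utilization can cut inference cost even when the peak specification stays the same.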

Why OpenAI builds its own chip

OpenAI's inference demand across ChatGPT, Codex, the API, and agents is enormous, and relying on NVIDIA GPUs poses three problems: high prices (especially during H100/B200 shortages), unstable supply, and a general-purpose architecture whose capabilities are partly wasted on inference. A custom ASIC can address all three through tight software-hardware co-optimization.

9‑month tape‑out: AI designs AI chips

Typical high-performance ASICs take 18–24 months. OpenAI cut this to nine months, half the low end of that range, reportedly by using its own models to accelerate chip design. The result is a virtuous flywheel:

Better models help design better chips.

Better chips run better models more efficiently.

Better models lead to better products, more revenue, and investment in the next generation of chips.

Deployment plan: gigawatt‑scale

OpenAI plans to start deploying the first‑generation Jalapeño by the end of 2026, building gigawatt‑level data centers together with Microsoft and other partners. Broadcom will handle chip implementation and networking, while Celestica will provide boards, racks and system integration.

Implications for users

Faster, cheaper inference means quicker ChatGPT responses, more steps for Codex, lower API costs, and less throttling during peak usage. OpenAI states that its goal is to make advanced models affordable and reliable for everyone. The short-term impact on NVIDIA is expected to be limited.

Outlook

OpenAI’s move signals its ambition to become a full‑stack AI infrastructure company. The real test will be how second‑ and third‑generation chips reshape the inference market in the next two to three years.

