
JD Donates Oxygen xLLM Inference Engine to OpenAtom, Boosting China’s AI Infra Ecosystem

On June 24, 2026, JD announced the donation of its Oxygen xLLM large‑model inference engine to the OpenAtom Open Source Foundation. The announcement detailed the engine's service‑engine decoupled architecture, performance gains, heterogeneous chip support, and production results in e‑commerce, power‑grid and public‑safety applications, and outlined a roadmap for broader ecosystem co‑building and standards leadership.

JD Tech Talk

Jun 25, 2026


Open‑source release

On 24 June 2026, JD transferred the copyright, patents, trademark and related rights of its self‑developed large‑model inference engine, Oxygen xLLM, to the OpenAtom Open Source Foundation. The project is released under the Apache 2.0 license.

Architecture

Oxygen xLLM adopts a service‑engine decoupled design. The xLLM‑Service layer provides unified scheduling of online and offline tasks, dynamic PD (prefill/decode) separation to absorb traffic spikes, a global KV cache, and fast fault recovery to keep large‑scale production deployments stable. The xLLM‑Engine layer implements multi‑stage pipelines that overlap computation with communication, adaptive graph execution modes, efficient memory management, and optimizations for MoE, speculative decoding and generative‑recommendation workloads.
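To make the idea of dynamic PD separation concrete, here is a minimal sketch of a service‑layer rebalancer that moves an instance between the prefill and decode pools when one stage's backlog grows much larger than the other's. The `Instance` class, the `rebalance` function and the `ratio` threshold are all hypothetical names chosen for illustration; the article does not describe xLLM's actual policy or API.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str  # "prefill" or "decode"


def rebalance(instances, prefill_queue, decode_queue, ratio=2.0):
    """Flip at most one instance toward the more heavily loaded stage.

    Hypothetical policy: compare per-instance backlog of the two stages
    and move one instance when the imbalance exceeds `ratio`.
    """
    prefill = [i for i in instances if i.role == "prefill"]
    decode = [i for i in instances if i.role == "decode"]
    # Average number of waiting requests per instance in each stage.
    p_load = prefill_queue / max(len(prefill), 1)
    d_load = decode_queue / max(len(decode), 1)
    if p_load > ratio * d_load and len(decode) > 1:
        decode[-1].role = "prefill"   # borrow a decode instance
    elif d_load > ratio * p_load and len(prefill) > 1:
        prefill[-1].role = "decode"   # borrow a prefill instance
    return instances


pool = [
    Instance("a", "prefill"),
    Instance("b", "decode"),
    Instance("c", "decode"),
    Instance("d", "decode"),
]
# A burst of long prompts piles up in the prefill queue.
rebalance(pool, prefill_queue=40, decode_queue=6)
roles = [i.role for i in pool]  # instance "d" now serves prefill
```

Because the decision is made in the service layer and the engines only receive a role, the scheduling policy can change without touching execution code, which is the point of the decoupled design.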

Hardware abstraction hides differences among GPU, NPU and MLU, allowing mixed deployment of LLM, VLM, DiT, text‑to‑image/video and generative‑recommendation models on various domestic AI chips.
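One way to picture such an abstraction layer is a common backend interface with one implementation per device family, selected through a registry. The sketch below is an assumption made for illustration only: `Backend`, `REGISTRY` and `run` are hypothetical names, and the string results stand in for real kernel launches.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical device interface; not xLLM's real API."""

    name: str

    @abstractmethod
    def execute(self, model_kind: str, batch_size: int) -> str:
        """Run one batch and describe what ran where."""


class GpuBackend(Backend):
    name = "gpu"

    def execute(self, model_kind, batch_size):
        return f"{model_kind} batch={batch_size} on gpu kernels"


class NpuBackend(Backend):
    name = "npu"

    def execute(self, model_kind, batch_size):
        return f"{model_kind} batch={batch_size} on npu kernels"


class MluBackend(Backend):
    name = "mlu"

    def execute(self, model_kind, batch_size):
        return f"{model_kind} batch={batch_size} on mlu kernels"


# Callers look up a backend by device name and never import
# device-specific code directly.
REGISTRY = {b.name: b for b in (GpuBackend(), NpuBackend(), MluBackend())}


def run(device: str, model_kind: str, batch_size: int) -> str:
    try:
        backend = REGISTRY[device]
    except KeyError:
        raise ValueError(f"unsupported device: {device}") from None
    return backend.execute(model_kind, batch_size)


result = run("npu", "VLM", 8)
```

Adding a new chip then means registering one more `Backend` subclass, while the scheduling code and model definitions stay unchanged.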

Key technical capabilities

Architecture innovation – described as the first inference framework to separate scheduling (service) from execution (engine), so that the two components can evolve independently.

Performance breakthrough – multi‑level pipelines, adaptive graph mode and dynamic PD separation increase throughput and resource utilization while meeting strict SLO constraints; JD reports that this surpasses existing state‑of‑the‑art inference frameworks.

Heterogeneous unification – a unified inference abstraction layer masks hardware and model differences, supporting multiple model types and mixed deployment of domestic chips.

High‑availability – global KV‑cache management, distributed fast‑recovery, health monitoring and automatic inspection ensure stable operation at production scale.

Domestic‑chip adaptation – a single framework covers a range of Chinese AI chips, filling the gap of “heterogeneous‑chip unified inference”.
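The global KV‑cache management mentioned above is not specified in detail, but a common technique in this space is prefix‑aware routing: the service keeps an index of which engine holds cached blocks for which token prefixes and sends a request to the engine with the longest match. The sketch below assumes that approach; `GlobalKVIndex`, `block_hashes`, the engine names and the tiny block size are illustrative and do not come from xLLM.

```python
import hashlib

BLOCK = 4  # tokens per cache block (tiny, for illustration)


def block_hashes(tokens):
    """Chain-hash full blocks so each hash identifies the entire prefix."""
    hashes, prev = [], b""
    full = len(tokens) - len(tokens) % BLOCK  # ignore the partial tail block
    for start in range(0, full, BLOCK):
        chunk = ",".join(map(str, tokens[start:start + BLOCK])).encode()
        prev = hashlib.sha256(prev + chunk).digest()
        hashes.append(prev)
    return hashes


class GlobalKVIndex:
    """Hypothetical service-side map from prefix blocks to engines."""

    def __init__(self):
        self.owners = {}  # block hash -> set of engine names

    def publish(self, engine, tokens):
        """Record that `engine` holds KV blocks for this token sequence."""
        for h in block_hashes(tokens):
            self.owners.setdefault(h, set()).add(engine)

    def best_engine(self, tokens, default):
        """Return (engine, cached_token_count) for the longest cached prefix."""
        best, depth = default, 0
        for i, h in enumerate(block_hashes(tokens), start=1):
            holders = self.owners.get(h)
            if not holders:
                break
            best, depth = sorted(holders)[0], i
        return best, depth * BLOCK


index = GlobalKVIndex()
index.publish("engine-0", [1, 2, 3, 4, 5, 6, 7, 8])
index.publish("engine-1", [1, 2, 3, 4, 9, 9, 9, 9])
hit = index.best_engine([1, 2, 3, 4, 5, 6, 7, 8, 10, 11], default="engine-2")
miss = index.best_engine([7, 7, 7, 7], default="engine-2")
```

Here `hit` selects `engine-0` with 8 reusable tokens, while `miss` falls back to the default engine with nothing cached. A production index would also need eviction and staleness handling, which this sketch omits.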

Industrial validation

E‑commerce customer service: cluster utilization increased by more than 35% and P99 latency decreased by 28% during high‑traffic promotions.

Power‑grid inspection: inspection efficiency improved roughly 3×, the outage rate fell by 30%, and emergency‑repair speed rose by 20%.

Public‑safety edge inference: inspection efficiency grew by 227%, concurrent requests rose by 127%, and time‑to‑first‑token shortened by 50%.

Community adoption

Since it was open‑sourced, the project has received over 1.4k GitHub stars, 235 forks, and contributions from major domestic chip and model vendors.

Roadmap

Planned milestones for the end of 2026 include full multimodal support (text‑to‑image, text‑to‑video and Omni models), comprehensive adaptation to mainstream Chinese chips, and a commercial enterprise edition. An “industry penetration and standard‑leadership” phase follows in 2027, with the goal of making Oxygen xLLM the de facto standard for inference on Chinese AI chips.

Repository

https://github.com/jd-opensource/xllm


