JD Donates Oxygen xLLM: Open‑Source Large‑Model Inference Engine Boosts China’s AI Infrastructure
JD announced the donation of its Oxygen xLLM inference engine to the OpenAtom Open‑Source Foundation, detailing its service‑engine decoupled architecture, performance breakthroughs across e‑commerce, power and public‑safety workloads, and a roadmap to expand the open‑source AI ecosystem.
Open‑source donation
On 24 June 2026 JD transferred the Oxygen xLLM large‑model inference engine—including copyrights, patents, trademarks and related rights—to the OpenAtom Open‑Source Foundation under the Apache 2.0 license.
Engineering‑Intelligence vision
The next stage of AI infrastructure is described as Engineering Intelligence (EI): a stack that can sense workload characteristics, generate optimal execution plans automatically, and perform self‑aware scheduling and end‑to‑end self‑optimisation.
Architecture
Oxygen xLLM adopts a service‑engine decoupling architecture.
Service layer (xLLM‑Service) : unified elastic scheduling for online and offline tasks, dynamic PD (parameter‑distribution) separation to handle traffic spikes, global KV cache and fast fault recovery for large‑scale production stability.
Engine layer (xLLM‑Engine) : multi‑level pipelines that overlap compute and communication, adaptive graph mode and efficient memory management to handle dynamic inputs and GPU memory allocation, specialised optimisations for MoE, speculative decoding and generative‑recommendation scenarios.
Hardware and access
Provides a unified AI Gateway and an OpenAI‑compatible SDK. Native execution is supported on GPU, NPU and MLU, covering a wide range of domestic AI chips.
Technical highlights
Architectural innovation – service‑engine decoupling enables independent evolution of scheduling and computation.
Performance breakthrough – multi‑level pipelines, adaptive graph mode and dynamic PD separation significantly improve throughput and resource utilisation while meeting strict SLO constraints, surpassing existing state‑of‑the‑art inference frameworks.
Heterogeneous unification – a unified inference abstraction masks hardware and model differences, supporting LLM, VLM, DiT, text‑to‑image/video and generative‑recommendation models on mixed domestic chips.
High‑availability guarantees – global KV cache management, distributed fast‑fail recovery, health monitoring and automatic inspection ensure stable large‑scale production.
Domestic adaptation – a single framework covers multiple domestic chips, filling the gap of “heterogeneous chip unified inference” and lowering deployment barriers.
Industrial validation
In JD e‑commerce customer‑service models, cluster utilisation increased by more than 35 % and P99 latency decreased by 28 %.
In power‑inspection scenarios, efficiency improved three‑fold, outage rates fell by 30 %, and emergency‑repair speed increased by 20 %.
In public‑safety edge inference, inspection efficiency grew by 227 %, concurrency rose by 127 %, and time‑to‑first‑trace was cut by 50 %.
Community adoption
The project’s GitHub repository https://github.com/jd-opensource/xllm has attracted over 1.4 k stars, 235 forks, and participation from major domestic chip and model vendors.
Roadmap
Planned milestones for 2026 include full multimodal support (text‑to‑image, video, Omni), comprehensive adaptation of mainstream domestic chips, and the launch of commercial enterprise services. By 2027 the contributor base is expected to reach around 200, with a focus on industry penetration and establishing Oxygen xLLM as a de‑facto standard for domestic chip inference.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
