DiDi IFX AI Inference Platform: Architecture, Performance, and Productization

DiDi’s IFX AI inference platform, in development since 2018, uses a four-layer architecture (access, software, engine, and compute) to deliver inference in the cloud, at the edge, and on devices. It combines high-performance kernel optimizations, model and binary compression, uniform multi-framework deployment, automated testing, and end-to-end security, and it serves billions of inference calls per day.

Didi Tech

DiDi’s Machine Learning Platform team introduces the IFX inference engine platform, which has been in production for over two years and now serves millions of devices and billions of daily inference calls across a wide range of business scenarios such as safety, mapping, in‑vehicle cameras, driver‑assist, and international rider apps.

Background – Deep learning has matured rapidly, and AI is now pervasive. DiDi holds massive volumes of travel data and operates a large fleet of driver and rider mobile devices, vehicle-mounted cameras, and GPU clusters, which makes this a golden era for cloud-, edge-, and device-side AI. Development of the IFX platform began in September 2018, and it has been available internally since December 2018. It now powers critical order and onboarding flows, payment binding, identity verification, financial security, high-risk detection, cost attribution, collision detection, navigation, and map updates.

Architecture – The platform is organized into four layers:

1. Access Layer: Connects business services to the platform, collects inference and authorization telemetry, and offers SDKs for local inference plus standard HTTP/Thrift/gRPC APIs for remote inference.

2. Software Layer: Handles model parsing and management, including model slimming, encryption, version control, and automated testing that checks the inference model stays consistent with the training model.

3. Engine Layer: Focuses on engine-level work such as performance diagnostics, engine slimming, operator optimizations (low-precision computation, graph optimization, heterogeneous scheduling, assembly tuning), and system-level improvements (I/O, pre- and post-processing).

4. Compute Layer: Supports a variety of hardware (NVIDIA GPUs, ARM, X86, Cambricon, etc.) for cloud, edge, and device deployments.
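
As a rough illustration of how the Access Layer could expose local and remote inference behind a single call, here is a minimal Python sketch. Every name in it (`AccessLayer`, `LocalBackend`, `RemoteBackend`, the injected `transport` function, and the request/response fields) is hypothetical and chosen for illustration; the article does not describe the actual IFX SDK or service APIs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


class Backend(Protocol):
    """Anything that can turn an input vector into an output vector."""

    def infer(self, inputs: Sequence[float]) -> list[float]: ...


@dataclass
class LocalBackend:
    """Stands in for an on-device SDK: the model runs in-process."""

    model: Callable[[Sequence[float]], list[float]]

    def infer(self, inputs: Sequence[float]) -> list[float]:
        return self.model(inputs)


@dataclass
class RemoteBackend:
    """Stands in for an HTTP/Thrift/gRPC client.

    The transport is injected so the sketch runs without a network;
    the request and response shapes are assumptions, not an IFX format.
    """

    transport: Callable[[dict], dict]
    model_name: str

    def infer(self, inputs: Sequence[float]) -> list[float]:
        reply = self.transport({"model": self.model_name, "inputs": list(inputs)})
        return list(reply["outputs"])


@dataclass
class AccessLayer:
    """Single entry point for callers, with a simple call counter as telemetry."""

    backend: Backend
    calls: int = 0

    def infer(self, inputs: Sequence[float]) -> list[float]:
        self.calls += 1
        return self.backend.infer(inputs)


def toy_model(inputs: Sequence[float]) -> list[float]:
    # Placeholder "model": doubles every input value.
    return [2.0 * x for x in inputs]


def fake_transport(request: dict) -> dict:
    # Pretends to be a remote service hosting the same toy model.
    return {"outputs": toy_model(request["inputs"])}


on_device = AccessLayer(LocalBackend(toy_model))
in_cloud = AccessLayer(RemoteBackend(fake_transport, model_name="toy"))

print(on_device.infer([1.0, 2.0]))  # [2.0, 4.0]
print(in_cloud.infer([1.0, 2.0]))   # [2.0, 4.0]
```

The caller's code is identical in both cases; only the backend passed to `AccessLayer` changes, which is the kind of separation the layered design suggests.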

Productization – Building on the architecture, the IFX team delivers a systematic AI deployment solution with six focus areas:

High Performance: Assembly-level kernel optimizations (40-200% speedup) and full-stack optimizations (30-260% improvement), measured in local and service-level benchmarks.

Elegance: Model compression (<25% size reduction without accuracy loss) and engine binary compression (~50% reduction) to shrink SDK size and improve user experience.

Uniformity: A unified deployment approach that allows the same model to run on cloud, edge, and device targets.

Multi-Framework Support: Compatibility with TensorFlow, PyTorch, Caffe, Darknet, etc., so that models from these frameworks can be converted to the IFX format.

Automation: Automated SDK generation, service load testing, model correctness evaluation, and power/CPU load testing.

Security: End-to-end protection including offline/online authorization, code obfuscation for iOS/Android/Linux SDKs, engine-level encryption and anti-debugging, and model file encryption.
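
The "model correctness evaluation" item under Automation can be pictured as comparing a reference model's outputs with those of the deployed engine on the same samples. The sketch below is one simple way to do that, not IFX's actual tooling: the function names, the tolerance values, and the toy "low-precision" model (which just rounds its inputs) are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], List[float]]


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    if len(a) != len(b):
        raise ValueError("output shapes differ")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def check_consistency(
    reference: Model,
    candidate: Model,
    samples: Sequence[Sequence[float]],
    atol: float = 1e-3,
) -> List[Tuple[int, float]]:
    """Return (sample index, difference) for every sample that exceeds atol."""
    failures = []
    for index, sample in enumerate(samples):
        diff = max_abs_diff(reference(sample), candidate(sample))
        if diff > atol:
            failures.append((index, diff))
    return failures


def reference_model(x: Sequence[float]) -> List[float]:
    # Toy stand-in for the training-side model: a numerically stable softmax.
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def low_precision_model(x: Sequence[float]) -> List[float]:
    # Toy stand-in for an optimized engine: inputs are rounded to two
    # decimals to mimic the small errors reduced precision can introduce.
    return reference_model([round(v, 2) for v in x])


samples = [
    [0.1234, 1.5678, -0.4321],
    [2.0, 2.0, 2.0],
    [10.0, -10.0, 0.0],
]

loose = check_consistency(reference_model, low_precision_model, samples, atol=1e-2)
strict = check_consistency(reference_model, low_precision_model, samples, atol=1e-4)
print("failures at atol=1e-2:", loose)
print("failing sample indices at atol=1e-4:", [index for index, _ in strict])
```

Which tolerance counts as "no accuracy loss" depends on the task; a production check would more likely compare task-level metrics on a held-out dataset than raw output vectors.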

Conclusion – IFX has already been adopted by many internal DiDi services, yet there remain inefficiencies to address. Future work includes fully online development‑to‑production pipelines, unified development environments, and integrated testing, validation, analysis, and release processes.

