IFX: Didi’s In‑House AI Inference Engine Platform – Architecture, Productization, and Performance

This article introduces Didi's IFX platform: its background, its four‑layer architecture (access, software, engine, compute), and its productization work, which covers high‑performance optimization, model and engine compression, unified deployment across hardware, multi‑framework support, automation, and security. It closes with future plans.

DataFunTalk

Background – With the rapid development of artificial‑intelligence technologies, deep learning has become pervasive in industry. Didi draws on massive ride‑hailing data, driver‑side devices, in‑car cameras, and GPU clusters to build a cloud‑edge‑device AI ecosystem. The Didi Machine‑Learning team began building its self‑developed inference engine platform, IFX, in September 2018. IFX went live internally in December 2018 and now serves millions of devices, with daily call volumes exceeding ten trillion.

Architecture

Access Layer – Provides SDKs for local inference in various programming languages and standard service APIs (HTTP/Thrift/gRPC) for remote inference, along with authorization and telemetry that reports device and inference metrics.
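One way to picture the split between local SDK inference and remote service inference is a single client interface with two implementations. The sketch below is only an illustration: `InferenceClient`, `LocalClient`, `RemoteClient`, and the JSON payload shape are hypothetical names, not the IFX SDK, and the network transport is replaced by an in-process function so the example runs on its own.

```python
import json
from abc import ABC, abstractmethod
from typing import Callable, List


class InferenceClient(ABC):
    """Hypothetical common interface; not the real IFX SDK API."""

    @abstractmethod
    def infer(self, inputs: List[float]) -> List[float]:
        ...


class LocalClient(InferenceClient):
    """Runs the model in-process, as an on-device SDK would."""

    def __init__(self, model: Callable[[List[float]], List[float]]):
        self._model = model

    def infer(self, inputs: List[float]) -> List[float]:
        return self._model(inputs)


class RemoteClient(InferenceClient):
    """Serializes the request and hands it to a transport (HTTP, Thrift, gRPC...)."""

    def __init__(self, send: Callable[[bytes], bytes]):
        self._send = send  # assumed transport: bytes in, bytes out

    def infer(self, inputs: List[float]) -> List[float]:
        reply = self._send(json.dumps({"inputs": inputs}).encode("utf-8"))
        return json.loads(reply.decode("utf-8"))["outputs"]


def double(xs: List[float]) -> List[float]:
    """Toy stand-in for a model."""
    return [2 * x for x in xs]


def fake_server(payload: bytes) -> bytes:
    """In-process stand-in for a remote inference service."""
    request = json.loads(payload.decode("utf-8"))
    return json.dumps({"outputs": double(request["inputs"])}).encode("utf-8")


local = LocalClient(double)
remote = RemoteClient(fake_server)
```

Calling code can depend on `InferenceClient` alone, so switching between on-device and remote inference does not change the caller.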

Software Layer – Handles model parsing and management, offering model slimming, encryption, version control, and automated testing to ensure consistency between training and inference models and to evaluate performance on target hardware.
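The consistency check between training and inference models can be sketched as an element‑wise comparison of the two outputs on the same input. The function names and tolerance values below are assumptions chosen for illustration; the article does not say which thresholds IFX uses.

```python
import math
from typing import Sequence


def outputs_match(reference: Sequence[float],
                  candidate: Sequence[float],
                  rel_tol: float = 1e-4,
                  abs_tol: float = 1e-5) -> bool:
    """Return True if every inference output is close to the training output."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


def max_abs_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise deviation, useful for reporting a failed check."""
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)
```

In an automated test, `reference` would come from the training framework and `candidate` from the converted model running on the target hardware.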

Engine Layer – Centralizes engine‑level optimizations: performance diagnostics, engine slimming and obfuscation, operator optimizations (low‑precision, graph, heterogeneous scheduling, assembly‑level auto‑tuning), and system‑level improvements such as scheduling, I/O, and pre/post‑processing.
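As an illustration of the low‑precision optimization mentioned above, the following is a minimal sketch of symmetric per‑tensor int8 quantization. It is a generic textbook scheme, not a description of how IFX quantizes; the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte instead of four, and the round‑trip error per weight is bounded by about half the scale.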

Compute Layer – Supports a wide range of hardware (NVIDIA GPUs, ARM, x86, Cambricon, etc.) across cloud, edge, and device scenarios.

Productization

High Performance – Assembly‑level kernel optimizations and full‑stack (pre‑/post‑processing, network) improvements yield 40‑200% model speedups and 30‑260% service‑level gains.
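To make these percentages concrete, the sketch below converts a speedup figure into a latency, assuming that "X% speedup" means throughput rises by X% (so latency is divided by 1 + X/100). The article does not define the metric, so this reading is an assumption, and the 30 ms baseline is an invented example value.

```python
def latency_after_speedup(baseline_ms: float, speedup_percent: float) -> float:
    """Latency after a speedup, assuming speedup is measured on throughput."""
    if speedup_percent <= -100.0:
        raise ValueError("speedup_percent must be greater than -100")
    return baseline_ms / (1.0 + speedup_percent / 100.0)


# Hypothetical 30 ms baseline at both ends of the reported 40-200% range.
for pct in (40.0, 200.0):
    after = latency_after_speedup(30.0, pct)
    print(f"{pct:.0f}% speedup: 30.0 ms -> {after:.1f} ms")
```

Under this reading, a 200% speedup triples throughput, which would cut a 30 ms inference to 10 ms.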

Compactness – Model compression (<25% size reduction without accuracy loss) and binary ELF compression (~50% SDK size reduction) reduce app package size and improve user experience.

Uniformity – A single model can be deployed to diverse hardware platforms using a unified deployment scheme.
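A unified deployment scheme of this kind is often built around a backend registry: one model description, and one builder per hardware target. The sketch below shows the pattern with a toy dot‑product "model". The registry, the decorator, and the backend names are hypothetical and only stand in for whatever mechanism IFX actually uses.

```python
from typing import Callable, Dict, List

Model = List[float]                      # toy model: weights of a dot product
Runner = Callable[[List[float]], float]  # compiled, ready-to-call model

BACKENDS: Dict[str, Callable[[Model], Runner]] = {}


def register_backend(name: str):
    """Decorator that records a builder function for one hardware target."""
    def wrap(builder: Callable[[Model], Runner]) -> Callable[[Model], Runner]:
        BACKENDS[name] = builder
        return builder
    return wrap


@register_backend("x86")
def build_x86(model: Model) -> Runner:
    return lambda x: sum(w * v for w, v in zip(model, x))


@register_backend("arm")
def build_arm(model: Model) -> Runner:
    # A different loop order stands in for a platform-specific kernel.
    def run(x: List[float]) -> float:
        total = 0.0
        for w, v in zip(reversed(model), reversed(x)):
            total += w * v
        return total
    return run


def deploy(model: Model, target: str) -> Runner:
    """Build the same model for the requested target."""
    if target not in BACKENDS:
        raise ValueError(f"unsupported target: {target}")
    return BACKENDS[target](model)
```

The caller passes the same `model` to `deploy` for every target; only the `target` string changes.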

Multi‑Framework Support – IFX converts models from TensorFlow, PyTorch, Caffe, Darknet, etc., ensuring compatibility and smooth upgrades.
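Conversion from several frameworks usually means mapping each framework's operator names onto one internal representation. The sketch below shows that idea with a deliberately tiny table. The internal names `conv2d` and `pool2d` are made up for this example, and the table is illustrative, not IFX's actual converter or a complete operator mapping.

```python
from typing import Dict, List

# Framework-specific operator name -> hypothetical internal operator name.
OP_TABLE: Dict[str, Dict[str, str]] = {
    "tensorflow": {"Conv2D": "conv2d", "MaxPool": "pool2d"},
    "pytorch": {"Conv2d": "conv2d", "MaxPool2d": "pool2d"},
    "caffe": {"Convolution": "conv2d", "Pooling": "pool2d"},
    "darknet": {"convolutional": "conv2d", "maxpool": "pool2d"},
}


def to_common_ir(framework: str, ops: List[str]) -> List[str]:
    """Translate a list of framework operator names into internal names."""
    table = OP_TABLE.get(framework.lower())
    if table is None:
        raise ValueError(f"unknown framework: {framework}")
    unsupported = [op for op in ops if op not in table]
    if unsupported:
        # Fail loudly rather than silently dropping layers.
        raise NotImplementedError(f"unsupported operators: {unsupported}")
    return [table[op] for op in ops]
```

Once every model is expressed in the same internal operators, the downstream optimization and deployment steps do not need to know which framework produced it.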

Automation – Automates SDK generation, service load testing, model correctness verification, and power/CPU‑load testing.
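An automated load test can be as simple as calling the inference function repeatedly and summarizing the latencies. The following is a minimal single‑threaded sketch; `load_test`, its parameters, and the report keys are hypothetical, and a real service test would also drive concurrent requests over the network.

```python
import statistics
import time
from typing import Callable, Dict


def load_test(fn: Callable[[], object], requests: int, warmup: int = 5) -> Dict[str, float]:
    """Call fn repeatedly and report latency statistics in milliseconds."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    for _ in range(warmup):  # warm caches before measuring
        fn()
    samples = []
    for _ in range(requests):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank style index for the 99th percentile, clamped to the list.
    p99_index = min(len(samples) - 1, int(0.99 * len(samples)))
    return {
        "requests": requests,
        "mean_ms": statistics.fmean(samples),
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[p99_index],
    }


report = load_test(lambda: sum(range(1000)), requests=50)
```

Tracking a tail percentile alongside the mean helps catch regressions that affect only a small share of requests.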

Security – Implements offline/online authorization, code obfuscation for iOS/Android/Linux, function‑level encryption in the engine, and model file encryption to protect AI assets.
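Offline authorization is commonly implemented as a signed license that the device can verify without contacting a server. The sketch below uses an HMAC over a device ID and expiry time. It is a generic pattern under stated assumptions: the license format, the function names, and the use of a shared secret are hypothetical, and the article does not describe IFX's actual scheme.

```python
import hashlib
import hmac


def sign_license(secret: bytes, device_id: str, expires_at: int) -> str:
    """Issue a signature binding a device ID to an expiry (Unix seconds)."""
    message = f"{device_id}|{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_license(secret: bytes, device_id: str, expires_at: int,
                   signature: str, now: int) -> bool:
    """Check the signature and expiry locally, with no network call."""
    expected = sign_license(secret, device_id, expires_at)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature) and now < expires_at
```

A shared secret embedded in a client can be extracted, which is one reason such checks are typically paired with the obfuscation and encryption measures listed above.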

Conclusion – IFX now powers many internal Didi services, yet several inefficiencies remain. The team plans to further automate the development‑to‑production pipeline, unify the development environment, and integrate testing, verification, analysis, and deployment processes.
