Didi’s Evolution of Testing Environments: From All‑in‑One to Fastdev, TiP, and OSim

Didi's testing environments evolved from a single All‑in‑One environment to Fastdev traffic recording and replay, Test‑in‑Production (TiP) with side‑car proxies, and offline simulation (OSim). Each mode addresses a different mix of scale, fidelity, and safety concerns, and all of them remain necessary for reliable testing as the platform grows.

Didi Tech

Didi’s rapid growth and increasing business complexity have put pressure on collaboration and iteration efficiency. To address these challenges, Didi has continuously explored and refined its testing environment since the company’s inception, accumulating valuable experience and lessons.

1. All in One – In the early micro‑service stage, the number of services was limited (only a dozen or so). Didi packaged all services together and built a single “All in One” environment, which was easy to maintain manually. As the number of micro‑services grew to thousands, this approach became unsustainable.

2. Fastdev – To replace the All in One setup, Didi first turned to contract‑based testing (Pact) and later to traffic recording and replay. Starting in 2017, Didi recorded production traffic, replayed it offline, and open‑sourced two tools: rdebug for PHP (https://github.com/didi/rdebug) and sharingan for Go (https://github.com/didi/sharingan). Fastdev can faithfully simulate online scenarios, but recorded traffic must still be edited by hand whenever business logic changes, and for complex services that effort can exceed the effort Pact requires.
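The record‑and‑replay idea can be sketched in a few lines. This is a minimal illustration, not how rdebug or sharingan work internally: the `RecordedSession` structure, the `replay` function, and the fare handlers are all hypothetical names invented for the example. A recorded session stores one inbound request, the response production returned, and the downstream replies seen at recording time. Replay runs the handler under test against those recorded replies and reports any difference.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RecordedSession:
    """One recorded inbound request plus the downstream replies it triggered."""
    request: dict
    response: dict
    downstream: Dict[str, dict] = field(default_factory=dict)  # call key -> recorded reply


class ReplayMiss(Exception):
    """Raised when the code under test makes a call that was never recorded."""


def replay(session: RecordedSession, handler: Callable[[dict, Callable[[str], dict]], dict]) -> dict:
    """Run handler offline against recorded traffic; return fields that differ."""
    def call_downstream(key: str) -> dict:
        # Downstream services are not contacted; recorded replies are served instead.
        try:
            return session.downstream[key]
        except KeyError:
            raise ReplayMiss(key) from None

    actual = handler(session.request, call_downstream)
    keys = set(session.response) | set(actual)
    return {
        k: (session.response.get(k), actual.get(k))
        for k in keys
        if session.response.get(k) != actual.get(k)
    }


# Hypothetical handler that matches the recorded behavior.
def quote_v1(request: dict, call: Callable[[str], dict]) -> dict:
    price = call(f"pricing:{request['city']}")
    return {"fare": price["base"] + price["per_km"] * request["km"]}


# Hypothetical new version that adds a downstream call absent from the recording.
def quote_v2(request: dict, call: Callable[[str], dict]) -> dict:
    price = call(f"pricing:{request['city']}")
    surge = call(f"surge:{request['city']}")
    return {"fare": (price["base"] + price["per_km"] * request["km"]) * surge["factor"]}


session = RecordedSession(
    request={"city": "beijing", "km": 10},
    response={"fare": 30},
    downstream={"pricing:beijing": {"base": 10, "per_km": 2}},
)
```

Replaying `quote_v1` against `session` yields an empty diff, while `quote_v2` raises `ReplayMiss` because the recording contains no surge reply. That second case is the maintenance cost described above: when business logic changes, someone has to add or edit the recorded traffic by hand.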

3. Test in Production (TiP) – Recognizing the high cost of maintaining full‑stack environments, Didi leveraged pre‑release (pre‑prod) environments, which mirror production but carry no real user traffic. With dedicated test accounts and logical isolation of test traffic, each developer can obtain a stable, high‑fidelity test environment. Starting in 2018, Didi built a TiP‑Sim environment that uses side‑car proxies (similar to a service mesh) to color and split traffic based on trace IDs, achieving a closed‑loop traffic flow without modifying the ingress or routing layers.
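The routing decision a side‑car makes for colored traffic might look like the following sketch. Everything specific here is an assumption made for illustration: the `x-trace-id` header name, the `<id>@<color>` encoding of the color inside the trace ID, and the registry layout are hypothetical and not taken from Didi's implementation. The point is only the rule itself: a request goes to the developer's instance of a service if one exists for its color, falls back to the shared baseline otherwise, and keeps its trace ID so the next hop can make the same decision.

```python
from typing import Dict, Tuple

BASELINE = "baseline"
TRACE_HEADER = "x-trace-id"  # hypothetical header name


def color_of(trace_id: str) -> str:
    """Extract the color from a trace ID of the assumed form '<id>@<color>'."""
    _, sep, color = trace_id.partition("@")
    return color if sep and color else BASELINE


class Sidecar:
    """Chooses an upstream address per request based on the traffic color."""

    def __init__(self, registry: Dict[str, Dict[str, str]]):
        # registry maps service name -> {color -> address}; every service
        # is assumed to have a baseline instance.
        self.registry = registry

    def route(self, service: str, headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
        color = color_of(headers.get(TRACE_HEADER, ""))
        instances = self.registry[service]
        # Prefer the instance deployed for this color, else the shared baseline.
        address = instances.get(color, instances[BASELINE])
        # Headers are forwarded unchanged so downstream side-cars see the same color.
        return address, dict(headers)


registry = {
    "order": {"baseline": "order.pre:8000", "dev-alice": "order.alice:8000"},
    "pay": {"baseline": "pay.pre:8000"},
}
sidecar = Sidecar(registry)
colored = {TRACE_HEADER: "abc123@dev-alice"}
```

With this registry, a request colored `dev-alice` reaches `order.alice:8000` for the order service and `pay.pre:8000` for the pay service, so a developer only deploys the services they changed. Because the decision lives in the side‑car, the ingress and routing layers need no changes, which matches the constraint stated above.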

4. OSim (Offline Simulation) – To eliminate the risk of contaminating production, Didi created an offline simulation environment that physically isolates the network. All services, including infrastructure components (storage, configuration, logging, etc.), are duplicated offline. The design follows three principles: (1) offline services are deployed in sync with online services; (2) each service is owned by the team that develops it; (3) unified technical solutions (RPC, service discovery, tracing) are reused to reduce maintenance cost.

By reusing cloud‑native IaC capabilities, Didi can quickly provision the offline environment, apply the same monitoring, on‑call, and availability metrics as production, and verify the scalability of its stability solutions.
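The first principle, keeping offline deployments in sync with online ones, implies some form of drift detection. A minimal sketch follows, assuming each environment can report a mapping from service name to deployed version; the `find_drift` function and the sample service names and versions are hypothetical and only show the comparison.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    online: Dict[str, str], offline: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return services whose offline version differs from the online one.

    Each value is (online_version, offline_version); None means the service
    is missing from that environment.
    """
    drift = {}
    for service in sorted(set(online) | set(offline)):
        on, off = online.get(service), offline.get(service)
        if on != off:
            drift[service] = (on, off)
    return drift


# Illustrative data only.
online_versions = {"order": "1.4.2", "pay": "2.0.0", "dispatch": "3.1.0"}
offline_versions = {"order": "1.4.2", "pay": "1.9.8"}

drift = find_drift(online_versions, offline_versions)
```

Here `drift` reports that `pay` is behind and `dispatch` is missing offline. Under the second principle, a report like this would be routed to the team that owns each service.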

Conclusion – Didi’s testing‑environment journey consists of four major modes: All in One, traffic recording & replay (Fastdev), online simulation (TiP‑Sim), and offline simulation (OSim). Each mode serves specific scenarios, and none can fully replace the others. Continuous investment in all four is essential because a reliable testing environment underpins all testing capabilities.
