Improving Supplier Integration Efficiency and System Stability in Ctrip's Direct Connection Platform
This article presents Ctrip's backend engineering practices for the Direct Connection Platform, detailing how a sandbox testing environment, automated acceptance, rate‑limiting, and circuit‑breaking mechanisms were introduced to boost supplier onboarding speed and enhance overall system stability.
In the fast‑growing Ctrip ticketing business, the Direct Connection Platform integrates multiple supplier order and product systems via OpenAPI, but increasing supplier volume creates challenges in onboarding efficiency and system stability.
The platform handles two main data flows: synchronization of supplier product information (price, inventory, content) and order status updates. It provides OpenAPI and sandbox tools so that suppliers can integrate quickly.
Key challenges identified are:
Efficiency: each new supplier required about two person‑days of joint testing and acceptance, causing long onboarding cycles.
Stability: heterogeneous supplier networks, varying capacities, and occasional outages lead to order failures and high error rates.
To address these, Ctrip built a sandbox that supports self‑service testing, automatic acceptance, and scenario‑based test cases. The sandbox can:
Define test cases covering functional, exception, and boundary scenarios.
Match test cases to suppliers based on interface type, sync/async mode, and business parameters.
Execute test cases automatically, handling both platform‑to‑supplier and supplier‑to‑platform calls.
Perform auto‑acceptance; once all cases pass, the supplier is automatically promoted to production.
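The matching and auto‑acceptance steps above can be sketched roughly as follows. This is a minimal illustration, not Ctrip's implementation: the names SupplierProfile, SandboxCase, match_cases, and auto_accept are hypothetical, and matching on business parameters is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SupplierProfile:
    """Hypothetical description of what a supplier has integrated."""
    interface_types: frozenset  # e.g. {"create_order", "cancel_order"}
    async_mode: bool            # True if the supplier confirms asynchronously


@dataclass(frozen=True)
class SandboxCase:
    """Hypothetical test case; `run` returns True when the case passes."""
    name: str
    interface_type: str
    async_mode: Optional[bool]  # None means the case applies to both modes
    run: Callable[[], bool]


def match_cases(supplier, cases):
    """Select the cases that apply to this supplier's interfaces and mode."""
    return [
        c for c in cases
        if c.interface_type in supplier.interface_types
        and (c.async_mode is None or c.async_mode == supplier.async_mode)
    ]


def auto_accept(supplier, cases):
    """Run every matched case; accept only if at least one ran and all passed."""
    matched = match_cases(supplier, cases)
    results = {c.name: c.run() for c in matched}
    passed = bool(matched) and all(results.values())
    return passed, results
```

Requiring at least one matched case is a design choice in this sketch: it prevents a supplier with no applicable cases from being accepted by default.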
After sandbox deployment, supplier onboarding speed increased more than eightfold and average onboarding effort dropped from 23 person‑days to 6 person‑days.
For system stability, Ctrip introduced rate limiting based on a leaky‑bucket algorithm, which smooths traffic spikes and protects supplier systems, and implemented circuit breaking at both the system level and the business level. The circuit breaker monitors long‑term and short‑term error rates, calculates the break duration with the formula t + p·L(p)·n, and applies degradation actions such as disabling sales or taking resources offline.
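As a rough illustration of the two mechanisms, the sketch below shows a leaky bucket used as a meter (requests that would overflow the bucket are rejected instead of queued) and a breaker check over a short and a long window of recent calls. All class names, parameters, and thresholds are assumptions made for this example. The break‑duration formula is left out because the article does not define its symbols.

```python
import time
from collections import deque


class LeakyBucket:
    """Leaky bucket as a meter: the level drains at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.level = 0.0
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Drain the bucket for the time elapsed since the previous call.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False                # bucket would overflow: reject the call


class ErrorRateBreaker:
    """Trips when the short or the long window exceeds its error threshold."""

    def __init__(self, short_n, long_n, short_threshold, long_threshold):
        self.short = deque(maxlen=short_n)
        self.long = deque(maxlen=long_n)
        self.short_threshold = short_threshold
        self.long_threshold = long_threshold

    def record(self, ok):
        failed = 0 if ok else 1
        self.short.append(failed)
        self.long.append(failed)

    @staticmethod
    def _rate(window):
        # Only judge a window once it is full, to avoid tripping on one sample.
        if len(window) < window.maxlen:
            return 0.0
        return sum(window) / len(window)

    def should_break(self):
        return (self._rate(self.short) >= self.short_threshold
                or self._rate(self.long) >= self.long_threshold)
```

The short window reacts quickly to a sudden outage, while the long window catches a supplier that fails at a lower but persistent rate.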
Additionally, Ctrip classified supplier error messages into six categories (system, traveler info, purchase limit, inventory, product config, account balance) and built a keyword‑matching engine to trigger appropriate automated responses.
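A keyword‑matching classifier of this kind might look like the sketch below. The six category names come from the list above. The keywords, the "unclassified" fallback, and the function name classify are hypothetical and only illustrate the idea.

```python
# Ordered rules: the first category whose keyword appears in the message wins.
# Keywords are illustrative assumptions, not the platform's real dictionary.
RULES = [
    ("account_balance", ("insufficient balance", "balance not enough")),
    ("inventory", ("sold out", "out of stock", "no inventory")),
    ("purchase_limit", ("purchase limit", "exceeds limit")),
    ("traveler_info", ("invalid id number", "invalid passport", "name mismatch")),
    ("product_config", ("product not found", "product offline")),
    ("system", ("timeout", "internal error", "service unavailable")),
]


def classify(message):
    """Return the error category for a supplier message, or 'unclassified'."""
    text = message.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "unclassified"
```

For example, classify("Ticket SOLD OUT for this date") returns "inventory", which a caller could map to an automated response such as taking the resource offline.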
These measures reduced the overall order failure rate from 0.34% to 0.05%, demonstrating significant improvements in both supplier integration efficiency and platform stability.
Ctrip Technology