
Tuniu’s Journey from Monolithic to Distributed Service Architecture: Serviceization, Price Calculation Service, Governance Platform, Data Center Challenges, Performance Optimization, and App Client Evolution

This article details Tuniu’s transformation from a single‑machine system to a large‑scale distributed architecture, covering the step‑by‑step serviceization process, the evolution of its price‑calculation service, the implementation of a service‑governance platform, data‑center migration challenges, performance‑optimization tools, and the technical evolution of its mobile app client.

Architecture Digest

Extracted from the magazine "Programmer" by author Gao Jian.

Tuniu started with a standalone system and has grown to operate hundreds of distributed services. This article shares the problems encountered and the solutions applied while scaling Tuniu's wireless website system, focusing on the move to services, the pain points of running data centers in both Nanjing and Beijing, performance improvements, and the evolution of the App client technology.

Serviceization Progress

Tuniu began splitting its system into services in 2011 with membership services, followed by Search 2.0 in 2012. 2013 brought rapid expansion: Search 3.0, the price center, the order center, and product data services. In 2014 came the Tuniu Service Governance Platform (TSP), common business systems, and resource search, and in 2015 product categories and open APIs followed. Tuniu likens each split to changing the tires on a car moving at high speed.

Search 2.0 used Solr but lacked clear boundaries between the search platform and business systems, leading to heavy logic, low indexing efficiency, and unstable performance. Lessons learned led to Search 3.0, which provides only list search, centralizes list fields, moves data‑push logic to product systems, and focuses on performance, stability, intelligent ranking, and manual result intervention.

Price Calculation Service

The service faces two challenges: many price‑affecting factors with deep dependency paths, and high‑frequency price changes during peak seasons, requiring high capacity and real‑time processing.

Since 2013, the architecture has evolved through four stages: synchronous, asynchronous, concurrent, and distributed.

Figure 1: Price Calculation Service Architecture

Synchronous Architecture: Systems interact via interfaces; the price center calls other systems to fetch all dependent resources, processing sequentially with low efficiency, suitable only for small‑scale calculations.

Asynchronous Architecture: Systems communicate via MQ; the price center reads data from dependent databases, pre‑computes minimum costs per resource, then computes product minimum prices, achieving a three‑fold performance boost.

Concurrent Architecture: Data is sharded and partitioned; hot data is computed more frequently than cold data; in‑memory structures for trips, itineraries, and resources improve read/write efficiency, delivering a 3.5× performance increase and keeping calculation time under 200 ms per group.

Distributed Architecture: Binlog parsing converts database changes into in‑memory structures; Sharding MQ enables local access and computation; Unix domain sockets provide local communication, maximizing I/O and reducing loss, achieving a further 2× speedup with calculation time under 100 ms per group.

Figure 2: Overall Distributed Architecture
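The pre-computation introduced in the asynchronous stage above can be sketched as a two-stage reduction. This is a minimal illustration, not Tuniu's implementation: the quote format, the function names `min_cost_per_resource` and `product_min_price`, and the assumption that a product's minimum price is a plain sum of its resources' minimum costs are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical input shape: (resource_id, cost) quotes for each resource.
Quote = Tuple[str, float]


def min_cost_per_resource(quotes: Iterable[Quote]) -> Dict[str, float]:
    """Stage 1: reduce all quotes to the cheapest cost per resource."""
    cheapest: Dict[str, float] = {}
    for resource_id, cost in quotes:
        if resource_id not in cheapest or cost < cheapest[resource_id]:
            cheapest[resource_id] = cost
    return cheapest


def product_min_price(resources: List[str], cheapest: Dict[str, float]) -> float:
    """Stage 2: combine pre-computed resource minimums into a product price.

    Assumes, for illustration only, that the product price is a plain sum.
    """
    return sum(cheapest[r] for r in resources)


quotes = [
    ("flight", 1200.0), ("flight", 1100.0),
    ("hotel", 400.0), ("hotel", 450.0),
]
cheapest = min_cost_per_resource(quotes)
package_price = product_min_price(["flight", "hotel"], cheapest)
```

Because stage 1 is computed once per resource, every product that shares a resource reuses the same minimum instead of re-reading all quotes, which is the kind of saving the asynchronous stage is described as exploiting.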

By May 2015, the service handled roughly 900 million calculations daily, with each travel package calculated more than twice per day.
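To put the daily figure in perspective, it can be converted to an average per-second rate. The sketch below assumes load is spread evenly over 24 hours, which the article's mention of peak seasons suggests is not the case, so real peaks would be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(daily_calculations: int) -> float:
    """Average calculations per second, assuming a uniform load over the day."""
    return daily_calculations / SECONDS_PER_DAY


rate = average_rate_per_second(900_000_000)
print(f"{rate:,.0f} calculations per second on average")
# prints "10,417 calculations per second on average"
```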

Service Governance Platform

Rapid serviceization increased the number of interfaces, leading to mesh calls, circular dependencies, avalanche risks, lack of monitoring, and hardware‑based load balancing with poor maintainability. An open‑source governance platform was customized for Tuniu.

Figure 3: Tuniu Service Governance Platform

The registry uses a master‑slave cluster; the master handles address changes and heartbeats, while the slave provides query services and takes over if the master fails. Services register themselves, undergo manual approval, and heartbeats trigger reconnection logic. Load‑balancing limits concurrent connections to preserve availability.
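The registration flow above (self-registration, manual approval, heartbeats) can be sketched as a small in-memory registry. This is an illustration under stated assumptions, not TSP's code: the `Registry` and `Instance` names, the 30-second time-to-live, and the injectable clock are hypothetical, and the master-slave replication and load balancing are deliberately left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Instance:
    address: str
    approved: bool = False       # flipped by an operator, not by the service
    last_heartbeat: float = 0.0


class Registry:
    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._services: Dict[str, Dict[str, Instance]] = {}

    def register(self, service: str, address: str) -> None:
        """A service announces itself; it stays hidden until approved."""
        instances = self._services.setdefault(service, {})
        instance = instances.setdefault(address, Instance(address))
        instance.last_heartbeat = self._clock()

    def approve(self, service: str, address: str) -> None:
        """Manual approval step performed by an operator."""
        self._services[service][address].approved = True

    def heartbeat(self, service: str, address: str) -> None:
        self._services[service][address].last_heartbeat = self._clock()

    def lookup(self, service: str) -> List[str]:
        """Return approved instances whose heartbeat is still fresh."""
        now = self._clock()
        return sorted(
            inst.address
            for inst in self._services.get(service, {}).values()
            if inst.approved and now - inst.last_heartbeat <= self._ttl
        )
```

In this sketch an instance drops out of `lookup` results once its heartbeat is older than the time-to-live and reappears as soon as it sends a new one, which is one simple way to model the reconnection behavior the text mentions.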

Nanjing-Beijing Data-Center Pain Points

Before 2014, Tuniu operated a dual‑data‑center model (Beijing for web, Nanjing for order processing). As traffic grew, synchronization between the two sites suffered delays, especially over the unstable dedicated lines. Mitigations included a circuit‑breaker process that switches to a public VPN when latency spikes, data compression, and later a dynamic CDN that routes mobile traffic through the nearest transit server.
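The latency-triggered switch to the VPN can be sketched as a small state machine fed by latency probes of the dedicated line. The class name `LinkSelector`, the 200 ms threshold, and the consecutive-sample counts are hypothetical choices for illustration; the article does not describe how Tuniu's process decides when to switch or switch back.

```python
class LinkSelector:
    """Chooses between the dedicated line and the VPN based on latency probes."""

    DEDICATED = "dedicated"
    VPN = "vpn"

    def __init__(self, threshold_ms: float = 200.0,
                 trip_after: int = 3, recover_after: int = 5) -> None:
        self.threshold_ms = threshold_ms
        self.trip_after = trip_after        # slow probes in a row before failover
        self.recover_after = recover_after  # healthy probes in a row before return
        self.active = self.DEDICATED
        self._slow = 0
        self._fast = 0

    def record_latency(self, latency_ms: float) -> str:
        """Feed one probe of the dedicated line; return the link to use next."""
        if latency_ms > self.threshold_ms:
            self._slow += 1
            self._fast = 0
        else:
            self._fast += 1
            self._slow = 0

        if self.active == self.DEDICATED and self._slow >= self.trip_after:
            self.active = self.VPN
        elif self.active == self.VPN and self._fast >= self.recover_after:
            self.active = self.DEDICATED
        return self.active
```

Requiring several consecutive samples in each direction is a common way to avoid flapping between links when latency hovers around the threshold.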

Tuniu later consolidated into a single data center, which reduced deployment costs by more than 30% and set the stage for a future two‑site, three‑center high‑availability architecture.

Performance Optimization

Three tools were introduced:

Codis – a Go/C‑based Redis proxy cluster compatible with Twemproxy, providing transparent distributed caching.

BWT – an active cache‑update service that pushes data changes to caches with configurable delay and load‑aware throttling.

OSS – an in‑house website operation monitoring system that collects logs via UDP, processes them with NSQ queues, stores results in a database, and visualizes performance, availability, and security metrics.

Figure 5: OSS System Architecture

These tools enable rapid fault detection, interface monitoring, slow‑SQL tracking, and single‑page performance analysis.
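The BWT behavior described above (pushing changes to the cache after a configurable delay, throttled by load) might look roughly like the following. Everything here is an assumption made for illustration: the `CacheUpdater` class, the dict standing in for the cache, the `load` value in the range 0 to 1, and the rule that the batch size shrinks linearly with load are not taken from the article.

```python
import heapq
import time
from typing import Any, Callable, Dict, List, Tuple


class CacheUpdater:
    def __init__(self, cache: Dict[str, Any], delay_seconds: float,
                 max_batch: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cache = cache
        self._delay = delay_seconds
        self._max_batch = max_batch
        self._clock = clock
        self._seq = 0  # tie-breaker so values are never compared in the heap
        self._pending: List[Tuple[float, int, str, Any]] = []

    def on_change(self, key: str, value: Any) -> None:
        """Queue a data change; it becomes due after the configured delay."""
        due = self._clock() + self._delay
        heapq.heappush(self._pending, (due, self._seq, key, value))
        self._seq += 1

    def drain(self, load: float) -> int:
        """Push due updates to the cache; higher load means a smaller batch."""
        budget = max(1, int(self._max_batch * (1.0 - load)))
        now = self._clock()
        applied = 0
        while (self._pending and applied < budget
               and self._pending[0][0] <= now):
            _, _, key, value = heapq.heappop(self._pending)
            self._cache[key] = value
            applied += 1
        return applied
```

A caller would invoke `drain` periodically with a current load estimate; updates that exceed the budget simply stay queued for the next round rather than being dropped.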

App Client Technology Evolution

The mobile app adopted two key practices:

Online Hot Patching: Using Alibaba’s hot‑patch technology to deploy fixes without full app releases, avoiding server‑side feature toggles, H5 redirects, or costly emergency releases.

Frontend Resource Staticization: Asynchronous loading of static assets, DOM optimization, lazy loading, GPU‑accelerated rendering, and bundling to improve first‑screen performance in the app’s WebView.

Overall, the logical architecture emphasizes service‑oriented design, the physical architecture addresses data‑center constraints, and the system architecture focuses on non‑functional improvements such as performance and reliability.

Future growth will bring further complexity and technical challenges, inviting continued sharing of best practices.
