What We Learned from China’s Top Tech Architects on System Refactoring

Leading architects from companies like Sogou, Ele.me, Xiaomi, Tuniu, Kuaidi, 58.com, and Tencent shared practical insights on progressively refactoring legacy systems, scaling platforms, adopting service‑oriented and streaming architectures, and balancing optimization with new business demands, offering a comprehensive roadmap for modern backend evolution.

ITFLY8 Architecture Home

Fan Gang from Aerospace Information Co., Ltd. shared his summary and reflections on system refactoring and reconstruction, based on years of practice. In the Internet+ era, architects face endless requirements, constantly evolving technologies, and heavily patched legacy systems. He proposed a progressive refactoring approach that balances optimization, maintenance, platform development, and reconstruction.

Sogou Business Platform Architect Liu Jian: Evolution of Sogou Business Platform Architecture

Liu Jian presented the evolution of the Sogou Business Platform from an initial all‑in‑one stage to a high‑performance, highly available, and scalable mature system, divided into four phases: initial, horizontal scaling, service‑oriented, and streaming computation.

Initial stage: business came first and iteration was fast, but rapid growth led to database health problems and timeouts.

Horizontal scaling: solved performance and storage problems via compute and storage scaling; compute scaling required stateless design with independent state storage, while storage scaling used a self‑developed database sharding framework Compass for smooth migration.

Service‑oriented stage: each service owned its resources privately, exposed them only through service interfaces, and was isolated as a module, which reduced inter‑team communication cost. The service framework Polaris, built on Thrift, added authentication, authorization, monitoring, failover, and load balancing.

Streaming computation: introduced Kafka + Storm + Pump + Binlog Tunnel; design must consider message ordering, duplication, and loss, with optional dual‑path comparison to reduce risk.
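Ordering and duplication are often handled by making the consumer idempotent. The following is a minimal sketch, assuming every event carries a per‑key sequence number; `Event`, `Projection`, and their fields are hypothetical names and are not taken from the Sogou design. It does not detect lost messages, which would need gap detection or the dual‑path comparison mentioned above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    key: str    # entity the event belongs to (hypothetical field)
    seq: int    # per-key sequence number assigned by the producer (assumption)
    value: int  # new state for the key


@dataclass
class Projection:
    """Last-write-wins view that tolerates duplicate and stale events."""
    values: dict = field(default_factory=dict)
    last_seq: dict = field(default_factory=dict)

    def apply(self, event: Event) -> bool:
        # Skip anything at or below the newest sequence already applied:
        # this drops redelivered duplicates and late, out-of-order events.
        if event.seq <= self.last_seq.get(event.key, -1):
            return False
        self.values[event.key] = event.value
        self.last_seq[event.key] = event.seq
        return True
```

Dropping late events is only correct when each event carries the full new state. Incremental updates such as counters would need buffering and reordering instead.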

Ele.me Innovation Product R&D Deputy Director Cheng Jun: Ele.me Overall Architecture

Ele.me’s architecture evolved through single‑machine, cluster, and SOA stages. To solve performance issues, the core gateway switched from HAProxy to F5. In the SOA stage, domain‑based database sharding, hot‑data caching, asynchronous messaging, and service degradation were applied. Ele.me also built its own data middleware, DAL, for MySQL connection pooling, sharding, read/write separation, and query throttling, and introduced Service Orchestrator for front‑back separation, improving concurrency and response time. Core services are written in Python, making Ele.me an example of a high‑concurrency internet service built on Python.
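Read/write separation in a data access layer can be illustrated with a small router. This is a sketch under stated assumptions, not Ele.me's DAL: `ReadWriteRouter`, the connection names, and the keyword‑based statement classification are all hypothetical, and a production middleware would parse SQL properly rather than inspect prefixes.

```python
import itertools


class ReadWriteRouter:
    """Send plain reads to replicas in round-robin order, everything else to the primary."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self._primary = primary
        # cycle() yields replicas in round-robin order; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        normalized = sql.lstrip().lower()
        is_read = normalized.startswith(self.READ_PREFIXES)
        # Locking reads must see the latest data, so they stay on the primary.
        takes_lock = "for update" in normalized
        if is_read and not takes_lock and not in_transaction and self._replicas:
            return next(self._replicas)
        return self._primary
```

Statements inside a transaction are kept on the primary so that a read following a write in the same transaction does not hit a lagging replica.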

Xiaomi Architect Zhang Tao: Xiaomi.com Architecture Evolution

Zhang Tao described Xiaomi.com’s evolution to handle rapid business growth: database sharding for horizontal scalability, restructuring call relationships from mesh to star topology with MQ for decoupling, and a three‑layer design (scheduling, business, data). They introduced Cobar for traffic spikes, abstracted basic services to reduce cost, and followed a “business vertical, platform horizontal” strategy for modularization and platformization. Notable technologies include traffic control architecture, the MCC cache framework (Twemproxy + Redis), asynchronous notification service, inventory allocation algorithms, virtualization, cloud services, monitoring, security policies, and an SOA framework based on Thrift and etcd for service registration, discovery, failover, and multi‑language support.
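Database sharding for horizontal scalability comes down to mapping a key to a shard deterministically. The following is a minimal sketch of hash‑modulo routing, assuming string keys; `shard_for` is a hypothetical helper and does not represent Xiaomi's or Cobar's actual routing rules.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used purely for its stability.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

One property worth noting: when the shard count doubles from N to 2N, a key on shard s either stays on s or moves to s + N, so each old shard splits into exactly two new ones and no data moves between unrelated shards.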

Tuniu Travel R&D Director Gao Jian: Tuniu Mobile Architecture Evolution

Gao Jian presented Tuniu’s evolution across service‑oriented transformation, multi‑data‑center deployment, performance optimization, and APP technology. Service‑oriented efforts introduced MQ for asynchronous services, database sharding for high‑concurrency, and NoSQL/MySQL heterogeneous sync with Unix domain sockets for distributed computing. Multi‑data‑center deployment tackled unstable dedicated lines and sync latency with application‑level dual‑write for latency‑sensitive data. Performance optimization used Codis for hot‑data caching, along with self‑developed cache update (BWT) and monitoring (OSS) systems. APP development adopted a plug‑in framework, static resources, and Alibaba’s AndFix for hot‑fixes.
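Application‑level dual‑write can be sketched as follows: write locally first, attempt the remote data center in the same call, and queue the write for retry when the link is down. This is an illustration under assumptions, not Tuniu's implementation; `DualWriter`, `remote_put`, and the in‑memory stores are hypothetical stand‑ins.

```python
from collections import deque


class DualWriter:
    """Write to the local store, mirror to a remote one, queue failures for retry."""

    def __init__(self, local, remote_put):
        self.local = local              # dict standing in for the local data center
        self._remote_put = remote_put   # callable(key, value); may raise ConnectionError
        self.pending = deque()          # writes that have not reached the remote side yet

    def put(self, key, value):
        self.local[key] = value
        try:
            self._remote_put(key, value)
        except ConnectionError:
            self.pending.append((key, value))

    def flush_pending(self):
        """Retry queued writes in order; stop at the first failure. Returns the count delivered."""
        delivered = 0
        while self.pending:
            key, value = self.pending[0]
            try:
                self._remote_put(key, value)
            except ConnectionError:
                break
            self.pending.popleft()
            delivered += 1
        return delivered
```

The sketch ignores one real hazard: if a later write to the same key succeeds directly while an earlier one is still queued, the retry would overwrite newer data. A version or timestamp check on the remote side is one common way to guard against that.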

Kuaidi (Didi) Architect Wang Xiaoxue: Kuaidi Ride‑Hailing Architecture Practice

Wang Xiaoxue reviewed Kuaidi’s architecture evolution from basic functionality to core link optimization and systematic design. Core link issues included LBS query bottlenecks and unstable long‑connections, with challenges in MongoDB locking, memory copying, and timeout algorithms. Systematic design introduced distributed transformation using Dubbo + RocketMQ, log collection with Log4j + Flume + Elasticsearch, and real‑time computation with RocketMQ + Storm + HBase. Data source transformation replaced MySQL queries with HBase, using a mock MySQL slave to stream binlog data via MQ to HBase/HDFS, and implemented client‑side secondary indexing for HBase.
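A client‑side secondary index is typically a second table whose row key is the indexed value followed by the primary row key, so a prefix scan returns all matching rows. The sketch below imitates that with in‑memory structures; it uses no HBase client, and `IndexedTable`, `driver_id`, and the separator are hypothetical choices for illustration.

```python
import bisect


class IndexedTable:
    """In-memory stand-in for a data table plus a client-maintained index table."""

    SEP = "\x00"  # separator assumed never to appear in ids

    def __init__(self):
        self.rows = {}    # row_key -> (driver_id, payload)
        self._index = []  # sorted "driver_id SEP row_key" strings, like sorted row keys

    def put(self, row_key, driver_id, payload):
        old = self.rows.get(row_key)
        if old is not None:
            # The client is responsible for removing the stale index entry.
            self._index.remove(old[0] + self.SEP + row_key)
        self.rows[row_key] = (driver_id, payload)
        bisect.insort(self._index, driver_id + self.SEP + row_key)

    def find_by_driver(self, driver_id):
        """Prefix scan over the index, returning primary row keys in sorted order."""
        prefix = driver_id + self.SEP
        start = bisect.bisect_left(self._index, prefix)
        result = []
        for index_key in self._index[start:]:
            if not index_key.startswith(prefix):
                break
            result.append(index_key[len(prefix):])
        return result
```

In a real deployment the data write and the index write are two separate operations, so the client also has to handle the case where one succeeds and the other fails.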

58.com System Architect Sun Xuan: 58.com High‑Performance Mobile Push Architecture Evolution

Sun Xuan described the evolution of 58.com’s mobile Push platform from a single‑platform iOS solution (APNs) to a multi‑platform high‑performance system. Android push used a self‑developed provider together with third‑party integration, abstracting common logic to serve diverse requirements. The high‑performance stage unified iOS and Android push, made processing parallel and asynchronous, and selected device‑specific channels with retry mechanisms to improve stability.
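Channel selection with retries can be sketched as an ordered list of senders tried in turn. This is a minimal illustration, assuming each channel is a callable that raises `ConnectionError` on failure; `send_with_fallback` and its parameters are hypothetical and do not describe 58.com's platform.

```python
from typing import Callable, Sequence


def send_with_fallback(
    channels: Sequence[Callable[[str], None]],
    message: str,
    attempts_per_channel: int = 2,
) -> int:
    """Try each channel in priority order, retrying before falling back.

    Returns the index of the channel that delivered the message and
    raises ConnectionError if every attempt on every channel fails.
    """
    last_error = None
    for index, send in enumerate(channels):
        for _ in range(attempts_per_channel):
            try:
                send(message)
                return index
            except ConnectionError as exc:
                last_error = exc  # remember the failure, then retry or fall back
    raise ConnectionError("all push channels failed") from last_error
```

A production version would normally add a delay with backoff between attempts and order the channels per device, for example preferring a vendor channel on devices that support it.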

Tencent Senior Engineer Xu Hanbin: QQ Membership Activity Platform Architecture Evolution

Xu Hanbin presented the architecture of the AMS (Activity Management System) for QQ membership activities. The system is divided into front‑end, CGI, and service layers: the front end provides reusable components, the CGI layer handles business logic, and the service layer provides platform support. Configuration files enable product teams to launch activities independently. Performance is boosted by NoSQL storage and CAS‑based (compare‑and‑swap) optimistic locking, achieving over 50k operations per second. High availability is ensured through horizontal scaling, failover, overload protection, and service degradation, complemented by multi‑channel monitoring. Security combines technical measures, product design, and support systems to protect business operations.
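Optimistic locking with CAS means reading a value together with its version, then writing only if the version is unchanged, and retrying on conflict. The sketch below shows the pattern against an in‑memory store; `VersionedStore` and `claim_gift` are hypothetical names, and the internal lock exists only to make the in‑memory compare‑and‑set atomic, standing in for what a NoSQL store would do on the server side.

```python
import threading


class VersionedStore:
    """In-memory key-value store with versioned compare-and-set."""

    def __init__(self):
        self._data = {}                 # key -> (value, version)
        self._lock = threading.Lock()   # makes each operation atomic in this sketch

    def get(self, key):
        with self._lock:
            return self._data.get(key, (0, 0))

    def compare_and_set(self, key, expected_version, new_value):
        with self._lock:
            _, version = self._data.get(key, (0, 0))
            if version != expected_version:
                return False            # someone else wrote first
            self._data[key] = (new_value, version + 1)
            return True


def claim_gift(store, key, max_retries=10):
    """Decrement remaining stock by one without holding a lock across read and write."""
    for _ in range(max_retries):
        remaining, version = store.get(key)
        if remaining <= 0:
            return False                # sold out
        if store.compare_and_set(key, version, remaining - 1):
            return True
        # Version changed between read and write: re-read and try again.
    return False
```

Because no lock is held between the read and the write, contention costs a retry rather than a blocked request, which suits short, high‑volume operations such as claiming activity rewards.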

Conclusion

Listening to these frontline architects provided valuable insights across many domains. The talks highlighted the architect’s core responsibilities: weighing business, personnel, and cost factors, ensuring stable service delivery, and driving smooth upgrades while tackling technical challenges.

Full‑stack architecture: The talks covered everything from foundational infrastructure to business forms, system design to continuous delivery, security, monitoring, virtualization, version iteration, and ROI, emphasizing the architect’s role as both explorer of new technologies and remover of bottlenecks.

Converging principles, divergent details: Common themes such as service‑orientation, sharding, asynchrony, distribution, security, and monitoring recur, yet the emphasis differs (business vs. data security, strong vs. eventual consistency), reflecting varied business scenarios.

Embracing change: Rapid business growth forces architectural optimization and upgrades, while emerging technologies like service‑orientation and streaming computation drive evolution, requiring architects to stay observant, lead technical direction, and practice continuous adaptation.
