QQ’s Full Migration to Tencent Cloud: Architecture, Challenges, and Lessons Learned
The article details how QQ migrated all of its services to Tencent Cloud, describing the business scenarios, migration timeline, technical approaches, challenges such as cost, security, and performance, and the operational and architectural lessons gained from the full cloud transition.
All QQ services have now been migrated to Tencent Cloud. The Tencent Technical Committee, established on January 4, 2019, created two project groups: Open‑Source Collaboration and Self‑Developed Cloud Migration. QQ was the first product to complete a full migration.
QQ’s workload is characterized by sudden traffic bursts and large‑scale group messaging, which can amplify load hundreds of times. Much of this traffic is UDP‑based, which makes cloud resource planning and cost optimization significantly harder.
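To make the amplification concrete, here is a minimal sketch of the fan‑out arithmetic. It assumes each message sent to a group is delivered once to every other member; the function names and the traffic figures are illustrative and not taken from QQ.

```python
def amplification_factor(group_size: int) -> int:
    """Deliveries produced by one message sent to a group of `group_size` members."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Assumption: the sender does not receive a copy of its own message.
    return group_size - 1


def outbound_deliveries(inbound_messages: int, group_size: int) -> int:
    """Total deliveries when `inbound_messages` are sent to groups of equal size."""
    return inbound_messages * amplification_factor(group_size)


if __name__ == "__main__":
    # Hypothetical burst: 10,000 messages per second into 500-member groups.
    print(outbound_deliveries(10_000, 500))
```

Under these assumptions a modest inbound rate turns into millions of deliveries per second, which is why burst capacity dominates the cost discussion.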
The migration timeline has several key milestones: in 2017, all QQ users were on the private cloud; by the end of 2018, 15% of users had moved to the Guangzhou cloud; by June 2019, 30% were on the public cloud; and today all users are on the public cloud.
QQ’s cloud architecture follows a “three‑cloud‑one‑region” model, distributing users across North, East, and South China, with the South region split into Guangzhou cloud and Shenzhen self‑developed data centers, each operating independently yet capable of cross‑region failover.
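The idea of independent regions with cross‑region failover can be sketched as a small routing function. This is only an illustration: the region labels, the failover order, and the `route_user` helper are assumptions, not QQ's actual routing logic.

```python
from typing import Dict, List, Set

# Hypothetical region labels and failover preferences, loosely following
# the North / East / South (Guangzhou cloud + Shenzhen) layout.
FAILOVER_ORDER: Dict[str, List[str]] = {
    "north": ["east", "south-guangzhou", "south-shenzhen"],
    "east": ["north", "south-guangzhou", "south-shenzhen"],
    "south-guangzhou": ["south-shenzhen", "east", "north"],
    "south-shenzhen": ["south-guangzhou", "east", "north"],
}


def route_user(home_region: str, healthy: Set[str]) -> str:
    """Return the region that should serve a user, preferring the home region."""
    if home_region not in FAILOVER_ORDER:
        raise KeyError(f"unknown region: {home_region}")
    if home_region in healthy:
        return home_region
    # Home region is down: walk the fallback list in order of preference.
    for candidate in FAILOVER_ORDER[home_region]:
        if candidate in healthy:
            return candidate
    raise RuntimeError("no healthy region available")


if __name__ == "__main__":
    print(route_user("south-guangzhou", {"north", "east", "south-shenzhen"}))
```

Keeping the fallback order explicit per region makes it easy to prefer the sibling site in the same geography before sending users across the country.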
Three migration approaches were used: refactor‑then‑migrate, migrate‑while‑refactoring, and migrate‑first‑then‑refactor, with the first two being the most common. Containerization began in 2016, and QQ adopted Tencent Kubernetes Engine (TKE), adding custom features such as cross‑region support, IP‑based access control, and predictive capacity scaling.
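Predictive capacity scaling can take many forms, and the source does not describe the one QQ built on TKE. As a rough sketch only, the following uses a simple moving average as the forecast and converts it into a replica count with headroom; every name and parameter here is hypothetical.

```python
import math
from typing import Sequence


def forecast_next(samples: Sequence[float], window: int = 3) -> float:
    """Forecast the next load value as the mean of the last `window` samples."""
    if not samples:
        raise ValueError("need at least one sample")
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = samples[-window:]
    return sum(recent) / len(recent)


def replicas_needed(forecast_qps: float, qps_per_pod: float,
                    headroom: float = 1.5, min_replicas: int = 2) -> int:
    """Replica count that covers the forecast plus a safety margin."""
    if qps_per_pod <= 0:
        raise ValueError("qps_per_pod must be positive")
    wanted = math.ceil(forecast_qps * headroom / qps_per_pod)
    return max(min_replicas, wanted)


if __name__ == "__main__":
    history = [100.0, 200.0, 300.0]  # hypothetical requests per second
    print(replicas_needed(forecast_next(history), qps_per_pod=50.0))
```

The point of forecasting is to scale out before a burst arrives rather than reacting after utilization has already spiked; a production system would likely use a seasonal or event‑aware model instead of a plain average.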
The main difficulties were unpredictable UDP traffic, cost constraints, and reconciling two decades of legacy technology with cloud‑native solutions. More specific challenges included security, dependency complexity, disaster recovery, and seamless gray release for a real‑time messaging product.
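One common way to run a gray release by user percentage is deterministic hash bucketing, sketched below. The source does not say how QQ selected users for each stage, so the hashing scheme and the `rollout_bucket` and `is_migrated` helpers are assumptions for illustration.

```python
import hashlib


def rollout_bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_migrated(user_id: str, percent: int) -> bool:
    """True if the user falls inside the first `percent` buckets."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return rollout_bucket(user_id) < percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    for stage in (15, 30, 100):
        share = sum(is_migrated(u, stage) for u in users) / len(users)
        print(stage, round(share, 3))
```

Because the bucket depends only on the user ID, raising the percentage only adds users: anyone served from the cloud at 15% stays there at 30%, which avoids bouncing sessions between environments.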
Infrastructure migration emphasized zero additional cost; QQ leveraged the internally developed “Star‑Sea” servers, achieving 25% performance gains, and realized benefits in resource consolidation, Kubernetes‑based scheduling, and unified resource pools across public and self‑developed environments.
Migration proceeded in stages aligned with user concurrency: at 5 million concurrent users the focus was on feasibility and packet loss, while at 10 million concurrent users the emphasis shifted to network quality and latency, addressing issues such as VPC session limits and VM buffer sizes.
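Buffer sizing is one of the few knobs an application can inspect directly. The sketch below requests a larger UDP receive buffer through the standard socket API and reads back what the operating system actually granted; the 4 MiB figure is an arbitrary example, not a value from QQ's deployment.

```python
import socket
from typing import Tuple


def udp_socket_with_buffer(requested_bytes: int) -> Tuple[socket.socket, int]:
    """Create a UDP socket and ask the OS for a receive buffer of the given size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    # The OS may adjust or cap the request, so always read the value back.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective


if __name__ == "__main__":
    sock, effective = udp_socket_with_buffer(4 * 1024 * 1024)
    try:
        print("effective receive buffer:", effective)
    finally:
        sock.close()
```

If the effective value is much smaller than the request, the limit is being imposed by the host or VM configuration rather than the application, which is the kind of mismatch that shows up as packet loss under burst load.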
Database migration used three methods: cold migration of full backups into cloud Redis clusters, DTS tools for open‑source components, and direct cloud deployment for private components. MySQL was moved from IDC to the cloud with Tencent Cloud DTS, relying on DNS‑based service discovery and master‑slave synchronization.
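DNS‑based service discovery helps a cutover because clients look up a name rather than a fixed address, so repointing the record moves traffic without a client change. The sketch below shows that lookup with an injectable resolver; the hostname, addresses, and `resolve_endpoint` helper are hypothetical, and the actual DTS workflow is not shown.

```python
import socket
from typing import Callable, Tuple


def resolve_endpoint(hostname: str, port: int,
                     resolver: Callable = socket.getaddrinfo) -> Tuple[str, int]:
    """Resolve a service name to an (address, port) pair at connect time."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    if not infos:
        raise LookupError(f"no address found for {hostname}")
    # Each entry is (family, type, proto, canonname, sockaddr).
    sockaddr = infos[0][4]
    return sockaddr[0], sockaddr[1]


if __name__ == "__main__":
    # Simulated DNS records: the name first points at the IDC primary,
    # then at the cloud instance once replication has caught up.
    records = {"mysql.example.internal": "10.0.0.5"}

    def fake_resolver(host, port, type=0):
        return [(socket.AF_INET, type, 6, "", (records[host], port))]

    print(resolve_endpoint("mysql.example.internal", 3306, fake_resolver))
    records["mysql.example.internal"] = "172.16.0.9"  # cutover
    print(resolve_endpoint("mysql.example.internal", 3306, fake_resolver))
```

In practice the record's TTL and any client‑side connection pooling determine how quickly traffic actually follows the change, so both need to be accounted for when planning the switch.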
Overall, QQ’s migration to the cloud brought minimal user‑experience impact but delivered significant architectural benefits, serving as a benchmark for Tencent Cloud capabilities and providing valuable experience for other products, including the ongoing cloud migration of WeChat.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.