How Toutiao Scaled to Millions of QPS with Go‑Powered Microservices
This article chronicles Toutiao’s evolution from a monolithic Python/C++/PHP stack to a large‑scale Go‑based microservice architecture, detailing the reasons for adopting Go, the design of the kite framework, concurrency models, timeout control, performance tuning, monitoring, and a reusable DAO component for efficient RPC aggregation.
Toutiao grew from a modest daily‑traffic app to a platform handling billions of requests per day. Before mid‑2015 the service was built mainly with Python, C++ and PHP, forming a large monolith that became difficult to scale.
Microservice Evolution
To address coupling and complexity, Toutiao migrated to a Service‑Oriented Architecture (SOA) and eventually to a full microservice architecture. The microservice model brings process decoupling, easier management, self‑containment, deployment independence and automation.
The content publishing system originally used Django and PHP, which introduced bottlenecks in process management. Extracting functionality into independent services was necessary.
Why Go?
Toutiao switched many services to Go because its syntax is simple, compilation is fast, performance is high, it offers native concurrency with goroutine and channel primitives, and deployment packages are small with minimal dependencies.
In June 2015 the team began rewriting the Feed service in Go, completing most of the migration by June 2016. The Go‑based platform now runs hundreds of microservices, peaks at 7 million QPS and processes over 300 billion requests daily.
Microservice Framework – kite
The internal framework, kite, is fully compatible with the Thrift protocol. It provides service registration and discovery, distributed load balancing, timeout and circuit-breaker management, method-level metrics, and distributed tracing, so services can be scaled horizontally by adding instances.
Concurrency
Go's concurrency model is based on CSP (communicating sequential processes). Goroutines are lightweight user-space tasks that the runtime multiplexes onto OS threads, so a single process can run tens of thousands of them, and channels carry data between them. This model simplifies reasoning about parallel logic and improves maintainability.
The article illustrates this with a CSP prime-sieve implementation, in which each stage of the pipeline runs in its own goroutine and the stages communicate via channels.
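The original listing is not reproduced here, so what follows is a minimal sketch of the well-known concurrent prime sieve rather than the article's exact code. The function names (generate, filter, firstPrimes) and the done channel used for shutdown are assumptions made for this example.

```go
package main

import "fmt"

// generate sends 2, 3, 4, ... on ch until done is closed.
func generate(done <-chan struct{}, ch chan<- int) {
	for i := 2; ; i++ {
		select {
		case ch <- i:
		case <-done:
			return
		}
	}
}

// filter forwards values from in to out, dropping multiples of prime.
func filter(done <-chan struct{}, in <-chan int, out chan<- int, prime int) {
	for {
		select {
		case v := <-in:
			if v%prime == 0 {
				continue
			}
			select {
			case out <- v:
			case <-done:
				return
			}
		case <-done:
			return
		}
	}
}

// firstPrimes returns the first n primes. Every prime found adds one more
// filter goroutine to the end of the pipeline.
func firstPrimes(n int) []int {
	done := make(chan struct{})
	defer close(done) // stop every pipeline goroutine on return

	ch := make(chan int)
	go generate(done, ch)

	primes := make([]int, 0, n)
	for len(primes) < n {
		p := <-ch // the first value to survive all filters is prime
		primes = append(primes, p)
		next := make(chan int)
		go filter(done, ch, next, p)
		ch = next
	}
	return primes
}

func main() {
	fmt.Println(firstPrimes(10)) // [2 3 5 7 11 13 17 19 23 29]
}
```

Each stage only knows its input and output channel, which is what makes the pipeline easy to reason about. The sieve is a teaching device and not an efficient way to compute primes.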
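In service code, the same model shows up as parallel RPC tasks. In pseudocode:
ContentTask = NewContentInfoTask(id=123)
CommentTask = NewCommentsListTask(ContentId=123)
ParallelExec(ContentTask, CommentTask) // parallel RPC calls
user_id = ContentTask.Response.User_id
UserResp = NewUserTask(user_id).Load() // runs after ContentTask because it needs user_id
Concurrency Control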
Two patterns are used: Wait – the main goroutine waits for all parallel RPC calls to finish; Cancel – if a global timeout expires, remaining RPCs are cancelled to avoid resource leakage. Go provides sync.WaitGroup and context.Context to implement these patterns.
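Both patterns can be sketched with the standard library alone. The task type, fakeRPC, and the helper names below are hypothetical stand-ins for real RPC calls and are not part of kite.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// task is a hypothetical stand-in for one RPC call.
type task func(ctx context.Context) error

// waitAll implements the Wait pattern: it starts every task in its own
// goroutine and blocks until all of them have returned.
func waitAll(ctx context.Context, tasks ...task) []error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, t := range tasks {
		wg.Add(1)
		go func(i int, t task) {
			defer wg.Done()
			errs[i] = t(ctx) // each goroutine writes only its own slot
		}(i, t)
	}
	wg.Wait()
	return errs
}

// runWithTimeout adds the Cancel pattern: once the global timeout expires,
// ctx is cancelled and tasks that honor it return early.
func runWithTimeout(timeout time.Duration, tasks ...task) []error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer even if every task finishes early
	return waitAll(ctx, tasks...)
}

// fakeRPC simulates a call that takes d to complete unless cancelled first.
func fakeRPC(d time.Duration) task {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo runs one fast and one slow call under a 50 ms global timeout.
func demo() []error {
	return runWithTimeout(50*time.Millisecond,
		fakeRPC(5*time.Millisecond),
		fakeRPC(time.Second),
	)
}

func main() {
	errs := demo()
	fmt.Println(errs[0], errors.Is(errs[1], context.DeadlineExceeded)) // <nil> true
}
```

Cancellation is cooperative: a task that ignores ctx keeps running after the deadline, so every RPC wrapper has to select on ctx.Done() or pass ctx down to the client.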
Timeout Control
Proper timeout settings prevent cascading failures in large call graphs. The article illustrates a request flow where a gateway aggregates results from five downstream services, each with its own timeout, and shows how timeouts are enforced at the connection level: a connect timeout when dialing, plus Go's SetWriteDeadline and SetReadDeadline for write and read timeouts.
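As a minimal sketch of per-operation deadlines, the helpers below wrap net.Conn's SetReadDeadline and SetWriteDeadline. The helper names are assumptions, and net.Pipe stands in for a real TCP connection so the example runs without a network. For a real connection, the connect timeout would typically come from net.DialTimeout.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithTimeout performs one Read that fails if no data arrives within d.
func readWithTimeout(conn net.Conn, buf []byte, d time.Duration) (int, error) {
	if err := conn.SetReadDeadline(time.Now().Add(d)); err != nil {
		return 0, err
	}
	return conn.Read(buf)
}

// writeWithTimeout performs one Write that fails if it cannot finish within d.
func writeWithTimeout(conn net.Conn, p []byte, d time.Duration) (int, error) {
	if err := conn.SetWriteDeadline(time.Now().Add(d)); err != nil {
		return 0, err
	}
	return conn.Write(p)
}

// isTimeout reports whether err was caused by an expired deadline.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// silentPeerTimesOut reads from a peer that never writes and reports
// whether the read ended with a timeout error.
func silentPeerTimesOut() bool {
	client, server := net.Pipe() // in-memory connection pair
	defer client.Close()
	defer server.Close()

	_, err := readWithTimeout(client, make([]byte, 16), 20*time.Millisecond)
	return isTimeout(err)
}

func main() {
	fmt.Println(silentPeerTimesOut()) // true
}
```

Deadlines are absolute points in time, so they have to be set again before each read or write; a deadline set once applies to every later call on that connection.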
In the kite client library a “Concurrent Ctrl” module limits the number of simultaneous RPCs and enforces precise timeout boundaries.
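kite's implementation is not shown in the article. One common way to cap in-flight calls in Go is a buffered channel used as a counting semaphore, sketched below. The limiter type and peakInFlight helper are hypothetical, and the sketch covers only the concurrency limit, not the timeout handling.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limiter caps the number of calls running at once (hypothetical type).
type limiter struct {
	slots chan struct{}
}

func newLimiter(max int) *limiter {
	return &limiter{slots: make(chan struct{}, max)}
}

// do blocks until a slot is free, runs fn, then releases the slot.
func (l *limiter) do(fn func()) {
	l.slots <- struct{}{}
	defer func() { <-l.slots }()
	fn()
}

// peakInFlight pushes `calls` simulated RPCs through a limiter of size
// `limit` and returns the highest concurrency observed.
func peakInFlight(limit, calls int) int64 {
	l := newLimiter(limit)
	var inFlight, peak int64
	var wg sync.WaitGroup

	for i := 0; i < calls; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(func() {
				n := atomic.AddInt64(&inFlight, 1)
				// Raise the recorded peak with a CAS loop.
				for {
					p := atomic.LoadInt64(&peak)
					if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
						break
					}
				}
				time.Sleep(time.Millisecond) // simulated RPC latency
				atomic.AddInt64(&inFlight, -1)
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrency:", peakInFlight(3, 20)) // never more than 3
}
```

A production limiter would usually acquire the slot inside a select that also watches ctx.Done(), so a caller that has already timed out does not wait in the queue.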
Performance
Go outperforms many traditional web back‑ends, but developers must still profile and tune services. Built‑in tools such as pprof, CPU and memory profiling, goroutine stack inspection, and trace analysis help identify bottlenecks.
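As an illustration, the sketch below captures a CPU profile and a goroutine stack dump with the standard runtime/pprof package. The helper names and the busyWork workload are made up for this example. Long-running services more commonly import net/http/pprof and fetch profiles over HTTP.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// cpuProfile runs work with the CPU profiler enabled and returns the raw
// profile, which can be saved to a file and opened with `go tool pprof`.
func cpuProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the remaining profile data into buf
	return buf.Bytes(), nil
}

// goroutineDump returns the stacks of all live goroutines as text.
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// busyWork is a placeholder for the code path being profiled.
func busyWork() {
	sum := 0
	for i := 0; i < 5000000; i++ {
		sum += i % 7
	}
	_ = sum
}

func main() {
	prof, err := cpuProfile(busyWork)
	if err != nil {
		fmt.Println("cpu profile failed:", err)
		return
	}
	fmt.Printf("cpu profile: %d bytes\n", len(prof))

	dump, err := goroutineDump()
	if err != nil {
		fmt.Println("goroutine dump failed:", err)
		return
	}
	fmt.Printf("goroutine dump: %d bytes\n", len(dump))
}
```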
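Key optimization tips include: keep locks narrow by guarding specific variables instead of whole code paths, prefer atomic compare-and-swap (CAS) where it suffices, focus on hot paths, consider GC impact, reuse objects (e.g., via sync.Pool), avoid reflection, tune GOGC, and keep Go versions up to date.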
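Two of these tips are easy to show in a few lines: reusing buffers through sync.Pool and updating a counter with an atomic operation instead of a mutex. The greet function and the served counter are invented for this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"sync/atomic"
)

// bufPool hands out reusable buffers so the hot path does not allocate a
// fresh one per call.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// served counts calls without taking a lock.
var served int64

// greet builds a short message using a pooled buffer.
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old data
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	atomic.AddInt64(&served, 1)
	return buf.String() // String copies, so returning after Put is safe
}

func main() {
	fmt.Println(greet("kite"))             // hello, kite
	fmt.Println(atomic.LoadInt64(&served)) // 1
}
```

Objects in a sync.Pool can be dropped at any garbage collection, so the pool suits short-lived scratch objects and is not a cache.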
A real‑world case study shows a storage service where reducing memory allocations, switching from Thrift to Msgpack, and reusing buffers cut 99th‑percentile latency from 100 ms to 15 ms.
Service Monitoring
The runtime package exposes metrics such as goroutine count, GC pause time, and heap usage. kite collects these in real time and sets alert thresholds for critical metrics.
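A minimal sketch of reading these values with the standard runtime package follows. The snapshot type and collect function are hypothetical; how kite reports and alerts on the values is not described in the article.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// snapshot holds a few runtime metrics worth exporting (hypothetical type).
type snapshot struct {
	Goroutines     int
	HeapAllocBytes uint64
	NumGC          uint32
	GCPauseTotal   time.Duration
}

// collect reads the current values from the Go runtime.
func collect() snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return snapshot{
		Goroutines:     runtime.NumGoroutine(),
		HeapAllocBytes: m.HeapAlloc,
		NumGC:          m.NumGC,
		GCPauseTotal:   time.Duration(m.PauseTotalNs),
	}
}

func main() {
	s := collect()
	fmt.Printf("goroutines=%d heap=%dB gc_cycles=%d gc_pause_total=%s\n",
		s.Goroutines, s.HeapAllocBytes, s.NumGC, s.GCPauseTotal)
}
```

runtime.ReadMemStats briefly stops the world, so a collector would normally call it on a timer (every few seconds, for example) and not on every request.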
Go Programming Thinking and Engineering
Go forces a different mindset: each service runs in a single process, panics crash the process, and there is no thread‑local storage, so context propagation is explicit. Concurrency is the norm, requiring careful handling of shared resources.
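Two consequences of this mindset can be sketched briefly: request-scoped data travels in a context.Context instead of thread-local storage, and a panic has to be recovered inside the goroutine where it occurs or it takes down the process. All names below (requestIDKey, safely, handle) are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
)

// requestIDKey is an unexported key type, which avoids collisions with
// context values set by other packages.
type requestIDKey struct{}

// withRequestID returns a copy of ctx that carries the request ID.
func withRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey{}, id)
}

// requestID returns the ID stored in ctx, or "" if there is none.
func requestID(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

// safely runs fn and turns a panic into an error so that one bad request
// does not crash the whole service.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

// handle stands in for a request handler whose downstream code panics.
func handle(ctx context.Context) (string, error) {
	var seen string
	err := safely(func() {
		seen = requestID(ctx) // the ID arrives through ctx, passed explicitly
		panic("boom")
	})
	return seen, err
}

// demo wires a request ID into a context and handles one request.
func demo() (string, error) {
	return handle(withRequestID(context.Background(), "req-42"))
}

func main() {
	id, err := demo()
	fmt.Println(id, err) // req-42 recovered panic: boom
}
```

recover only works in a deferred function on the panicking goroutine, so every goroutine a service starts needs its own guard.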
The language’s simplicity and built‑in AST tools make large codebases easier to manage compared with languages that allow many idioms.
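To give a sense of what the AST tooling looks like, here is a small sketch that uses the standard go/parser and go/ast packages to list the functions declared in a source file. The funcNames helper and the sample source are made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcNames parses Go source text and returns the names of all declared
// functions and methods in source order.
func funcNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}

	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		if fn, ok := n.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
		return true // keep walking the tree
	})
	return names, nil
}

// sample is the source text analyzed by main.
const sample = `package demo

func Load() {}

func save() {}
`

func main() {
	names, err := funcNames(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(names) // [Load save]
}
```

The same building blocks are what linters and code generators rely on to enforce conventions across many services.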
Reusable DAO Component for Toutiao’s “NeiHan Duanzi” Service
The article proposes a DAO layer that aggregates RPC, DB and cache calls, builds a dependency tree, and executes basic and sub‑loads concurrently. Basic properties depend only on the primary key, while sub‑properties depend on basic ones. The component uses maps such as BASIC_LOADER_MAP and SUB_LOADER_MAP to associate loaders with RPC functions.
func DaoLoad(needParamsTree, daoList, paramLoaderMap, subLoaderMap) error {
    // build basic and sub task lists
    // execute them concurrently
}
Clients specify required fields via a simple string slice, e.g., []string{"Content_Info", "Content.User_Info", "Content.Comment_Info"}, and the loader automatically resolves dependencies, reduces redundant code, and speeds up data retrieval.
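The article gives only the outline of DaoLoad, so the following is a simplified sketch of the two-phase idea and not the actual component. It assumes string-valued results, a flat two-level dependency tree, and hypothetical registries (basicLoaders, subLoaders) with fake loaders in place of real RPC, DB, or cache calls.

```go
package main

import (
	"fmt"
	"sync"
)

// basicLoaders need only the primary key (hypothetical registry).
var basicLoaders = map[string]func(id int64) (string, error){
	"Content_Info": func(id int64) (string, error) {
		return fmt.Sprintf("content#%d", id), nil
	},
}

// subLoader needs the result of a basic field, named by parent.
type subLoader struct {
	parent string
	load   func(parent string) (string, error)
}

// subLoaders maps sub-property names to their loaders (hypothetical registry).
var subLoaders = map[string]subLoader{
	"Content.User_Info":    {"Content_Info", func(p string) (string, error) { return "author of " + p, nil }},
	"Content.Comment_Info": {"Content_Info", func(p string) (string, error) { return "comments on " + p, nil }},
}

// runAll executes the jobs concurrently, stores results in out, and
// returns the first error it records.
func runAll(jobs map[string]func() (string, error), out map[string]string) error {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for name, job := range jobs {
		wg.Add(1)
		go func(name string, job func() (string, error)) {
			defer wg.Done()
			v, err := job()
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = fmt.Errorf("%s: %w", name, err)
				}
				return
			}
			out[name] = v
		}(name, job)
	}
	wg.Wait()
	return firstErr
}

// daoLoad loads the requested fields in two concurrent phases: basic
// properties first, then the sub-properties that depend on them. A parent
// pulled in by a sub-property also appears in the result.
func daoLoad(id int64, fields []string) (map[string]string, error) {
	out := make(map[string]string)
	basic := make(map[string]func() (string, error))
	var subs []string

	for _, f := range fields {
		if s, ok := subLoaders[f]; ok {
			subs = append(subs, f)
			f = s.parent // the parent must be loaded in phase one
		}
		load, ok := basicLoaders[f]
		if !ok {
			return nil, fmt.Errorf("unknown field %q", f)
		}
		basic[f] = func() (string, error) { return load(id) }
	}
	if err := runAll(basic, out); err != nil {
		return nil, err
	}

	subJobs := make(map[string]func() (string, error))
	for _, f := range subs {
		s := subLoaders[f]
		parent := out[s.parent]
		subJobs[f] = func() (string, error) { return s.load(parent) }
	}
	if err := runAll(subJobs, out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	res, err := daoLoad(123, []string{"Content_Info", "Content.User_Info", "Content.Comment_Info"})
	fmt.Println(len(res), err)            // 3 <nil>
	fmt.Println(res["Content.User_Info"]) // author of content#123
}
```

Callers only list field names. Adding a new property means registering one loader, and the two-phase scheduling stays unchanged.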
Conclusion
Toutiao’s migration to Go enabled a high‑performance, highly concurrent microservice platform that scales to millions of QPS and fits naturally into a cloud‑native environment. The reusable DAO component further simplifies cross‑service data aggregation while leveraging Go’s concurrency strengths.
Author: Xiang Chao, Senior R&D Engineer at Toutiao, joined in 2015, promoted Go adoption, and developed the internal kite microservice framework.