How Baidu’s Janus Gateway Powers Millions of QPS with Flexible Routing
This article analyzes Baidu's Janus gateway: why it was built despite BFE, its architecture and deployment, its three‑layer routing design, its variable and condition expression languages, performance benchmarks versus raw Go code, and how its plugin system enables both generic and highly customized traffic‑dispatch solutions.
Why Janus?
Janus is a high‑performance gateway built to complement Baidu’s existing BFE service. It provides a unified platform that can act as a traffic‑level gateway, a business‑level gateway, or a hybrid gateway. Janus enables each team to deploy and customize the gateway independently, supporting both generic routing and fine‑grained, business‑specific dispatch logic.
Use Cases and Deployment Topology
Janus handles internal traffic for more than a dozen middle‑platform services (feed, comment, like, follow, live) and external traffic for Baidu App, Zhidao, Baike, and other products.
Routing Rule Design
Routing rules are organized into three layers similar to Nginx: domain matching, tree‑based URL routing, and a feature‑matching stage that can execute a minimal scripting language for complex logic.
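The three layers can be pictured with a small Go sketch. This is not Janus's actual implementation: the types Request, Rule, and Router, the host name, and the cluster names are hypothetical, and the choice to evaluate only the rules at the deepest matching path node is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// Request is a hypothetical, simplified view of an incoming request.
type Request struct {
	Host   string
	Path   string
	Header map[string]string
}

// Rule is the third layer: a feature predicate plus a target cluster.
type Rule struct {
	Match   func(r Request) bool
	Cluster string
}

// node is one path segment in the URL routing tree (second layer).
type node struct {
	children map[string]*node
	rules    []Rule
}

func newNode() *node { return &node{children: map[string]*node{}} }

// Router maps each domain (first layer) to its own URL tree.
type Router struct {
	domains map[string]*node
}

func NewRouter() *Router { return &Router{domains: map[string]*node{}} }

// segments splits "/a/b" into ["a", "b"], ignoring empty segments.
func segments(path string) []string {
	return strings.FieldsFunc(path, func(c rune) bool { return c == '/' })
}

// Add registers a rule under a host and a path prefix.
func (rt *Router) Add(host, prefix string, rule Rule) {
	n, ok := rt.domains[host]
	if !ok {
		n = newNode()
		rt.domains[host] = n
	}
	for _, seg := range segments(prefix) {
		child, ok := n.children[seg]
		if !ok {
			child = newNode()
			n.children[seg] = child
		}
		n = child
	}
	n.rules = append(n.rules, rule)
}

// Route walks the three layers and returns the selected cluster.
func (rt *Router) Route(r Request) (string, bool) {
	n, ok := rt.domains[r.Host] // layer 1: domain match
	if !ok {
		return "", false
	}
	best := n
	for _, seg := range segments(r.Path) { // layer 2: longest path prefix
		child, ok := n.children[seg]
		if !ok {
			break
		}
		n = child
		if len(n.rules) > 0 {
			best = n
		}
	}
	for _, rule := range best.rules { // layer 3: first matching feature rule
		if rule.Match(r) {
			return rule.Cluster, true
		}
	}
	return "", false
}

func main() {
	rt := NewRouter()
	rt.Add("api.example.com", "/comment", Rule{
		Match:   func(r Request) bool { return r.Header["X-Gray"] == "1" },
		Cluster: "comment-gray",
	})
	rt.Add("api.example.com", "/comment", Rule{
		Match:   func(r Request) bool { return true },
		Cluster: "comment-default",
	})
	req := Request{Host: "api.example.com", Path: "/comment/list",
		Header: map[string]string{"X-Gray": "1"}}
	fmt.Println(rt.Route(req)) // comment-gray true
}
```

Rules at the same node are tried in registration order, so a catch-all rule registered last acts as the default for that path.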
Variable Expressions
System features are exposed as placeholders that can be referenced in routing rules. Common placeholders include:
${idc} – current data center
${time} – current timestamp
${query} – GET query parameters
${header} – request header values
Variables can be hierarchical, e.g. ${request.query.id} refers to the value of the id query parameter in the current request.
Condition Expressions
Janus defines a tiny expression language that supports logical operators (&&, ||, !), comparison functions, and user‑defined functions. Example function calls and logical expressions appear in the plugin configuration later in this article.
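One way to picture what a compiled condition could look like is a tree of Go closures, built once and then evaluated per request. This is only a sketch under assumptions: the Env, Value, and Cond types and the constructor names are hypothetical, and the semantics given to num_gt and str_equal (the function names the article uses in its plugin examples) are guesses from their names, not Janus's documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
)

// Env maps variable paths such as "response.code" to string values (hypothetical).
type Env map[string]string

// Value produces a string at evaluation time; Cond is a compiled condition.
type Value func(Env) string
type Cond func(Env) bool

// Var reads a variable from the environment; Lit is a constant.
func Var(path string) Value { return func(e Env) string { return e[path] } }
func Lit(s string) Value    { return func(Env) string { return s } }

// NumGt models num_gt(a, b): true when both sides parse as numbers and a > b.
func NumGt(a, b Value) Cond {
	return func(e Env) bool {
		x, errX := strconv.ParseFloat(a(e), 64)
		y, errY := strconv.ParseFloat(b(e), 64)
		return errX == nil && errY == nil && x > y
	}
}

// StrEqual models str_equal(a, b) as plain string equality.
func StrEqual(a, b Value) Cond {
	return func(e Env) bool { return a(e) == b(e) }
}

// And, Or, and Not model the &&, ||, and ! operators with short-circuiting.
func And(l, r Cond) Cond { return func(e Env) bool { return l(e) && r(e) } }
func Or(l, r Cond) Cond  { return func(e Env) bool { return l(e) || r(e) } }
func Not(c Cond) Cond    { return func(e Env) bool { return !c(e) } }

func main() {
	// num_gt(${response.code}, 499) || (!str_equal(${response.header.sla_status}, 0))
	cond := Or(
		NumGt(Var("response.code"), Lit("499")),
		Not(StrEqual(Var("response.header.sla_status"), Lit("0"))),
	)
	fmt.Println(cond(Env{"response.code": "503", "response.header.sla_status": "0"})) // true
	fmt.Println(cond(Env{"response.code": "200", "response.header.sla_status": "0"})) // false
	fmt.Println(cond(Env{"response.code": "200", "response.header.sla_status": "1"})) // true
}
```

Because the tree is built ahead of time, evaluating a condition costs a handful of function calls and no parsing.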
Performance Comparison
Variable and condition expressions are compiled to Go code on the data plane. Benchmarks on an 11th‑Gen Intel Core i5 show roughly 9% overhead compared with hand‑written Go (34.52 ns/op versus 31.63 ns/op):
goos: windows
goarch: amd64
cpu: 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz
BenchmarkRandom-8       35817918    34.52 ns/op    0 B/op    0 allocs/op
BenchmarkRawRandom-8    39136900    31.63 ns/op    0 B/op    0 allocs/op
Plugin Design: Disaster‑Recovery and Cache
Plugins are configured with condition expressions, allowing the same plugin to be reused with different trigger logic.
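The reuse idea can be sketched in Go by keeping the plugin's behavior fixed and injecting the trigger. The Response, Trigger, and FallbackPlugin types and the fallback body are hypothetical. In Janus the trigger would come from a condition expression in configuration, which this example replaces with plain Go functions.

```go
package main

import "fmt"

// Response is a hypothetical, simplified upstream response.
type Response struct {
	Code   int
	Header map[string]string
	Body   string
}

// Trigger decides whether a plugin should act on a response.
type Trigger func(Response) bool

// FallbackPlugin is an assumed disaster-recovery plugin: when its trigger
// fires, it swaps the upstream response for a static fallback.
type FallbackPlugin struct {
	Name     string
	Trigger  Trigger
	Fallback Response
}

// Apply returns the fallback when the trigger fires, else the original response.
func (p FallbackPlugin) Apply(resp Response) Response {
	if p.Trigger(resp) {
		return p.Fallback
	}
	return resp
}

func main() {
	fallback := Response{Code: 200, Body: `{"errno":0,"data":[]}`}

	// Same plugin type, two different trigger conditions.
	on5xx := FallbackPlugin{
		Name:     "on-5xx",
		Trigger:  func(r Response) bool { return r.Code > 499 },
		Fallback: fallback,
	}
	on5xxOrSLA := FallbackPlugin{
		Name: "on-5xx-or-sla",
		Trigger: func(r Response) bool {
			return r.Code > 499 || r.Header["sla_status"] != "0"
		},
		Fallback: fallback,
	}

	degraded := Response{Code: 200, Header: map[string]string{"sla_status": "1"}, Body: "partial"}
	fmt.Println(on5xx.Apply(degraded).Body)      // partial
	fmt.Println(on5xxOrSLA.Apply(degraded).Body) // {"errno":0,"data":[]}
}
```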
Trigger when response code > 499:
num_gt(${response.code}, 499)
Trigger on 5xx or a non‑zero JSON error code:
num_gt(${response.code}, 499) || (!str_equal(${response.jsonbody.errno}, 0))
Trigger on 5xx or a non‑zero SLA status header:
num_gt(${response.code}, 499) || (!str_equal(${response.header.sla_status}, 0))
For the cache plugin, the cache key is defined via variable expressions, enabling services to customize keys, e.g.:
comment_${request.query.id}
fans_${request.query.id}_${request.query.uk}
homepage_${request.query.uk}
Generalization and Future Work
The combination of basic routing, variable expressions, and condition expressions solves the majority of traffic‑dispatch scenarios and can be generalized to other functionalities such as rate limiting, authentication, or custom business logic. Future work includes tighter integration with Go’s standard library to expose more dynamic programming capabilities from the control plane, further expanding Janus’s flexibility for emerging use cases.
