Designing High‑Availability Services: Architecture Boundaries, Protocols, and Push Systems
This article summarizes what Tencent’s internal high‑availability service curriculum emphasizes: clear architecture boundaries, unified protocol definitions using JCE, a unified PushAPI, and monitoring and feedback mechanisms. It also covers the organizational impact of aligning system boundaries with team boundaries to achieve scalable, reliable backend services.
Background: Tencent runs a popular internal course series on massive‑scale services. It focuses on high availability for services with millions of daily active users. The core idea is that usability and high availability are the foundation of large‑scale internet services.
Architecture Boundaries: When designing backend systems for massive‑scale services, the primary concern is defining clear boundaries. Key points include boundary thinking, separation of responsibilities, a contract‑first mindset, high cohesion, low coupling, and clear layering.
Example – HTTP Interaction: An app communicates with the backend via HTTP POST. The protocol layout includes a 3‑byte magic header (YYB), a 4‑byte version field, and a body defined by JCE structures. Sample JCE definitions:
struct ReqHead {
    int cmdId;
    ...
}
struct Request {
    ReqHead head;
    vector body;
}
struct PkgReq {
    PkgReqHead head;
    Request request;
}
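The framing described above (3‑byte magic, 4‑byte version, then the body) can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the function names are hypothetical, the big‑endian byte order of the version field is an assumption the source does not state, and the body is treated as opaque bytes that a JCE encoder has already produced.

```python
import struct
from typing import Tuple

MAGIC = b"YYB"                 # 3-byte magic header from the layout above
HEADER_LEN = len(MAGIC) + 4    # magic plus the 4-byte version field


def pack_frame(version: int, body: bytes) -> bytes:
    """Prefix an already-serialized body with the magic and version."""
    # ">I" is a big-endian unsigned 32-bit integer; the byte order is assumed.
    return MAGIC + struct.pack(">I", version) + body


def unpack_frame(frame: bytes) -> Tuple[int, bytes]:
    """Validate the header and return (version, body)."""
    if len(frame) < HEADER_LEN:
        raise ValueError("frame is shorter than the fixed header")
    if frame[:len(MAGIC)] != MAGIC:
        raise ValueError("unexpected magic bytes")
    (version,) = struct.unpack(">I", frame[len(MAGIC):HEADER_LEN])
    return version, frame[HEADER_LEN:]
```

Rejecting a frame on a bad magic or a short header before touching the body keeps the boundary check cheap and lets the server drop malformed POST payloads early.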
Push Subsystem: A unified PushAPI interface abstracts single‑device, multi‑device, all‑online‑device, and all‑device pushes, hiding vendor‑specific channels (GCM, APNS, Xiaomi, Huawei, etc.). The JCE file PushAPI.jce defines request/response structures, ensuring a single contract for business layers.
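To make the idea of a single push contract concrete, here is a rough sketch of one entry point covering the four push scopes while hiding the vendor channel. All names here (Target, Device, PushRequest, the sender callables) are hypothetical and are not taken from PushAPI.jce; real vendor SDK calls are replaced by plain callables.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, Iterable, List


class Target(Enum):
    SINGLE_DEVICE = 1
    MULTI_DEVICE = 2
    ALL_ONLINE = 3
    ALL_DEVICES = 4


@dataclass
class Device:
    device_id: str
    channel: str          # vendor channel name, e.g. "apns" or "gcm"
    online: bool = True


@dataclass
class PushRequest:
    target: Target
    payload: str
    device_ids: List[str] = field(default_factory=list)


# A sender stands in for one vendor channel and reports success or failure.
Sender = Callable[[Device, str], bool]


class PushAPI:
    """Single entry point; callers never pick a vendor channel themselves."""

    def __init__(self, devices: Iterable[Device], senders: Dict[str, Sender]):
        self._devices = {d.device_id: d for d in devices}
        self._senders = senders

    def _resolve(self, req: PushRequest) -> List[Device]:
        if req.target in (Target.SINGLE_DEVICE, Target.MULTI_DEVICE):
            return [self._devices[i] for i in req.device_ids if i in self._devices]
        if req.target is Target.ALL_ONLINE:
            return [d for d in self._devices.values() if d.online]
        return list(self._devices.values())

    def push(self, req: PushRequest) -> Dict[str, bool]:
        """Return a per-device delivery result keyed by device id."""
        results = {}
        for device in self._resolve(req):
            sender = self._senders.get(device.channel)
            # A device whose channel has no registered sender counts as failed.
            results[device.device_id] = bool(sender and sender(device, req.payload))
        return results
```

Because the business layer only builds a PushRequest, adding or replacing a vendor channel changes the senders mapping and nothing else.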
Monitoring & Feedback: The Push subsystem persists tasks, is itself highly available, and records the delivery status of each push. Monitoring dashboards show exception rates, traffic, and latency, enabling rapid diagnosis and adjustment of load‑balancing thresholds.
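The kind of numbers such a dashboard reports can be derived from recorded delivery results. The sketch below is an assumption‑laden illustration rather than the article's monitoring code: the class name, the fixed‑size sample window, the nearest‑rank percentile, and the 5% default threshold are all hypothetical choices.

```python
from collections import deque
from typing import Deque, Tuple


class DeliveryWindow:
    """Keeps the most recent delivery samples and derives simple metrics."""

    def __init__(self, size: int = 1000):
        # Each sample is (succeeded, latency in milliseconds).
        self._samples: Deque[Tuple[bool, float]] = deque(maxlen=size)

    def record(self, ok: bool, latency_ms: float) -> None:
        self._samples.append((ok, latency_ms))

    def exception_rate(self) -> float:
        if not self._samples:
            return 0.0
        failures = sum(1 for ok, _ in self._samples if not ok)
        return failures / len(self._samples)

    def latency_percentile(self, pct: int) -> float:
        """Nearest-rank percentile of latency for an integer pct in 1..100."""
        if not self._samples:
            return 0.0
        ordered = sorted(latency for _, latency in self._samples)
        rank = max(1, (pct * len(ordered) + 99) // 100)  # ceil(pct * n / 100)
        return ordered[rank - 1]

    def over_threshold(self, max_exception_rate: float = 0.05) -> bool:
        """True when the recent exception rate exceeds the given threshold."""
        return self.exception_rate() > max_exception_rate
```

A load‑balancing component could consult over_threshold before routing more traffic to a node; the threshold itself is the value operators would tune from the dashboard.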
Organizational Impact: When system boundaries align with team boundaries, clear contracts, unified monitoring, and consistent metrics reduce friction between teams. The article describes a conversion layer (PDUBridgeServer) that simplifies L5 service calls, centralizes monitoring, and reduces deployment complexity.
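One way to picture a conversion layer that centralizes monitoring is a thin router that every backend call passes through. The sketch below is not PDUBridgeServer and does not model L5; the Bridge class, its methods, and the stats fields are hypothetical, and backends are stand‑in callables keyed by a command id.

```python
import time
from collections import defaultdict
from typing import Callable, Dict

Handler = Callable[[bytes], bytes]


class Bridge:
    """Routes calls by command id and records metrics in one place."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._routes: Dict[int, Handler] = {}
        self._clock = clock
        self.stats = defaultdict(
            lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0}
        )

    def register(self, cmd_id: int, handler: Handler) -> None:
        self._routes[cmd_id] = handler

    def call(self, cmd_id: int, body: bytes) -> bytes:
        handler = self._routes.get(cmd_id)
        if handler is None:
            raise KeyError(f"no backend registered for cmdId {cmd_id}")
        entry = self.stats[cmd_id]
        entry["calls"] += 1
        start = self._clock()
        try:
            return handler(body)
        except Exception:
            entry["errors"] += 1
            raise  # the caller still sees the original failure
        finally:
            # Latency is accumulated for successful and failed calls alike.
            entry["total_seconds"] += self._clock() - start
```

Since callers depend only on call(cmd_id, body), each team can change its backend behind the bridge while the metrics stay uniform across services.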
Feedback Loop: Architecture feedback includes health metrics, call chains, performance data, and business statistics. Comprehensive monitoring beyond simple alerts informs iterative improvements to the system.
References: The article lists further reading on architecture evolution from major tech companies.
High Availability Architecture
Official account for High Availability Architecture.