Inside BlueKing: How Tencent Game’s Ops Platform Automates, Scales, and Powers Data‑Driven Operations
This article outlines the evolution of Tencent Game’s BlueKing platform—from its origins in operations transformation to its modular design philosophy, six integrated services, and three maturity stages—showcasing how atomic automation, PaaS tools, and real‑time data empower scalable, data‑driven application operations.
1. Background: Operations Transformation
Ten years ago, Tencent Game’s operations work consisted of manual, demand‑driven tasks covering servers, networks, operating systems, databases, releases, changes, monitoring, and incident handling. Five years ago, a small team began shifting from pure "operational service output" to "solution‑oriented service output". Three years later, the entire operations team embarked on a difficult transformation, and the BlueKing system was built to support it.
The transformation was driven by three main reasons:
Intense market competition demanded finer‑grained operations tools and higher availability for product, planning, and development teams.
The traditional operations role was shrinking as new technologies reduced the need for manual tasks.
Operational staff were exhausted by the sheer volume of release, change, and incident work.
The long‑term goal became to automate basic operations (release, monitoring, data extraction) to achieve unattended operations and provide solution‑oriented tools.
2. Design Philosophy
BlueKing’s design avoids dependence on any specific business architecture, technology stack, or uniform workflow. It abstracts each manual step into an atomic unit, automates the atom, and connects atoms through a task engine into linear or tree‑shaped workflows, similar to an SOA approach.
Two key activities enable this:
Encapsulating command‑line steps as reusable automation atoms on the BlueKing Job platform.
Integrating UI‑driven steps via the BlueKing Integration Platform’s ESB, turning them into callable atoms.
Because atoms are technology‑agnostic, any operation that a human can perform via Linux commands can be automated.
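The atom and task-engine idea can be illustrated with a minimal Python sketch. It is not BlueKing's actual API: the names `Atom`, `shell_atom`, `Node`, and `run_workflow` are hypothetical, and the branching rule (follow one set of children on success, another on failure) is an assumption about how a tree-shaped workflow might be modeled.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an atom is any callable that receives a shared context
# dict and returns True on success, False on failure.
Atom = Callable[[Dict[str, str]], bool]


def shell_atom(command: str) -> Atom:
    """Wrap one command-line step as an atom (hypothetical helper).

    Placeholders such as {host} are filled from the context. A real
    system would have to quote these values to avoid shell injection.
    """
    def run(context: Dict[str, str]) -> bool:
        result = subprocess.run(
            command.format(**context),
            shell=True, capture_output=True, text=True,
        )
        context["last_output"] = result.stdout.strip()
        return result.returncode == 0
    return run


@dataclass
class Node:
    """One step in a workflow; children form a linear chain or a tree."""
    name: str
    atom: Atom
    on_success: List["Node"] = field(default_factory=list)
    on_failure: List["Node"] = field(default_factory=list)


def run_workflow(node: Node, context: Dict[str, str]) -> List[str]:
    """Run a node, then follow its success or failure branch depth-first."""
    ok = node.atom(context)
    trace = [f"{node.name}:{'ok' if ok else 'failed'}"]
    for child in (node.on_success if ok else node.on_failure):
        trace.extend(run_workflow(child, context))
    return trace
```

A linear workflow is simply a chain in which every node has one `on_success` child. A command-line step would be wrapped as, for example, `Node("check", shell_atom("echo {host}"))`, while a UI-driven step exposed through an API could be wrapped as an ordinary Python function with the same signature.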
3. BlueKing Platform Components
BlueKing consists of six platforms:
BlueKing Integration Platform: Provides PaaS, ESB, development framework, and web samples for building operation tools.
BlueKing Mobile Platform: Mobile entry point for the ecosystem.
BlueKing Job Platform: Handles file transfers and script execution as services.
BlueKing Configuration Platform: Stores hierarchical business attributes and provides API access.
BlueKing Control Platform: Offers standardized agents for OS, container, and big‑data control.
BlueKing Data Platform: Built on Kafka and Storm, it delivers real‑time computation, an online IDE with YAML‑based logic, and data dictionaries for operational decision support.
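As one illustration of what "hierarchical business attributes" could look like behind an API, the sketch below stores attributes on paths such as business, zone, and module. The `ConfigTree` class, the path layout, and the rule that a node inherits attributes from its ancestors are all assumptions made for this example, not a description of the Configuration Platform's real data model.

```python
from typing import Any, Dict, Tuple

Path = Tuple[str, ...]


class ConfigTree:
    """Hypothetical store for attributes attached to hierarchy paths."""

    def __init__(self) -> None:
        self._attrs: Dict[Path, Dict[str, Any]] = {}

    def set(self, path: Path, **attrs: Any) -> None:
        """Attach attributes to one node, e.g. ("game_a", "zone_1")."""
        self._attrs.setdefault(path, {}).update(attrs)

    def get(self, path: Path, key: str, default: Any = None) -> Any:
        """Look up a key, walking from the most specific node to the root.

        Assumption: a child inherits any attribute it does not override.
        """
        for depth in range(len(path), -1, -1):
            node = self._attrs.get(path[:depth], {})
            if key in node:
                return node[key]
        return default
```

With this layout, an attribute such as an owner can be set once at the business level and overridden only where a zone or module differs.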
4. Maturity Stages
Stage 1: Basic Automation
Repeated, environment‑triggered tasks such as scaling, merging servers, and incident handling are fully automated, freeing operators to sleep at night.
Stage 2: Assist Product Operations
Operations that are triggered by product, planning, or development teams (e.g., releases, configuration changes, data extraction) are packaged as self‑service apps on the Integration Platform, allowing non‑operations staff to execute them directly.
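A self-service app of this kind can be thought of as an operation wrapped with a permission check and parameter validation. The sketch below is a hypothetical model only: `SelfServiceApp`, its role names, and its fields are invented for illustration and do not reflect the Integration Platform's actual framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class SelfServiceApp:
    """Hypothetical wrapper that lets non-operations staff run one task."""
    name: str
    allowed_roles: Set[str]
    required_params: Set[str]
    action: Callable[[Dict[str, str]], str]

    def execute(self, role: str, params: Dict[str, str]) -> str:
        # Reject callers whose role was not granted access to this app.
        if role not in self.allowed_roles:
            raise PermissionError(f"role {role!r} may not run {self.name}")
        # Refuse to start if any required input is absent.
        missing = self.required_params - set(params)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        return self.action(params)
```

The point of the wrapper is that the guard rails live in the app, so the person pressing the button does not need operations expertise to run the task safely.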
Stage 3: Data‑Driven Operations
The Data Platform lowers the barrier for operations staff to perform real‑time analytics on logs, user behavior, and operational metrics using YAML‑described logic and Storm processing, enabling data‑driven decision support and automated operational insights.
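To make "YAML-described logic" concrete, the sketch below evaluates a small declarative rule over log events. The rule schema (`filter`, `group_by`, `window_seconds`, `alert_threshold`) is hypothetical and is written as the Python dict a YAML file might parse into. The function processes a finished batch for simplicity, whereas a Storm topology would maintain the same counts incrementally as events arrive.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical rule, shown as the dict a YAML document might parse into.
RULE: Dict[str, Any] = {
    "filter": {"field": "level", "equals": "ERROR"},
    "group_by": "module",
    "window_seconds": 60,
    "alert_threshold": 2,
}


def evaluate(rule: Dict[str, Any],
             events: Iterable[Dict[str, Any]]) -> List[Tuple[int, str, int]]:
    """Count matching events per (window, group); return those at or above threshold.

    Windows are tumbling: each event falls into exactly one window,
    identified by its start timestamp in seconds.
    """
    size = rule["window_seconds"]
    counts: Counter = Counter()
    for event in events:
        if event.get(rule["filter"]["field"]) != rule["filter"]["equals"]:
            continue
        window_start = int(event["ts"]) // size * size
        counts[(window_start, event[rule["group_by"]])] += 1
    return sorted(
        (window, group, n)
        for (window, group), n in counts.items()
        if n >= rule["alert_threshold"]
    )
```

Because the rule is data rather than code, an operator could change the filter or threshold without touching the processing logic, which is the kind of lowered barrier the stage describes.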
Operations can now provide high‑value analytics that development teams cannot afford to build themselves.
5. Services: PaaS and SaaS
All core services are offered as PaaS: developers create custom apps on the Integration Platform, write scripts on the Job Platform, and build dashboards on the Data Platform. To avoid duplicated effort, BlueKing also provides SaaS solutions such as the "Standard Ops" app for release automation and a generic "Fault Self‑Healing" service for incident recovery.
These offerings enable the operations function to deliver higher‑value services rather than be replaced, supporting product, planning, development, and testing teams across the organization.
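The "Fault Self‑Healing" idea mentioned above can be sketched as a dispatcher that maps alert types to remediation handlers and escalates to a human when no handler succeeds. Everything here is an assumption for illustration: the `SelfHealer` class, the alert dict shape, and the retry-then-escalate policy are not taken from the BlueKing service itself.

```python
from typing import Callable, Dict, List

# Assumption: a handler receives the alert and returns True if it fixed it.
Handler = Callable[[Dict[str, str]], bool]


class SelfHealer:
    """Hypothetical dispatcher from alert types to remediation handlers."""

    def __init__(self, max_attempts: int = 2) -> None:
        self.handlers: Dict[str, Handler] = {}
        self.max_attempts = max_attempts
        self.escalated: List[Dict[str, str]] = []

    def register(self, alert_type: str, handler: Handler) -> None:
        self.handlers[alert_type] = handler

    def handle(self, alert: Dict[str, str]) -> str:
        """Try the registered handler up to max_attempts, else escalate."""
        handler = self.handlers.get(alert["type"])
        if handler is not None:
            for _ in range(self.max_attempts):
                if handler(alert):
                    return "healed"
        # Unknown alert type or repeated failure: hand over to a person.
        self.escalated.append(alert)
        return "escalated"
```

A handler would typically trigger an existing automated job, such as cleaning a disk or restarting a process, so the same remediation can be reused across products.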
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
