Inside BlueKing: How Tencent Game’s Ops Platform Automates, Scales, and Powers Data‑Driven Operations

This article outlines the evolution of Tencent Game’s BlueKing platform—from its origins in operations transformation to its modular design philosophy, six integrated services, and three maturity stages—showcasing how atomic automation, PaaS tools, and real‑time data empower scalable, data‑driven application operations.

Efficient Ops

1. Background: Operations Transformation

Ten years ago, Tencent Game’s operations centered on manual, demand‑driven tasks: server, network, OS, and database administration, plus release, change, monitoring, and incident handling. Five years ago, a small team began shifting from pure "operational service output" to "solution‑oriented service output". Three years later, the entire operations team embarked on a difficult transformation, and the BlueKing system was built to support it.

The transformation was driven by three main reasons:

Intense market competition required finer‑grained operations tools and higher availability for products, planners, and developers.

The traditional operations role was shrinking as new technologies reduced the need for manual tasks.

Operational staff were exhausted by the sheer volume of release, change, and incident work.

The long‑term goal became to automate basic operations (release, monitoring, data extraction) until they run unattended, and to provide solution‑oriented tools.

2. Design Philosophy

BlueKing’s design avoids dependence on any specific business architecture, technology stack, or uniform workflow. It abstracts each manual step into an atomic unit, automates the atom, and connects atoms through a task engine into linear or tree‑shaped workflows, similar to an SOA approach.
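The idea of chaining atoms into linear or tree‑shaped workflows can be sketched roughly as follows. This is a minimal illustration, not BlueKing's actual task engine: the `Step` class, the `run` function, and the atom names are all hypothetical, and the sketch omits error handling, retries, and parallel execution.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an "atom" is modeled as a function that takes and returns a context dict.
Atom = Callable[[Dict], Dict]


@dataclass
class Step:
    """One workflow step; its children run after the step's atom has run."""
    name: str
    atom: Atom
    children: List["Step"] = field(default_factory=list)


def run(step: Step, context: Dict, log: List[str]) -> None:
    """Run the atom, then run each child depth-first with a copy of the result."""
    result = step.atom(dict(context))
    log.append(step.name)
    for child in step.children:
        run(child, result, log)


# Hypothetical atoms, for illustration only.
def stop_service(ctx: Dict) -> Dict:
    ctx["service"] = "stopped"
    return ctx


def push_package(ctx: Dict) -> Dict:
    ctx["version"] = "1.2.0"
    return ctx


def restart_zone(zone: str) -> Atom:
    def atom(ctx: Dict) -> Dict:
        ctx["restarted"] = zone
        return ctx
    return atom


# A tree-shaped workflow: two linear steps, then a fan-out to two zones.
workflow = Step("stop_service", stop_service, [
    Step("push_package", push_package, [
        Step("restart_zone_a", restart_zone("a")),
        Step("restart_zone_b", restart_zone("b")),
    ]),
])

log: List[str] = []
run(workflow, {}, log)
print(log)
```

A linear workflow is simply the special case in which every step has at most one child.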

Two key activities enable this:

Encapsulating command‑line steps as reusable automation atoms on the BlueKing Job platform.

Integrating UI‑driven steps via the BlueKing Integration Platform’s ESB, turning them into callable atoms.

By keeping atoms technology‑agnostic, any operation that a human can perform via Linux commands can be automated.
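As a rough sketch of the first activity, a command‑line step can be wrapped as a reusable, callable atom. The `command_atom` helper and `AtomResult` type below are assumptions made for illustration; they are not the Job Platform's API, which the article does not describe in detail.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AtomResult:
    ok: bool
    stdout: str
    stderr: str


def command_atom(argv: List[str], timeout: float = 30.0) -> Callable[[], AtomResult]:
    """Wrap a command line as a callable that reports success or failure uniformly."""
    def atom() -> AtomResult:
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a timeout becomes a failed result, not a crash.
            return AtomResult(False, "", str(exc))
        return AtomResult(proc.returncode == 0, proc.stdout, proc.stderr)
    return atom


# The Python interpreter stands in for an arbitrary command so the example is portable.
say_hello = command_atom([sys.executable, "-c", "print('hello')"])
result = say_hello()

# A command that does not exist yields a failed result instead of an exception.
missing = command_atom(["definitely-not-a-real-command-xyz"])()
```

Returning a uniform result object, rather than raising, is one possible design choice: it lets a task engine decide what to do on failure without knowing anything about the command itself.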

3. BlueKing Platform Components

BlueKing consists of six platforms:

BlueKing Integration Platform: Provides PaaS, ESB, a development framework, and web samples for building operation tools.

BlueKing Mobile Platform: Mobile entry point for the ecosystem.

BlueKing Job Platform: Handles file transfers and script execution as services.

BlueKing Configuration Platform: Stores hierarchical business attributes and provides API access.

BlueKing Control Platform: Offers standardized agents for OS, container, and big‑data control.

BlueKing Data Platform: Built on Kafka and Storm, it delivers real‑time computation, an online IDE with YAML‑based logic, and data dictionaries for operational decision support.
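To make "hierarchical business attributes" concrete, here is a small sketch of a lookup over a business hierarchy. The `ConfigNode` class, the three levels shown, and the rule that a child inherits attributes from its parent are all assumptions for illustration; the article does not specify how the Configuration Platform models its hierarchy.

```python
from typing import Any, Dict, Optional


class ConfigNode:
    """A node in a hypothetical business hierarchy (for example business, zone, host)."""

    def __init__(self, name: str, attrs: Optional[Dict[str, Any]] = None,
                 parent: Optional["ConfigNode"] = None) -> None:
        self.name = name
        self.attrs = dict(attrs or {})
        self.parent = parent

    def get(self, key: str, default: Any = None) -> Any:
        """Return the nearest value for key, walking up toward the root."""
        node: Optional[ConfigNode] = self
        while node is not None:
            if key in node.attrs:
                return node.attrs[key]
            node = node.parent
        return default


business = ConfigNode("game-x", {"owner": "ops-team", "region": "cn"})
zone = ConfigNode("zone-1", {"region": "sg"}, parent=business)
host = ConfigNode("10.0.0.1", {"role": "gateway"}, parent=zone)

region = host.get("region")   # overridden at the zone level
owner = host.get("owner")     # inherited from the business level
```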

4. Maturity Stages

Stage 1: Basic Automation

Repeated, environment‑triggered tasks such as scaling, merging servers, and incident handling are fully automated, freeing operators to sleep at night.
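One way to picture environment‑triggered automation is a dispatcher that maps incoming events to registered handlers and escalates anything it does not recognize. The event types, handler names, and `Dispatcher` class here are hypothetical and only illustrate the pattern, not how BlueKing implements it.

```python
from typing import Callable, Dict

Handler = Callable[[Dict], str]


class Dispatcher:
    """Route events to automated handlers; unknown events go to a human."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type] = handler

    def handle(self, event: Dict) -> str:
        handler = self._handlers.get(event["type"])
        if handler is None:
            return "escalated to on-call"
        return handler(event)


# Hypothetical handlers; real ones would launch automation workflows.
def clean_disk(event: Dict) -> str:
    return f"cleaned logs on {event['host']}"


def scale_out(event: Dict) -> str:
    return f"added one server to {event['zone']}"


dispatcher = Dispatcher()
dispatcher.register("disk_full", clean_disk)
dispatcher.register("high_load", scale_out)

outcome_disk = dispatcher.handle({"type": "disk_full", "host": "10.0.0.1"})
outcome_load = dispatcher.handle({"type": "high_load", "zone": "zone-1"})
outcome_other = dispatcher.handle({"type": "unknown_alarm"})
```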

Stage 2: Assist Product Operations

Operations that are triggered by product, planning, or development teams (e.g., releases, configuration changes, data extraction) are packaged as self‑service apps on the Integration Platform, allowing non‑operations staff to execute them directly.

Stage 3: Data‑Driven Operations

The Data Platform lowers the barrier for operations to perform real‑time analytics on logs, user behavior, and operational metrics using YAML‑described logic and Storm processing, enabling data‑driven decision support and automated operational insights.
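The appeal of describing logic declaratively is that the operator states what to compute and a generic engine does the rest. The sketch below imitates that idea in plain Python: a dict stands in for the YAML description, and a batch function stands in for the streaming job. The spec fields (`filter`, `group_by`, `window_seconds`) are invented for this example and are not the Data Platform's actual schema, and the code does not use Kafka or Storm.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable

# Assumption: a declarative spec, as a YAML file might express it once parsed.
SPEC = {
    "filter": {"level": "error"},
    "group_by": "event",
    "window_seconds": 60,
}


def aggregate(records: Iterable[Dict[str, Any]], spec: Dict[str, Any]) -> Dict[int, Dict[str, int]]:
    """Count matching records per group within fixed (tumbling) time windows."""
    windows: Dict[int, Counter] = defaultdict(Counter)
    for rec in records:
        # Skip records that do not match every filter condition.
        if any(rec.get(key) != value for key, value in spec["filter"].items()):
            continue
        # Align the timestamp to the start of its window.
        window_start = rec["ts"] - rec["ts"] % spec["window_seconds"]
        windows[window_start][rec[spec["group_by"]]] += 1
    return {start: dict(counts) for start, counts in windows.items()}


records = [
    {"ts": 5, "level": "error", "event": "login_fail"},
    {"ts": 42, "level": "error", "event": "login_fail"},
    {"ts": 50, "level": "info", "event": "login_ok"},
    {"ts": 61, "level": "error", "event": "pay_fail"},
]

counts = aggregate(records, SPEC)
```

Changing the analysis then means editing the spec rather than the code, which is the kind of barrier‑lowering the paragraph above describes.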

Operations can now provide high‑value analytics that development teams cannot afford to build themselves.

5. Services: PaaS and SaaS

All core services are offered as PaaS: developers create custom apps on the Integration Platform, write scripts on the Job Platform, and build dashboards on the Data Platform. To avoid duplicated effort, BlueKing also provides SaaS solutions such as the "Standard Ops" app for release automation and a generic "Fault Self‑Healing" service for incident recovery.

These offerings let operations deliver higher‑level services without replacing the operations function, supporting product, planning, development, and testing teams across the organization.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
