
From PHP Coder to DevOps Leader: Tencent CDN & Game Data Ops Insights

This article chronicles Liu Tiansi’s 13‑year journey from a PHP/Java developer to a senior operations engineer at Tencent, detailing his open‑source contributions, large‑scale CDN management, massive game data pipelines, DevOps practices, and practical lessons for building reliable, automated infrastructure.

Efficient Ops

Guest Introduction

Liu Tiansi has 13 years of experience in internet operations and currently works at Tencent Interactive Entertainment, where he is responsible for big-data operations for games. He previously served as chief architect and system administrator at Tianya Community. His focus areas include open-source technologies, load balancing, caching, databases, NoSQL, distributed storage, messaging middleware, big data, cloud computing, Mesos, Docker, and DevOps.

Career Path

He started as a PHP/Java developer in 2003 and joined Tianya Community in 2005, where he helped migrate the infrastructure from Windows to an open-source stack and reach 99.99% availability for key services.
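
To put the 99.99% figure in perspective, the short calculation below converts an availability target into a yearly downtime budget. It is a generic illustration, not a description of how Tianya measured availability; the function name and the 365-day year are assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_budget_minutes(availability: float) -> float:
    """Return the allowed downtime per year, in minutes, for an availability ratio."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"{downtime_budget_minutes(0.9999):.2f} minutes per year")
```

At 99.99% this leaves a budget of a little under an hour of downtime per year.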

Key Achievements at Tianya

Deployed open-source components at scale, including LVS, Squid, HAProxy, MongoDB, MySQL, and Cfengine.

Led the architecture transformation, working through a lack of reference material by trial and error.

Developed open-source projects, such as an LVS management platform and a server rack simulation platform, that were adopted by other companies.

[Figure: Tianya IT management architecture, 2010]

Tencent CDN Operations

Since joining Tencent in 2011, Liu has managed more than 400 CDN acceleration nodes covering static content, game downloads, UGC, streaming, and dynamic acceleration. Together they carry more than 10 Tb/s of bandwidth for services such as QQ, WeChat, and Tencent Video.

Traffic scheduling relies on GSLB (global server load balancing) and HTTP-DNS. Tencent's CDN optimizations include TCP stack tuning, DiskTank for small-file I/O, and real-time link adjustments based on global topology measurements.
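
The scheduling idea can be sketched as picking, for each client region, the healthy node with the lowest measured latency. This is a toy model under stated assumptions: the `Node` type, the region names, and the latency table are hypothetical, and the article does not describe Tencent's actual GSLB logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    name: str      # hypothetical node identifier
    healthy: bool  # result of the latest health check


def pick_node(
    region: str,
    nodes: List[Node],
    latency_ms: Dict[str, Dict[str, float]],
) -> Optional[Node]:
    """Return the healthy node with the lowest measured latency for a region.

    latency_ms maps region -> node name -> latency in milliseconds.
    Nodes without a measurement for the region are skipped.
    """
    measured = latency_ms.get(region, {})
    candidates = [n for n in nodes if n.healthy and n.name in measured]
    if not candidates:
        return None
    return min(candidates, key=lambda n: measured[n.name])


if __name__ == "__main__":
    nodes = [Node("node-a", True), Node("node-b", True), Node("node-c", False)]
    latency = {"south": {"node-a": 35.0, "node-b": 12.0, "node-c": 5.0}}
    print(pick_node("south", nodes, latency))
```

In this example `node-c` has the best latency but is unhealthy, so the region is steered to `node-b`. A real scheduler would also weigh capacity, cost, and link quality.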

Challenges and Solutions

The major issues are domain (DNS) hijacking and content hijacking by small ISPs. They are detected by proactively comparing DNS resolution results and by passive monitoring, with HTTPS used as a mitigation.
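
A minimal sketch of the proactive comparison step: resolve a domain from a vantage point and flag any answer that falls outside the set of addresses the CDN is known to serve. The expected-address set and the use of the system resolver through `socket.getaddrinfo` are assumptions for illustration; the article does not describe Tencent's actual tooling.

```python
import socket
from typing import Iterable, Set


def resolve_ipv4(domain: str) -> Set[str]:
    """Resolve a domain to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(domain, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}


def unexpected_answers(resolved: Iterable[str], expected: Iterable[str]) -> Set[str]:
    """Return resolved addresses that are not in the expected set.

    A non-empty result suggests the answer may have been tampered with
    and should be investigated.
    """
    return set(resolved) - set(expected)


if __name__ == "__main__":
    # Hypothetical expected addresses (documentation range from RFC 5737).
    expected = {"192.0.2.10", "192.0.2.11"}
    suspicious = unexpected_answers(resolve_ipv4("example.com"), expected)
    print("suspicious answers:", sorted(suspicious))
```

Running the same check from probes inside many ISP networks, and comparing the results, is what makes the detection proactive.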

Tencent Game Data Operations

Since 2013, Liu has built the game data operations platform, which supports hundreds of games, processes 7,000 billion log records (≈50 TB) daily, and schedules 100,000 tasks.
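
As a rough sanity check on the scale, the daily volume can be converted into an average sustained rate. The calculation assumes decimal units (1 TB = 10^12 bytes) and a perfectly even load over 24 hours, which real traffic will not have.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_bytes_per_second(terabytes_per_day: float) -> float:
    """Convert a daily volume in decimal terabytes to bytes per second."""
    return terabytes_per_day * 1e12 / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_rate_bytes_per_second(50)
    print(f"{rate / 1e6:.1f} MB/s, {rate * 8 / 1e9:.2f} Gbit/s")
```

For 50 TB per day this works out to roughly 579 MB/s, or about 4.6 Gbit/s, on average; peaks will be higher.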

Key components include TDW, a data service platform based on Hadoop and Hive; TDBank, a custom transport layer (similar to Flume+Kafka); and Tglog, a loosely coupled log collection system with a standardized, XML-defined protocol.

Tglog Overview

Tglog defines its log formats in XML and supports both UDP and TCP transport, which keeps overhead low while still offering reliable delivery for large-scale log ingestion.
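
To make the idea of an XML-defined log format concrete, the sketch below parses a small schema and serializes a record in the declared field order. The schema layout, element and attribute names, field names, and the pipe-delimited wire format are all hypothetical; the article does not show Tglog's real protocol.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical schema: one <struct> per log type, one <entry> per field.
SCHEMA_XML = """
<logs>
  <struct name="PlayerLogin">
    <entry name="event_time" type="datetime"/>
    <entry name="player_id" type="string"/>
    <entry name="level" type="int"/>
  </struct>
</logs>
"""


def load_schema(xml_text: str) -> Dict[str, List[str]]:
    """Map each log type to its ordered list of field names."""
    root = ET.fromstring(xml_text)
    return {
        struct.get("name"): [entry.get("name") for entry in struct.findall("entry")]
        for struct in root.findall("struct")
    }


def format_record(schema: Dict[str, List[str]], log_type: str, values: Dict[str, object]) -> str:
    """Serialize a record as 'LogType|v1|v2|...' in schema order."""
    fields = schema[log_type]
    missing = [f for f in fields if f not in values]
    if missing:
        raise ValueError(f"missing fields for {log_type}: {missing}")
    return "|".join([log_type] + [str(values[f]) for f in fields])


if __name__ == "__main__":
    schema = load_schema(SCHEMA_XML)
    record = {"event_time": "2015-01-01 12:00:00", "player_id": "p42", "level": 7}
    print(format_record(schema, "PlayerLogin", record))
```

Because producers and consumers both read the same schema file, a game team can add a log type without changing the collector, which is one way to get the low coupling the article mentions.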

DevOps Perspective

Why DevOps?

DevOps bridges development and operations throughout the software lifecycle, involving developers, ops, business, QA, executives, and partners to improve collaboration, reduce risk and waste, enhance quality, and accelerate delivery.

How to Implement

Adjust assessment and incentive mechanisms.

Bind development, testing, and operations to deliver value together.

Automate extensively to minimize manual intervention.

Provide training and foster collaborative activities.

Define new architectural standards.

Continuous integration, continuous delivery, and continuous deployment are the key entry points.

[Figure: DevOps panorama]

Core DevOps Concepts (CALMS)

Culture

Automation

Lean

Measurement

Sharing

Liu also distinguishes continuous delivery (the ability to deploy at any time) from continuous deployment (every change that passes the pipeline is deployed automatically).
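
The difference comes down to what happens after a change passes the pipeline. The sketch below models it as a single release gate; the `Mode` enum and the `approved` flag are illustrative names, not part of any particular CI tool.

```python
from enum import Enum


class Mode(Enum):
    CONTINUOUS_DELIVERY = "delivery"      # always releasable, a human decides when
    CONTINUOUS_DEPLOYMENT = "deployment"  # every passing change is released


def should_release(mode: Mode, pipeline_passed: bool, approved: bool = False) -> bool:
    """Decide whether a build goes to production."""
    if not pipeline_passed:
        return False  # a failing build is never released in either mode
    if mode is Mode.CONTINUOUS_DEPLOYMENT:
        return True   # no manual gate
    return approved   # continuous delivery waits for an explicit decision
```

Under continuous delivery a green build is only a release candidate until someone approves it; under continuous deployment the green build is the approval.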

Practical Experience

Key Tips

Communicate early with stakeholders to align goals and expectations.

Quantify metrics clearly and provide feedback channels.

Use SVN+Jenkins+Docker for CI/CD, and adopt the HECD architecture for a highly available Docker infrastructure.

Maintaining Competitiveness

Continuous learning and reflection are essential; allocate 1–2 hours daily for study and apply insights to daily work.

Patent Writing for Ops

Operations innovations—such as platform features, fast secure scans, disaster‑recovery transfer methods, automated testing ideas, or monitoring techniques—can be patented by documenting the method and highlighting novelty.

Q&A

Q1: Does Tglog use UDP for massive log transport?

A: Yes, primarily UDP, with TCP used for critical logs to avoid packet loss.
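
The trade-off in this answer can be sketched as a per-record transport choice: fire-and-forget UDP by default, and a TCP connection for records that must not be lost. The function names, the `critical` flag, and the newline framing over TCP are assumptions for illustration, not Tglog's actual client API.

```python
import socket
from typing import Tuple

Address = Tuple[str, int]


def choose_transport(critical: bool) -> int:
    """Return the socket type to use: TCP for critical logs, UDP otherwise."""
    return socket.SOCK_STREAM if critical else socket.SOCK_DGRAM


def send_log(line: str, critical: bool, collector: Address) -> None:
    """Send one log line to a collector (hypothetical address)."""
    payload = line.encode("utf-8")
    sock_type = choose_transport(critical)
    with socket.socket(socket.AF_INET, sock_type) as sock:
        if sock_type == socket.SOCK_STREAM:
            sock.connect(collector)
            sock.sendall(payload + b"\n")  # newline-delimited framing
        else:
            sock.sendto(payload, collector)  # no delivery guarantee
```

Opening a TCP connection per record is only for brevity; a real client would likely keep connections open and batch records.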

Q2: How do you enforce standardization when developers are busy?

A: Escalate the issue to leadership, emphasize the risk of post-release problems, and require pre-deployment assessments.

Q3: Fat vs. skinny containers in Docker?

A: No fixed rule; choose based on business scenario, aiming for maximum decoupling and treating containers as atomic scheduling units.
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
