How to Slash Cloud Costs: Real Lessons from Tencent’s Video Platforms

This article examines Tencent's cost‑optimization journey for its short‑video and streaming services, breaking down human, server, and bandwidth expenses, explaining precise accounting methods, negotiation tactics, usage‑vs‑efficiency strategies, and global resource‑scheduling techniques to achieve sustainable cost reduction.

Tech Architecture Stories

Jun 10, 2024

Overview

This article covers the cost-optimization work done for Weishi and Tencent Video between Q3 2021 and early 2023, and builds on an earlier internal sharing session.

Rapid Traffic Growth

The mobile internet benefited from population growth, which created large pools of social traffic. Applications such as the App Store aggregated traffic from WeChat, QQ, and games, and turned it into profit through ads, games, and other commercial models. Weishi's content ecosystem, however, lacked stickiness, which limited profit despite high DAU (daily active users) and VV (video views).

Where Costs Come From

Human resources are the biggest cost driver; more staff leads to higher salaries, office costs, and more code, which in turn requires more servers and bandwidth.

Server costs cover CPU, memory, disk, electricity, and data-center maintenance. They are often billed per CPU core (e.g., 20 CNY per core per month), with that single price covering all of the associated expenses.

Bandwidth costs—streaming, static, live, P2P, PCDN—also form a major expense for content‑heavy services.

Accounting the Costs

Each resource cost follows the formula: usage × unit price. Understanding usage measurement rules (e.g., CDN billed on peak bandwidth) is essential for targeted optimization.
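As a minimal sketch of the formula, the snippet below prices servers per core and CDN traffic on peak bandwidth. The 20 CNY per core per month figure is the article's own example; the CDN unit price, the function names, and the sample values are hypothetical, and real CDN contracts may define "peak" differently (for example, as a percentile).

```python
# The per-core price is the article's example; the CDN price is a made-up
# illustrative value, not a real tariff.
CORE_PRICE_CNY_PER_MONTH = 20.0
CDN_PRICE_CNY_PER_GBPS_MONTH = 10_000.0  # hypothetical unit price


def server_cost(cores: int, unit_price: float = CORE_PRICE_CNY_PER_MONTH) -> float:
    """Monthly server cost: usage (cores) x unit price."""
    return cores * unit_price


def cdn_cost_peak(samples_gbps: list[float],
                  unit_price: float = CDN_PRICE_CNY_PER_GBPS_MONTH) -> float:
    """Monthly CDN cost when usage is measured as the highest bandwidth sample."""
    if not samples_gbps:
        return 0.0
    return max(samples_gbps) * unit_price


if __name__ == "__main__":
    print(server_cost(1000))  # 1000 cores at 20 CNY each

    # Both traffic profiles carry the same total volume, but the flatter one
    # has half the peak, so it is billed half as much under peak billing.
    spiky = [2.0, 3.0, 10.0, 3.0]
    flat = [4.0, 5.0, 5.0, 4.0]
    print(cdn_cost_peak(spiky), cdn_cost_peak(flat))
```

The second comparison is why the measurement rule matters: under peak billing, smoothing traffic lowers the bill even when total traffic is unchanged.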

Costs must be allocated to each team according to its own resource consumption, so that every team is accountable for the expenses it generates.
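One simple way to do this allocation is to split a shared bill in proportion to each team's measured usage. The sketch below assumes usage is tracked in CPU cores; the team names and numbers are hypothetical, and the article does not describe the exact allocation rule used at Tencent.

```python
def allocate_cost(total_cost: float, usage_by_team: dict[str, float]) -> dict[str, float]:
    """Split total_cost across teams in proportion to their resource usage."""
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        team: total_cost * usage / total_usage
        for team, usage in usage_by_team.items()
    }


if __name__ == "__main__":
    # Hypothetical teams sharing a 20,000 CNY monthly server bill.
    usage_cores = {"feed": 600, "search": 300, "ads": 100}
    for team, cost in allocate_cost(20_000.0, usage_cores).items():
        print(f"{team}: {cost:.0f} CNY")
```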

Negotiating Prices and Reducing Usage

While consumers lack direct pricing power, they can negotiate, especially for bandwidth. Reducing usage is a technical challenge: identify actual CPU/memory consumption of legacy code and cut waste.
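Finding waste usually starts by comparing what a service has reserved with what it actually uses at peak. The following is a minimal sketch under assumed inputs: the service names, the core counts, and the 1.5x safety headroom are all hypothetical, and in practice the peak figures would come from a monitoring system.

```python
import math


def reclaimable_cores(services: list[tuple[str, int, float]],
                      headroom: float = 1.5) -> list[tuple[str, int]]:
    """Return (service, cores that could be released), largest first.

    Each input tuple is (name, requested_cores, peak_used_cores).
    A service keeps peak usage times `headroom` as a safety margin.
    """
    report = []
    for name, requested, peak_used in services:
        needed = math.ceil(peak_used * headroom)
        if requested > needed:
            report.append((name, requested - needed))
    return sorted(report, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    fleet = [
        ("legacy-feed", 64, 10.0),  # reserved far more than it uses
        ("player-api", 32, 28.0),   # already tightly sized
    ]
    print(reclaimable_cores(fleet))
```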

Usage vs. Efficiency

Usage reduction means lowering the amount of resources used (e.g., halving CPU cores). Efficiency improvement means increasing the work done per unit of resource (e.g., raising CPU utilization from 25% to 45%).
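The utilization figures above translate directly into core counts. The sketch below assumes the workload stays constant and that utilization scales linearly with the number of cores, which is a simplification; the 1,000-core fleet size is a hypothetical starting point.

```python
import math


def cores_at_target_utilization(current_cores: int,
                                current_util: float,
                                target_util: float) -> int:
    """Cores needed to carry the same load at a higher average utilization."""
    if not (0 < current_util <= 1 and 0 < target_util <= 1):
        raise ValueError("utilization must be in the range (0, 1]")
    busy_cores = current_cores * current_util  # cores' worth of real work
    return math.ceil(busy_cores / target_util)


if __name__ == "__main__":
    before = 1000
    after = cores_at_target_utilization(before, 0.25, 0.45)
    saved = 1 - after / before
    print(f"{before} cores -> {after} cores ({saved:.1%} fewer)")
```

Under these assumptions, moving from 25% to 45% utilization does the same work with a little over half the cores, which is how an efficiency gain shows up as a usage reduction.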

Examples

CPU: Adjust microservice replica counts to raise peak utilization.

Redis: Reduce allocated memory from 100 GB to the 20 GB actually needed, raising memory utilization and cutting costs, provided the hit rate is maintained.

Hit rate: proportion of requests served from memory versus persistent storage.

Storage density: access frequency per GB, indicating whether data should reside in memory or cheaper storage.
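The two metrics defined above can be computed as follows. This is a sketch only: the function names, the density threshold used to choose a tier, and the sample numbers are assumptions for illustration, not values from the article.

```python
def hit_rate(memory_hits: int, total_requests: int) -> float:
    """Share of requests served from memory rather than persistent storage."""
    if total_requests == 0:
        return 0.0
    return memory_hits / total_requests


def storage_density(accesses_per_day: float, size_gb: float) -> float:
    """Access frequency per GB of stored data."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return accesses_per_day / size_gb


def choose_tier(accesses_per_day: float, size_gb: float,
                threshold: float = 1000.0) -> str:
    """Keep dense (hot) data in memory; move sparse (cold) data to disk.

    The threshold is a hypothetical cut-off in accesses per GB per day.
    """
    if storage_density(accesses_per_day, size_gb) >= threshold:
        return "memory"
    return "disk"


if __name__ == "__main__":
    print(hit_rate(950, 1000))
    print(choose_tier(accesses_per_day=5_000_000, size_gb=20))  # hot data set
    print(choose_tier(accesses_per_day=1_000, size_gb=80))      # cold data set
```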

Global Efficiency Optimization

Shift offline workloads (e.g., video transcoding) to off-peak periods, use the Kubernetes HPA (Horizontal Pod Autoscaler) to support mixed online/offline scheduling, and employ algorithmic models (linear regression, LightGBM) to predict resource needs and move toward a globally optimal resource allocation.
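The idea of moving offline work into off-peak periods can be illustrated with a small greedy planner. This is not the HPA or a prediction model, just a plain heuristic under assumed inputs: the online load per period, the fleet capacity, and the size of the offline job are hypothetical numbers.

```python
def schedule_offline(online_cores_by_hour: list[float],
                     capacity: float,
                     offline_core_hours: float) -> list[float]:
    """Greedily place offline core-hours into the periods with the most idle capacity."""
    spare = [max(capacity - used, 0.0) for used in online_cores_by_hour]
    plan = [0.0] * len(spare)
    remaining = offline_core_hours
    # Visit periods from most idle to least idle.
    for hour in sorted(range(len(spare)), key=lambda h: spare[h], reverse=True):
        if remaining <= 0:
            break
        take = min(spare[hour], remaining)
        plan[hour] = take
        remaining -= take
    if remaining > 1e-9:
        raise ValueError("not enough spare capacity for the offline workload")
    return plan


if __name__ == "__main__":
    online = [30.0, 20.0, 80.0, 95.0]  # cores used by online traffic per period
    plan = schedule_offline(online, capacity=100.0, offline_core_hours=120.0)
    print(plan)
    # The combined load never exceeds the existing capacity,
    # so the offline job needs no additional servers.
    print(max(o + p for o, p in zip(online, plan)))
```

In a real system the online load would be a forecast rather than a known list, which is where the prediction models mentioned above would come in.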

Everyone Is a Small CEO

Cost reduction and efficiency gains are now measured by ROC (return on capital) rather than ROI (return on investment), which puts the emphasis on overall capital efficiency.

Programmers must balance performance improvements with physical hardware costs, avoiding wasteful over‑provisioning and treating resource usage as a core business metric.
