Full-Link Pressure Testing Automation Practice for Bilibili's Live Streaming Gifting Business

Bilibili automated full‑link pressure testing for its high‑traffic live‑stream gifting service by adopting traffic co‑location with storage isolation, creating shadow tables, keys and topics, and building a three‑phase, three‑layer framework that analyses links, confirms configurations, and verifies end‑to‑end behavior across all services.

Bilibili Tech

Jul 26, 2022

This article details how Bilibili implemented full-link pressure testing for its live streaming gifting business, which is write-heavy, sees traffic spikes during major events, and has strict real-time data requirements. Traditional pressure testing could not accurately simulate production conditions because write scenarios were shielded or blacklisted during tests.

The article first compares three common industry approaches to full-link pressure testing: traffic co-location with storage isolation and online stress testing; data marking with logical isolation and online stress testing; and a mirror environment or offline testing. Bilibili chose the first approach because of its unified language stack, consistent infrastructure components, and mature service governance.

Bilibili's full-link pressure testing solution consists of three main components: traffic co-location (sharing resources with online clusters during low-traffic periods, using traffic marking to distinguish test traffic), online stress testing (through their pressure testing platform), and storage isolation (creating shadow tables for databases, shadow keys for Redis, and shadow topics for message queues).
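The storage isolation idea can be sketched as a small naming helper: marked traffic is routed to shadow resources, while normal traffic keeps using the production names. This is only an illustration under stated assumptions. The `shadow_` prefix, the `RequestContext` type, and the resource names are hypothetical and do not come from the article, which does not describe Bilibili's actual naming scheme.

```python
from dataclasses import dataclass

# Hypothetical convention: shadow resources share the production name plus a prefix.
SHADOW_PREFIX = "shadow_"


@dataclass(frozen=True)
class RequestContext:
    """Per-request context; `mirror` is True when the request carries the stress-test mark."""
    mirror: bool = False


def shadow_name(ctx: RequestContext, name: str) -> str:
    """Return the shadow resource name for marked traffic, the real name otherwise."""
    return f"{SHADOW_PREFIX}{name}" if ctx.mirror else name


# Thin wrappers so call sites state which kind of storage they touch.
def table_name(ctx: RequestContext, table: str) -> str:
    return shadow_name(ctx, table)


def redis_key(ctx: RequestContext, key: str) -> str:
    return shadow_name(ctx, key)


def topic_name(ctx: RequestContext, topic: str) -> str:
    return shadow_name(ctx, topic)
```

With this shape, a write issued by stress traffic lands in `shadow_gift_order` rather than `gift_order`, so test data never mixes with production data even though both kinds of traffic run on the same cluster.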

The core challenge was the breadth of changes that had to be tested: modifications spanned the revenue core services, the underlying middleware, the pressure testing SDK, the console, and the stress testing platforms. The authors designed an automated testing solution in three phases: first, ensuring basic capabilities by testing newly introduced nodes such as the mirror SDK and the pressure testing console; second, implementing full-link automation that covers business onboarding and end-to-end verification; and third, building platform and visualization support for future scaling.

The automated testing solution includes three main parts: link analysis (using trace tracking and static code scanning tools like biliconfigcheck lint to ensure context propagation), configuration confirmation (configuring pass-through, mirroring, write-discard, and mock rules for interfaces, databases, caches, and message queues), and automated verification (validating interface responses, storage operations, async business flows, and link completeness).
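The configuration confirmation step can be pictured as a rule table keyed by dependency, plus a check that every dependency found during link analysis has a confirmed rule. The sketch below is an assumption-laden illustration: the article names the four rule types but does not define their semantics or storage format, so the `Action` enum, the `RULES` entries, and all dependency names here are hypothetical.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "pass_through"    # call the real dependency unchanged
    MIRROR = "mirror"                # redirect to the shadow table, key, or topic
    WRITE_DISCARD = "write_discard"  # accept the write but drop it
    MOCK = "mock"                    # return a canned response


# Hypothetical rule table: (dependency kind, name) -> confirmed action.
RULES = {
    ("db", "gift_order"): Action.MIRROR,
    ("cache", "user_balance"): Action.MIRROR,
    ("mq", "gift_sent"): Action.MIRROR,
    ("mq", "push_notification"): Action.WRITE_DISCARD,
    ("http", "/payment/charge"): Action.MOCK,
    ("http", "/room/info"): Action.PASS_THROUGH,
}


def resolve_action(kind: str, name: str, is_stress_traffic: bool) -> Action:
    """Pick the action for one dependency call."""
    if not is_stress_traffic:
        return Action.PASS_THROUGH  # normal traffic is never rewritten
    rule = RULES.get((kind, name))
    if rule is None:
        # Fail loudly: an unconfirmed dependency must not receive stress writes.
        raise LookupError(f"no confirmed rule for {kind}:{name}")
    return rule


def unconfirmed(dependencies):
    """List the dependencies from link analysis that still lack a rule."""
    return [dep for dep in dependencies if dep not in RULES]
```

Raising on a missing rule is a design choice made for this sketch: it turns a configuration gap into a test failure instead of an accidental write to production storage.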

The automation framework was redesigned with three layers: a case layer for single-interface and scenario test orchestration, an invoker layer for request encapsulation and assertion management, and a coverage layer for test coverage statistics. Key modifications included adding a "mirror" identifier controlled by a global variable, implementing trace_toolset for link completeness checking, and adding pressure testing markers to HTTP/gRPC request headers.
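A minimal sketch of the invoker-layer changes, assuming a single global switch and a plain header marker, might look like the following. The header name `x-stress-test`, the span record shape, and the function names are hypothetical. The article does not describe the real interface of trace_toolset or the actual marker key, so this is not a reproduction of either.

```python
# Global "mirror" switch: when on, every request built by the invoker is marked.
MIRROR = False

# Hypothetical marker key; the real header or gRPC metadata name is not given.
MIRROR_HEADER = "x-stress-test"


def set_mirror(enabled: bool) -> None:
    """Flip the global switch so the same cases run as normal or stress traffic."""
    global MIRROR
    MIRROR = enabled


def build_headers(base=None) -> dict:
    """Copy the caller's headers and add the stress-test marker when mirroring."""
    headers = dict(base or {})
    if MIRROR:
        headers[MIRROR_HEADER] = "1"
    return headers


def missing_services(spans, expected) -> list:
    """Link completeness: expected services with no span that carried the marker.

    `spans` is a list of dicts such as {"service": "gift", "mirror": True},
    a simplified stand-in for whatever the tracing system records.
    """
    marked = {span["service"] for span in spans if span.get("mirror")}
    return sorted(set(expected) - marked)
```

An empty result from `missing_services` suggests the marker propagated through every expected hop. A non-empty result points at a service where the context was dropped, which is the failure mode that link analysis is meant to catch.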
