An Overview of TarsBenchmark: A High‑Performance Microservice Load‑Testing Tool
TarsBenchmark is an open‑source, high‑performance microservice load‑testing tool from Tencent. It uses a multi‑process, event‑driven architecture with connection reuse and lock‑free monitoring, supports extensible protocols via JSON test cases, and offers both single‑machine and cloud‑distributed testing, outperforming traditional tools such as Apache Bench, Wrk, and JMeter.
TarsBenchmark is an open‑source benchmark tool for microservice RPC frameworks, created by Tencent and donated to the Linux Foundation. It provides online load‑testing capabilities that significantly lower the barrier for developers and testers to evaluate service performance.
1. Common Load‑Testing Tools
Typical tools include Apache Bench (single‑threaded, limited on multi‑core servers), Wrk (event‑driven, multi‑threaded, supports Lua scripts), GHZ (gRPC‑focused, written in Go), and JMeter (Java‑based, GUI, distributed but thread‑heavy). Each has strengths and weaknesses regarding concurrency, protocol support, and resource consumption.
2. What is TarsBenchmark?
TarsBenchmark is built on the Tars ecosystem and primarily targets Tars services, though it can also test non‑Tars protocols. It offers both a single‑machine mode and a cloud‑based web platform for distributed testing, enabling developers to assess service capacity and concurrency limits.
3. Problems Solved by TarsBenchmark
High performance: fully utilizes multi‑core CPUs (e.g., roughly 400,000 TPS on an 8‑core machine).
High scalability: supports any Tars interface and can be extended to custom protocols.
Ease of use: test cases are written in JSON, facilitating online cloud testing.
4. Design Principles
4.1 Multi‑process Architecture – The master process forks a number of worker processes equal to the physical CPU cores (configurable), ensuring isolation and full CPU utilization.
4.2 Event‑Driven Network Handling – Uses a timer‑based packet sender and non‑blocking sockets to avoid I/O blocking.
4.3 Connection Reuse – Implements a connection pool with reuse, allowing continuous packet emission without waiting for server responses, which eliminates the performance drop seen in tools like Apache Bench.
4.4 Multi‑dimensional Monitoring – Workers communicate with the master via lock‑free queues to report latency, error codes, and other metrics.
4.5 Protocol Extension – A protocol‑proxy factory pattern enables adding new protocols; Tars is supported by default, and custom protocols can be integrated by implementing the required interfaces.
4.6 Random Data Generation – Supports random payload generation to avoid replay of identical requests.
4.7 Automatic Test‑Case Generation – Provides tools to generate JSON‑based test cases from Tars IDL files.
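To make this concrete, a generated case might look roughly like the sketch below. The field names here are invented for illustration; consult a case actually generated by the tool for the real schema.

```json
{
  "servant": "TestApp.EchoServer.EchoObj",
  "rpcfunc": "echo",
  "para_input": { "msg": "hello" },
  "para_output": {}
}
```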
5. Distributed Load‑Testing Architecture
The platform consists of four components: a WebUI entry (integrated into TarsWeb), a CGI layer for test‑case management and permissions, an Admin service, and Node services that execute the actual load tests. Admin receives commands from CGI, distributes tasks to the Node services (each task runs as a thread inside a Node), and aggregates the results.
6. Protocol Conversion
Tars uses a TLV‑based binary protocol. TarsBenchmark parses the IDL description file to map tags and types, then converts JSON test cases into binary buffers for transmission. The reverse conversion restores JSON from binary responses.
7. Supporting Third‑Party Services
By implementing four functions (initialization, packet splitting, encoding, decoding), developers can extend TarsBenchmark to test any custom protocol, such as Kafka.
8. Usage
8.1 Code Structure – Four main modules: tools, services, common modules (protocol, network, monitoring), and resources. The common module includes protocol support (HTTP and Tars), event‑driven networking, and monitoring via lock‑free queues.
8.2 Compilation – Clone the repository, run CMake, and build three binaries: tb (the client tool), nodeserver, and adminserver. For cloud testing, install the admin and node services into TarsWeb.
8.3 Single‑Machine Tool Options
-c: number of connections.
-s: maximum QPS limit (auto‑detected if omitted).
-D / -P: target server IPs and ports (comma‑separated).
-n: number of worker processes.
-T: transport protocol (TCP/UDP; default TCP).
-p: name of a custom protocol.
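A hypothetical invocation combining the flags above (the address, port, and values are placeholders; check the tool's own help output for the authoritative flag set):

```sh
# Illustrative only: 8 worker processes, 100 connections, capped at 5000 QPS.
./tb -n 8 -c 100 -s 5000 -D 192.0.2.10 -P 8080
```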
During a test, TarsBenchmark reports latency distribution, success rate, and percentile metrics (P99, P90) periodically.
9. Cloud Distributed Testing
Through the TarsWeb UI, users can create JSON test cases, specify target IPs and desired QPS, and launch distributed tests that automatically allocate nodes and collect results.
10. Q&A Highlights
Key takeaways include the superiority of TarsBenchmark over Wrk and JMeter in terms of raw performance, protocol extensibility, and steady packet emission; its lightweight C++ implementation (≈3‑4 k lines of code); and its design based on C++11 standards.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.