How Havok Enables Realistic Full‑Link Load Testing for Scalable Services

This article explains the background, design, and core components of Havok—a full‑link load‑testing platform that replays production logs, supports traffic scaling, mock services, real‑time monitoring, isolation, and circuit‑breaker protection—to help enterprises evaluate capacity and improve reliability without polluting live data.

dbaplus Community

Jun 15, 2022

Background

Rapid growth in transaction volume caused occasional production failures and raised three questions: why do failures still occur after extensive testing, can the system handle upcoming promotional traffic, and how can IT costs be reduced without sacrificing performance? A common industry answer is full‑link load testing based on replaying real production traffic.

Solution Overview

Traditional online load testing builds large test data sets and injects traffic through a single Nginx instance. It suffers from time‑consuming data preparation, dirty data that pollutes production databases, manually crafted test models, narrow coverage, and an inability to exercise infrastructure components (SLB, Nginx, network, databases). Havok was designed to overcome these limitations with five core capabilities:

Realistic replay of user behavior without contaminating production data.

Rate‑based and multiplier‑based traffic amplification for capacity probing.

Instant “out‑of‑the‑box” testing without pre‑building data.

Support for HTTP, internal RPC, and mobile‑specific protocols.

Real‑time monitoring and automatic overload protection.

System Architecture

Havok replays production service logs, preserving both read and write operations and using the recorded timestamps to pace requests. The architecture consists of five main services:

Havok‑dispatcher (Scheduling Center) : downloads, sorts, filters, and dispatches logs; applies gain (amplification) rules; collects engine metrics.

Havok‑replayer (Load Engine) : replays dispatched requests, supports gain adjustments and rule‑based request modification.

Havok‑monitor (Monitoring Platform) : aggregates metrics from the load engine, services, and middleware, and visualizes them.

Havok‑mock (Mock Service) : provides mock endpoints with configurable latency jitter.

Havok‑canal (Data Construction) : incremental data cleaning and offset handling based on Alibaba Canal.

Main Module Functions

1. Scheduling Center

Extracts logs from multiple sources, applies dimension filtering, preserves order, and dispatches requests with configurable gain. Example: an order API POST /api/order with varying merchant and dish IDs is automatically reconstructed from logs, eliminating manual scenario construction.

2. Load Engine

Deployed as distributed containers for rapid scaling.

Uses Go goroutines for asynchronous request handling, which gives cheap context switches, a small memory footprint (each goroutine starts with a stack of about 2 KB), and scheduling through Go's G‑M‑P model.

Supports request/response field filtering, custom assertions, and rule‑based data offset.

Collects interface‑level statistics (error rate, QPS, P95, etc.) and reports them to the dispatcher.

Handles start/stop/flow‑control commands from the dispatcher.

3. Data Construction (Havok‑canal)

Built on Alibaba Canal for incremental synchronization. Sensitive fields (phone numbers, IDs) are deterministically transformed (prefixes, random strings, UUIDs) to create shadow data. This enables on‑demand testing without lengthy data‑generation phases.
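A deterministic field transformation of the kind described here might look like the sketch below. The function names, the `shadow_` prefix, and the salted SHA-256 token are assumptions made for illustration; the source does not specify which rules Havok-canal applies to which fields.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// PrefixOffset marks a value as shadow data by prepending a fixed prefix.
func PrefixOffset(prefix, v string) string {
	return prefix + v
}

// ReverseOffset reverses the value rune by rune.
func ReverseOffset(v string) string {
	r := []rune(v)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// HashOffset derives a stable 16-hex-character token from a salt and the
// value, so the same input always maps to the same shadow value.
// (Assumed technique, not taken from the source.)
func HashOffset(salt, v string) string {
	sum := sha256.Sum256([]byte(salt + ":" + v))
	return hex.EncodeToString(sum[:8])
}

func main() {
	phone := "13800000000" // made-up sample value
	fmt.Println(PrefixOffset("shadow_", phone))
	fmt.Println(ReverseOffset(phone))
	fmt.Println(HashOffset("demo-salt", phone))
}
```

Because every rule is a pure function of its input, records that reference the same phone number or ID still join correctly after transformation.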

4. Mock Third‑Party Services

DeepMock (https://github.com/wosai/deepmock) stands in for third‑party services during a test. It injects latency jitter and applies statistical adjustments so that mock responses behave more like the real services do in production.

5. Load Monitoring

Pressure‑side monitoring aggregates per‑interface metrics (error rate, QPS, P90/P95 latency, average latency) and pushes them to the dispatcher. Service‑side monitoring relies on existing cloud observability tools for middleware and infrastructure metrics.

6. Load Isolation

Each request carries a key:value identifier stored in the request context. Downstream services propagate this context, allowing selective handling, isolation, or routing to shadow tables without affecting real users.

7. Data Isolation

Different storage back‑ends use tailored strategies:

MySQL / MongoDB : shadow tables with offset rules (prefixes, random strings, UUIDs, reversal).

Redis : key offset; keys are removed after testing.

Kafka / MQ : either discard during test or pass through with tags for consumer‑side handling.

Other stores (e.g., Elasticsearch) : dedicated test clusters.
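The per-store strategies above could be expressed as small routing helpers like the ones below. This is a sketch under assumptions: the `_shadow` suffix, the `loadtest:` key prefix, the `load-test` message header, and the `Message` type are illustrative conventions, not Havok's actual naming.

```go
package main

import "fmt"

// Hypothetical naming conventions for shadow resources.
const (
	shadowTableSuffix = "_shadow"
	shadowKeyPrefix   = "loadtest:"
	loadTestHeader    = "load-test"
)

// TableName routes load-test traffic to a shadow table.
func TableName(base string, loadTest bool) string {
	if loadTest {
		return base + shadowTableSuffix
	}
	return base
}

// RedisKey offsets keys written during a test so they can be found and
// removed afterwards.
func RedisKey(key string, loadTest bool) string {
	if loadTest {
		return shadowKeyPrefix + key
	}
	return key
}

// Message is a simplified stand-in for a Kafka or MQ message.
type Message struct {
	Topic   string
	Headers map[string]string
	Body    []byte
}

// PrepareMessage applies one of the two MQ strategies. It returns
// false when the message should be discarded; otherwise load-test
// messages are returned with a tag for consumer-side handling.
func PrepareMessage(m Message, loadTest, discard bool) (Message, bool) {
	if !loadTest {
		return m, true
	}
	if discard {
		return Message{}, false
	}
	tagged := Message{Topic: m.Topic, Body: m.Body, Headers: map[string]string{}}
	for k, v := range m.Headers {
		tagged.Headers[k] = v // copy so the caller's map is not mutated
	}
	tagged.Headers[loadTestHeader] = "true"
	return tagged, true
}

func main() {
	fmt.Println(TableName("orders", true), RedisKey("order:42", true))
	msg, ok := PrepareMessage(Message{Topic: "order-created"}, true, false)
	fmt.Println(ok, msg.Headers)
}
```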

8. Circuit‑Breaker Protection

Pressure‑side : Havok monitors metrics against configurable thresholds and automatically reduces QPS or stops the test.

Service‑side : Built‑in circuit breakers in middleware trigger on error‑rate thresholds.

Implementation and Open Source

Core business lines (store‑code payment, scan‑code payment, mini‑program payment) have been integrated with Havok. The project is open‑sourced at https://github.com/wosai/havok, inviting community contributions.

Summary & Outlook

Havok progressed from design to production with cross‑team collaboration. Future work includes improving usability through visual tooling, simplifying developer operations, and extending capacity‑planning and SLA‑building capabilities such as cost optimization and chaos‑testing integration.
