Scaling Automated API Testing for Millions of Microservices

This article outlines the background, testing strategy, and practical implementation of automated API testing within a large-scale microservice environment, detailing the shift from traditional test pyramids to a honeycomb model, technology choices, test case design, mock servers, platform management, and measures to prevent test suite decay.

SQB Blog

1. Background

ShouQianBa serves millions of merchants through a massive microservice architecture: hundreds of backend services written in Node.js, Java, Go, and Python, backed by MySQL, MongoDB, Elasticsearch, Kafka, Redis, Apollo, RabbitMQ, and more. As product complexity grew, traditional functional testing became costly and inefficient, prompting the adoption of automated testing to surface deep issues earlier and shorten fix time.

2. Testing Strategy

Automated testing is a generic term that includes unit testing, API testing, web testing, etc. In a microservice architecture, instead of the traditional test pyramid, we favor a honeycomb layered model.

Honeycomb model diagram

Reasons:

In microservice projects, a service is a "unit"; interfaces expose unit capabilities and enable communication; orchestrating interface calls implements business logic.

Unit tests require a large volume of code that developers struggle to maintain, and they do not cover integration or cross-service business scenarios, so their return on investment is limited.

When interfaces are defined early, test engineers can design and develop API test cases early, achieving left‑shift testing and earlier issue detection.

API automation testing offers early involvement, low maintenance cost, and comprehensive business logic coverage, making it our primary focus.

2.1 Refined Testing Strategy for Microservices

Beyond emphasizing API testing, we further refine the automation strategy according to the layered characteristics of microservices.

System architecture overview:

System architecture diagram

Access Layer : Front‑end entry point (e.g., API gateway) handling authentication, validation, response packaging, and routing without business logic.

Application Layer : Business services that orchestrate domain services to implement functionality (e.g., merchant onboarding).

Domain Layer : Domain objects with high cohesion and low coupling, implementing business rules (e.g., payment, card, settlement services).

Infrastructure Layer : Databases, caches, message queues, etc.

We simplify the view to the access perspective, as shown in the following diagram:

Simplified access view diagram

Based on each layer’s responsibilities, we define testing focus:

Access Layer: interface authentication, validation, input/output legality, connectivity, routing correctness.

Application Layer: functional coverage of each interface and end‑to‑end business flow through integrated interface testing.

Domain Layer: business rule, algorithm, and third‑party interaction coverage.

3. Automated Testing Practice

With a clear strategy, we develop automated test cases.

3.1 Technology Stack

We use Python with the built-in unittest framework and pytest as the runner. The surrounding test framework provides:

Test data object management and relationship handling

Data‑driven testing support

Multi‑dimensional test case sorting for organizing test plans

Multi‑environment execution

Middleware connection pooling

Automated report integration

Platform integration
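As one illustration of the data-driven support, here is a minimal sketch using unittest's subTest to run one case body over a table of inputs. The `login` function, credentials, and response codes are stand-ins invented for this example, not the framework's actual API:

```python
import unittest

SUCC, FAIL = 200, 400  # hypothetical response codes

def login(username, password):
    # Stand-in for a real API client call.
    return SUCC if (username, password) == ("alice", "s3cret") else FAIL

class TestLoginDataDriven(unittest.TestCase):
    cases = [
        ("alice", "s3cret", SUCC),  # correct credentials
        ("alice", "wrong",  FAIL),  # wrong password
        ("",      "s3cret", FAIL),  # missing username
    ]

    def test_login(self):
        for username, password, expected in self.cases:
            with self.subTest(username=username, password=password):
                self.assertEqual(login(username, password), expected)
```

Adding a new input combination then means adding one row to the table, not one test method.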

3.2 API Functional Testing

Key aspects include parameter validation, permission checks, response verification, persistence validation, middleware validation, and logic branch coverage. Effective equivalence class partitioning keeps the case count manageable. Example: given an implementation in which only a matching username/password pair succeeds, the login interface can be covered by just two cases, correct credentials and incorrect credentials.

def test_login_succ(self):
    res = self.client.login(username, password)
    self.assertEqual(res.code, SUCC)

def test_login_fail(self):
    res = self.client.login(username, wrong_password)
    self.assertEqual(res.code, FAIL)

3.3 API Integration Testing

Beyond single‑interface tests, we verify interactions and data flow across services, covering normal business flows, exception flows, call‑chain coverage, and sequence coverage. Example scenario: merchant enrollment, activity registration, payment source onboarding, rate discount, and successful payment.

def test_new_merchant_pay_success(self):
    res = self.merchant_service.create_merchant()
    self.assertEqual(res.code, SUCC)

    another_res = self.another_service.do_something(res.field)
    # assertTrue's second argument is only a failure message, so the original
    # assertion could never fail; assertEqual performs the intended comparison.
    self.assertEqual(another_res.status, COMPLETE)

3.4 Asset‑Loss Testing

Financial operations (payment, settlement, profit sharing, etc.) require tests to prevent monetary loss due to duplicate submissions, concurrency issues, logic errors, or security flaws. Three scenarios are illustrated:

3.4.1 Duplicate Calls

Idempotent handling ensures only one deduction despite repeated requests.

The test sends multiple identical payment requests and asserts that the balance change reflects a single successful payment.

def test_pay_more_time(self):
    old_balance = self.client.get_merchant_balance(merchant_id)  # balance before
    for _ in range(5):                                           # five identical requests
        self.client.pay(amount, client_sn)                       # same idempotency key
    current_balance = self.client.get_merchant_balance(merchant_id)
    assert current_balance == old_balance + amount               # only one payment takes effect

3.4.2 Concurrent Requests to the Same Interface

Concurrent identical payment requests must result in a single deduction.

def test_pay_concurrency(self):
    old_balance = self.client.get_merchant_balance(merchant_id)
    pool = [threading.Thread(target=self.client.pay, args=(1, client_sn))
            for _ in range(10)]                                  # ten identical requests
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    current_balance = self.client.get_merchant_balance(merchant_id)
    assert current_balance == old_balance + 1                    # exactly one payment credited

3.4.3 Concurrent Calls to Multiple Financial Interfaces

Simultaneous add, reduce, and refund operations on the same account must preserve balance consistency.

def test_account(self):
    old_balance = self.client.get_merchant_balance(merchant_id)
    concurrent_add(self.client, add_amount, a)        # a concurrent additions
    concurrent_reduce(self.client, reduce_amount, b)  # b concurrent deductions
    concurrent_refund(self.client, refund_amount, c)  # c concurrent refunds
    current_balance = self.client.get_merchant_balance(merchant_id)
    assert current_balance == old_balance + a * add_amount - b * reduce_amount - c * refund_amount

3.5 Mocking Third‑Party Interfaces

We built a Go‑based Mock Server to simulate third‑party responses, enabling testing of failure scenarios such as account errors or payment refusals.

@mock("RETURN_CODE", "ACCOUNT_ERROR")
def test_pay_abnormal():
    response = self.client.pay()
    self.assertEqual(response.code, FAIL)
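The `@mock` decorator itself is not shown in the source. A minimal sketch of the pattern is below: set a mock rule for the duration of one test, then restore the previous state. The real rules live in the Go Mock Server and would be configured over HTTP; the in-process `MOCK_RULES` dict here is a stand-in invented for this example:

```python
import functools

# Stand-in for the Mock Server's rule store (the real one is remote).
MOCK_RULES = {}

def mock(key, value):
    """Apply a mock rule around one test, restoring the prior rule afterwards."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            previous = MOCK_RULES.get(key)
            MOCK_RULES[key] = value
            try:
                return fn(*args, **kwargs)
            finally:
                if previous is None:
                    MOCK_RULES.pop(key, None)   # rule did not exist before
                else:
                    MOCK_RULES[key] = previous  # restore the earlier rule
        return wrapper
    return decorator

@mock("RETURN_CODE", "ACCOUNT_ERROR")
def fake_test():
    return MOCK_RULES["RETURN_CODE"]  # the "third party" now reports a failure
```

The restore-in-`finally` step is what keeps failure-scenario tests from leaking a broken third party into every test that runs after them.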

4. Platform Management

4.1 Automated Execution Platform

We developed Zepar, a generic platform for test case presentation, plan management, and automated/manual execution, improving test case reuse and cross‑team accessibility.

Zepar platform

Test plan management

Test plan UI

Execution records

4.2 Reporting Platform

We integrated the open‑source ReportPortal to collect logs, analyze results, and visualize metrics across frameworks such as TestNG, Pytest, JUnit, etc., helping identify flaky tests and accelerate remediation.
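Hooking pytest runs into ReportPortal is typically a small configuration change via the pytest-reportportal plugin; a sketch is below, with placeholder endpoint, project, and token values. Key names can vary across plugin versions, so treat this as the general shape rather than exact syntax:

```ini
; pytest.ini (keys per the pytest-reportportal plugin; names may vary by version)
[pytest]
rp_endpoint = http://reportportal.example.internal:8080
rp_project = api-tests
rp_api_key = <your-token>
rp_launch = nightly-api-regression
```

With this in place, each pytest run is reported as a launch that ReportPortal can aggregate, trend, and flag for flakiness.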

ReportPortal dashboard
ReportPortal analysis

5. Anti‑Decay Measures

To keep the suite from decaying, we enforce PEP 8 style, the AIR principles (Automatic, Independent, Repeatable), comprehensive docstrings, and code review; we use decorators to hide technical plumbing and encourage Pythonic code. Every test engineer is responsible for keeping their own automation assets usable.
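As an example of using a decorator to hide technical plumbing, a case author can request a fresh test merchant without seeing the setup and teardown. The merchant helpers here are hypothetical stand-ins, not the real framework's API:

```python
import functools

def with_test_merchant(fn):
    """Create a disposable merchant for the test and clean it up afterwards."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        merchant_id = self.create_merchant()      # setup plumbing hidden here
        try:
            return fn(self, merchant_id, *args, **kwargs)
        finally:
            self.delete_merchant(merchant_id)     # cleanup even if the test fails

    return wrapper

class MerchantTests:
    # Minimal stand-ins so the sketch is self-contained.
    def create_merchant(self):
        self.created = "m-001"
        return self.created

    def delete_merchant(self, merchant_id):
        self.deleted = merchant_id

    @with_test_merchant
    def test_query(self, merchant_id):
        return merchant_id  # the case body only sees a ready-to-use merchant
```

Guaranteed cleanup in `finally` is also what keeps the cases Independent: no test inherits leftover merchants from an earlier failure.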

6. Summary and Outlook

Our automated API test suite now exceeds 30,000 cases with over 95% availability and more than 70% coverage of core services. As test volume and asynchronous scenarios grow, we are developing a distributed execution framework to sustain performance.

Automated testing has become a vital asset for accelerating delivery while ensuring quality, and we will continue to refine and expand our capabilities.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
