
Improving Testability in a High‑Throughput Push Notification System

This article describes how a high‑throughput push notification system was enhanced with a dedicated testability metric, introducing tools such as a test‑sending UI, dual synchronous‑asynchronous API handling, a reachability diagnostic utility, and end‑to‑end unique‑ID logging to streamline debugging and reduce maintenance costs.

Zhuanzhuan Tech

1. Background

The push notification system was originally designed around three core metrics: throughput (messages per unit time), scalability (ease of supporting more apps, strategies, and channels), and performance (API latency). To improve the developer experience as well, a fourth metric, testability, was added to reduce testing effort and maintenance cost.

2. Problems and Solution Ideas

2.1 Testing Problems

Lack of a test‑sending tool.

Asynchronous API hides result and failure information from callers.

Unstable test environment and many processing nodes make issue localisation difficult.

Large production data volume increases debugging cost.

Business teams need to quickly distinguish whether a failure originates from API misuse, environment instability, or rule restrictions, and then locate the root cause.

2.2 Solution Ideas

Focus on the testing stage as the primary problem scenario.

Provide synchronous responses in the test environment because performance is not critical there.

Keep asynchronous execution for production, but allow synchronous handling for lightweight parameter validation.

Add internal data‑query APIs that can automatically analyse and pinpoint issues.

Introduce a unique ID that threads through all log entries, enabling end‑to‑end traceability.

3. Testability Design

3.1 Test Sending Tool

A UI‑based test‑sending tool wraps the API, allowing users to send test messages and preview results on the app, simplifying the verification process.

3.2 Dual Synchronous/Asynchronous API Execution

The API adopts a "synchronous basic‑parameter validation + synchronous/asynchronous logic processing" model. Basic validation is fast and handled synchronously, giving immediate feedback. The core business logic runs synchronously in the test environment but remains asynchronous in production to preserve performance.

Instead of exposing a switch to callers, the system internally decides the execution mode to avoid misuse that could degrade production performance.

Response payloads also include diagnostic information such as the execution machine IP and node name, helping operators locate problems quickly.
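The model described in this section can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the names PushApi, validate, and deliver, the request fields, and the status values are all assumptions. The environment is fixed when the server builds the handler, so callers have no switch to flip, and the sketch reports the hostname where the article mentions the machine IP.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def validate(request: Dict) -> List[str]:
    """Cheap basic-parameter checks, always run synchronously."""
    errors = []
    if not request.get("uid") and not request.get("token"):
        errors.append("uid or token is required")
    if not request.get("push_code"):
        errors.append("push_code is required")
    return errors


def deliver(request: Dict) -> Dict:
    """Hypothetical stand-in for the core logic (rules, channel choice, sending)."""
    return {"delivered": True, "push_code": request["push_code"]}


class PushApi:
    def __init__(self, env: str, node_name: str,
                 core: Callable[[Dict], Dict] = deliver):
        # env comes from server-side configuration, never from the request.
        self.env = env
        self.core = core
        self.diagnostics = {"host": socket.gethostname(), "node": node_name}
        self.pool = ThreadPoolExecutor(max_workers=4)

    def handle(self, request: Dict) -> Dict:
        errors = validate(request)
        if errors:
            # Immediate feedback in every environment.
            return {"status": "rejected", "errors": errors,
                    "diagnostics": self.diagnostics}
        if self.env == "test":
            # Test environment: run the core logic inline and return its result.
            return {"status": "done", "result": self.core(request),
                    "diagnostics": self.diagnostics}
        # Production: hand the work off and return at once.
        self.pool.submit(self.core, request)
        return {"status": "accepted", "diagnostics": self.diagnostics}
```

In this sketch a caller in the test environment sees the full result in the response, while a production caller only learns that the request passed validation and was queued.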

3.3 Reachability Diagnosis Tool

The diagnosis tool lets users input a UID or token and a PushCode to query the system. It returns channel information, rule restrictions, and ultimately whether the message can be sent, helping pinpoint environment‑ or rule‑related issues.
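A query of this kind might look like the sketch below. The in-memory tables, the daily-limit rule, and the names diagnose and Diagnosis are hypothetical stand-ins for the real channel registry and rule engine, which the article does not describe in detail.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical data; a real tool would query the channel and rule services.
CHANNELS: Dict[str, Dict] = {
    "u1": {"channel": "apns", "token_valid": True},
    "u2": {"channel": "fcm", "token_valid": False},
}
DAILY_LIMIT: Dict[str, int] = {"PROMO_01": 3}
SENT_TODAY: Dict[tuple, int] = {("u1", "PROMO_01"): 3}


@dataclass
class Diagnosis:
    channel: Optional[Dict]   # channel information, if any is registered
    restrictions: List[str]   # every reason the message would be blocked
    can_send: bool            # final verdict


def diagnose(uid: str, push_code: str) -> Diagnosis:
    restrictions = []
    channel = CHANNELS.get(uid)
    if channel is None:
        restrictions.append("no channel registered for this user")
    elif not channel["token_valid"]:
        restrictions.append("device token is invalid or expired")

    limit = DAILY_LIMIT.get(push_code)
    if limit is not None and SENT_TODAY.get((uid, push_code), 0) >= limit:
        restrictions.append(f"daily limit of {limit} reached for {push_code}")

    # Collect all restrictions instead of stopping at the first one,
    # so the user sees every blocker in a single query.
    return Diagnosis(channel, restrictions, can_send=not restrictions)
```

Returning the full list of restrictions alongside the verdict lets a business team tell a rule restriction apart from an environment problem such as a missing channel.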

3.4 End‑to‑End Unique ID and Log Tracing

Each message is assigned a globally unique ID at the source. This ID propagates through every processing node, and critical nodes emit trace logs (including embedded analytics data) that carry it. Searching for the ID then retrieves one message's complete log chain, even within very high traffic.
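The idea can be sketched with Python's standard logging and uuid modules. The node names, the log line format, and the helper names trace, process, and log_chain are assumptions made for illustration; a real pipeline would pass the ID between services rather than between function calls.

```python
import logging
import uuid
from typing import Dict, List


def new_message_id() -> str:
    """Assign the ID once, at the point where the message enters the system."""
    return uuid.uuid4().hex


def trace(logger: logging.Logger, msg_id: str, node: str, event: str, **data) -> None:
    # Every trace line carries the same msg_id so the chain can be rebuilt later.
    logger.info("msg_id=%s node=%s event=%s data=%s", msg_id, node, event, data)


def process(logger: logging.Logger, payload: Dict) -> str:
    """Simulate a message passing through three hypothetical nodes."""
    msg_id = new_message_id()
    trace(logger, msg_id, "api", "received", push_code=payload["push_code"])
    trace(logger, msg_id, "rule-engine", "passed")
    trace(logger, msg_id, "channel-sender", "sent", channel="apns")
    return msg_id


def log_chain(log_text: str, msg_id: str) -> List[str]:
    """Pull one message's lines out of a log that interleaves many messages."""
    return [line for line in log_text.splitlines() if f"msg_id={msg_id}" in line]
```

Because the ID is the only key needed, the same filter works whether the log lines come from one file or are collected from several nodes.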

4. Conclusion

The testability designs presented here are not universal solutions. They came from analysing the specific problems of this push system, and other systems need their own case‑by‑case analysis. Considering testability early in the design can dramatically lower long‑term testing and maintenance costs.

Author: Xie Liqi, Head of C2C Business R&D at Zhuanzhuan, responsible for C2C business and the push system.
