Cutting Invalid Data: How Zhuanzhuan Optimized Its Product Service for 3× Faster Performance

This article examines how Zhuanzhuan's product service, a core component of its e-commerce platform, was optimized by cutting unnecessary data transmission, applying the cache-aside pattern, redesigning Redis storage, and introducing a field-marking approach, resulting in markedly lower GC overhead, network traffic, and response times.

dbaplus Community

Overview

The product system is a critical part of an e‑commerce platform, demanding high performance, concurrency, and availability. Beyond distributed caching and sharding, the article focuses on data‑level optimizations to boost concurrency and overall performance.

Current Architecture

Zhuanzhuan adopts a "large middle platform, small front end" architecture, with the product system serving as the core of the business middle platform. Data is sharded across 16 databases of 16 tables each (256 tables in total), and vertical sharding reduces index depth and lock contention. A cache-aside pattern is used for distributed caching.
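As a reminder of what cache-aside means in practice, here is a minimal sketch, assuming invalidate-on-write. The two in-memory maps merely stand in for Redis and MySQL, and the class and method names are hypothetical, not taken from the product service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of cache-aside; the maps stand in for Redis and MySQL.
class CacheAsideSketch {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();    // stand-in for Redis
    private final Map<Long, String> database = new ConcurrentHashMap<>(); // stand-in for MySQL
    int databaseReads = 0; // counts how often a read falls through to the database

    // Read path: try the cache first, fall back to the database, then populate the cache.
    String read(long productId) {
        String cached = cache.get(productId);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String row = database.get(productId);
        if (row != null) {
            cache.put(productId, row);
        }
        return row;
    }

    // Write path: update the database, then invalidate the cached copy.
    void write(long productId, String value) {
        database.put(productId, value);
        cache.remove(productId);
    }

    public static void main(String[] args) {
        CacheAsideSketch service = new CacheAsideSketch();
        service.write(1L, "phone");
        service.read(1L); // miss: loads from the database
        service.read(1L); // hit: served from the cache
        System.out.println("database reads: " + service.databaseReads);
    }
}
```

The application, not the cache, owns the loading logic, which is why a later change to the cached representation can stay inside the service.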

Problems Identified

Rapid business growth brings soaring QPS, diverse business models (C2C, B2C, etc.), and large product records. Three tensions emerge: high QPS vs. high availability, GC pressure vs. large data, and cost reduction vs. performance improvement. The core challenge is delivering better, faster service at minimal cost.

Optimization Focus

Focus on the big wins ("grab the big, drop the small"): Optimize the read paths first, since reads dominate call volume.

Path analysis: Every point in a product‑read RPC chain is a potential optimization spot.

Feasibility analysis: Validate each optimization point for impact.

[Figure: the product-read RPC flow.]

Reducing payload size by filtering unused fields can lower serialization/deserialization cost and network bandwidth.
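To make the effect concrete, the sketch below projects a record down to the fields a caller actually needs and compares the encoded sizes. Everything here is an assumption for illustration: the field names other than `title` and `status`, the sample values, and the naive `key=value;` encoding, which is not the service's real serialization format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: compare payload size before and after dropping unused fields.
class PayloadSizeSketch {
    // Naive "key=value;" encoding, used only to compare sizes.
    static byte[] encode(Map<String, String> record) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : record.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Keep only the fields the caller asked for.
    static Map<String, String> project(Map<String, String> record, Set<String> wanted) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : record.entrySet()) {
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // Made-up product record; "description" and "detailImages" are illustrative fields.
    static Map<String, String> sampleRecord() {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("title", "Used phone, 128 GB");
        record.put("status", "1");
        record.put("description", "A long free-text description that most callers never read.");
        record.put("detailImages", "img1.jpg,img2.jpg,img3.jpg,img4.jpg");
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> full = sampleRecord();
        Set<String> wanted = new HashSet<>(Arrays.asList("title", "status"));
        Map<String, String> slim = project(full, wanted);
        System.out.println("full bytes: " + encode(full).length);
        System.out.println("slim bytes: " + encode(slim).length);
    }
}
```

Smaller payloads mean fewer bytes to serialize, transmit, and deserialize, and fewer short-lived objects for the garbage collector on both sides of the call.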

Feasibility Analysis of Reducing Invalid Data

Analysis of several callers shows they request far fewer fields than the API returns, leading to unnecessary data transfer.

Optimization Schemes

Scheme 1 – Dedicated Query Interface for Top‑5 Callers

Provide a separate API for the top‑5 callers (over 50% of traffic) that filters out unused fields.

Why only top‑5? Because they account for >50% of calls, balancing cost and benefit.

Will it meet expectations? It reduces GC, traffic, and latency but introduces tighter coupling with callers.

Scheme 2 – Field‑Marking Request Method (GraphQL‑like)

Callers specify required fields, either by name or by a compact bit‑mask, allowing the service to return only needed data.

Implementation steps:

Field-marking overview: Use a 64-bit long where the first 2 bits indicate the group and the remaining 62 bits represent fields. Two group bits give 4 groups, so the scheme supports up to 4 × 62 = 248 fields.
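The layout can be sketched as follows. This is an illustration under stated assumptions: the article does not say which end of the long holds the group, so the sketch places the group in the 2 most significant bits, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the 2-bit group + 62-bit field layout.
// Assumption: the group occupies the 2 most significant bits of the long.
class FieldMaskLayout {
    static final int GROUP_BITS = 2;
    static final int FIELD_BITS = Long.SIZE - GROUP_BITS;   // 62
    static final long FIELD_MASK = (1L << FIELD_BITS) - 1;  // low 62 bits set

    // Encode a single field: group in [0, 3], fieldIndex in [0, 61].
    static long encode(int group, int fieldIndex) {
        if (group < 0 || group >= (1 << GROUP_BITS)) {
            throw new IllegalArgumentException("group out of range: " + group);
        }
        if (fieldIndex < 0 || fieldIndex >= FIELD_BITS) {
            throw new IllegalArgumentException("field index out of range: " + fieldIndex);
        }
        return ((long) group << FIELD_BITS) | (1L << fieldIndex);
    }

    // Extract the group number; the unsigned shift keeps groups 2 and 3 non-negative.
    static int groupOf(long mask) {
        return (int) (mask >>> FIELD_BITS);
    }

    // Extract the 62 field bits.
    static long fieldsOf(long mask) {
        return mask & FIELD_MASK;
    }

    // Total number of addressable fields: 4 groups x 62 bits.
    static int capacity() {
        return (1 << GROUP_BITS) * FIELD_BITS;
    }

    public static void main(String[] args) {
        System.out.println("capacity: " + capacity()); // prints "capacity: 248"
    }
}
```

Masks that share a group can be OR-ed together without disturbing the group bits; fields from different groups would need separate longs under this layout.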

Bit‑mask example:

long status = 1L;
long title = 1L << 1;

Combine masks:

long result = status | title;

Builder pattern simplifies mask creation:

BitProductFieldRepresentation fieldRepresentation = new BitProductFieldBuilder()
    .actTypeId()
    .infoType()
    .brandIdNew()
    .build();
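The article does not show the builder's internals, so the following is only one way such a builder might look. The class and method names mirror the snippet above, but the bit positions, the `mask()` accessor, and the `contains` check are assumptions.

```java
// Hypothetical value object wrapping the requested-field mask.
final class BitProductFieldRepresentation {
    private final long mask;

    BitProductFieldRepresentation(long mask) {
        this.mask = mask;
    }

    long mask() {
        return mask;
    }

    // True if the given field bit was requested.
    boolean contains(long fieldBit) {
        return (mask & fieldBit) != 0;
    }
}

// Hypothetical builder: each method sets one bit and returns the builder for chaining.
final class BitProductFieldBuilder {
    // Assumed bit positions; the real assignments are not given in the article.
    static final long STATUS = 1L;
    static final long TITLE = 1L << 1;
    static final long ACT_TYPE_ID = 1L << 2;
    static final long INFO_TYPE = 1L << 3;
    static final long BRAND_ID_NEW = 1L << 4;

    private long mask;

    BitProductFieldBuilder status() { mask |= STATUS; return this; }
    BitProductFieldBuilder title() { mask |= TITLE; return this; }
    BitProductFieldBuilder actTypeId() { mask |= ACT_TYPE_ID; return this; }
    BitProductFieldBuilder infoType() { mask |= INFO_TYPE; return this; }
    BitProductFieldBuilder brandIdNew() { mask |= BRAND_ID_NEW; return this; }

    BitProductFieldRepresentation build() {
        return new BitProductFieldRepresentation(mask);
    }
}
```

Callers never handle raw bit positions, so the positions can change inside the builder without touching call sites, as long as client and server share the same version of the mapping.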

On‑Demand Query Implementation

Redis side: Switch from String (full object) to Hash, enabling field‑level retrieval.
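The difference between the two storage shapes can be sketched without a Redis server. The class below is an in-memory stand-in whose two methods imitate the semantics of the Redis `HSET` and `HMGET` commands (values returned in request order, with a null for each missing field); the class, key, and field values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for Redis Hash storage: key -> (field -> value).
class HashStorageSketch {
    private final Map<String, Map<String, String>> hashes = new HashMap<>();

    // Imitates HSET key field value [field value ...].
    void hset(String key, Map<String, String> fields) {
        hashes.computeIfAbsent(key, k -> new HashMap<>()).putAll(fields);
    }

    // Imitates HMGET key field [field ...]: values in request order, null when a field is absent.
    List<String> hmget(String key, String... fields) {
        Map<String, String> hash = hashes.getOrDefault(key, Collections.emptyMap());
        List<String> values = new ArrayList<>();
        for (String field : fields) {
            values.add(hash.get(field));
        }
        return values;
    }

    public static void main(String[] args) {
        HashStorageSketch redis = new HashStorageSketch();
        Map<String, String> product = new HashMap<>();
        product.put("title", "Used phone");
        product.put("status", "1");
        product.put("description", "Long text that most callers do not need.");
        redis.hset("product:1", product);
        // Only the two requested fields leave the store.
        System.out.println(redis.hmget("product:1", "title", "status"));
    }
}
```

With String storage the whole serialized object must be fetched and deserialized even when one field is needed; with Hash storage the requested-field mask can be translated into the list of fields to fetch.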

MySQL side: The low hit rate (1.5% of reads) and slower performance make on-demand queries unnecessary here.

Table Routing and Extensibility

Cross‑table field requests are routed based on cache hits; missing data is fetched from the appropriate table queues. Extension points follow Spring’s BeanPostProcessor model, allowing custom validation and bit‑mask parsing without altering core logic.
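As a rough illustration of the extension-point idea, the sketch below runs a chain of hooks over the requested mask before the query executes. It is modeled only loosely on the callback style of Spring's `BeanPostProcessor` and does not use Spring; the interface, the pipeline class, and the example hooks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hook: inspect or adjust the requested-field mask before the query runs.
interface FieldQueryPostProcessor {
    long beforeQuery(long mask);
}

// Hypothetical pipeline: core logic only iterates over registered hooks.
class FieldQueryPipeline {
    private final List<FieldQueryPostProcessor> processors = new ArrayList<>();

    FieldQueryPipeline register(FieldQueryPostProcessor processor) {
        processors.add(processor);
        return this;
    }

    // Apply every hook in registration order and return the final mask.
    long prepare(long requestedMask) {
        long mask = requestedMask;
        for (FieldQueryPostProcessor processor : processors) {
            mask = processor.beforeQuery(mask);
        }
        return mask;
    }

    public static void main(String[] args) {
        FieldQueryPipeline pipeline = new FieldQueryPipeline()
            // Validation hook: reject requests that ask for no fields.
            .register(mask -> {
                if (mask == 0L) {
                    throw new IllegalArgumentException("no fields requested");
                }
                return mask;
            })
            // Enrichment hook: always include bit 0 (assumed here to be a mandatory field).
            .register(mask -> mask | 1L);
        System.out.println(Long.toBinaryString(pipeline.prepare(0b100L)));
    }
}
```

New validation or parsing rules are added by registering another hook, which leaves the query path itself unchanged.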

Optimization Results

Four metrics were measured under a promotional scenario (TPS = 3500): client GC count & time, server GC count & time, network traffic, and API latency.

Client GC

Before: 547 GC events, 1.74 s total.

After: 176 GC events, 0.561 s total.

≈ 3.1× reduction in both GC count and GC time.

Server GC

Before: 10 YGC events.

After: 3 YGC events (peak 4, low 2).

≈ 3.3× fewer young-generation GC (YGC) events.

Network Traffic

Before: 90.62 MB/s.

After: 11.95 MB/s.

≈ 7.6× reduction (about 87% less traffic).

API Latency

Before (three interfaces): 1.17 ms, 1.52 ms, 1.23 ms.

After (single optimized interface): 1.30 ms average, in line with the ≈ 1.31 ms mean of the three original interfaces.

Overall performance gains are significant while maintaining functional correctness.
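The improvement factors can be rechecked from the before and after figures reported in this section. The helper below is a small arithmetic sketch; the class and method names are hypothetical, and the inputs are the numbers quoted above.

```java
// Hypothetical helper that recomputes the improvement factors from the reported figures.
class ImprovementRatios {
    // How many times smaller the "after" value is than the "before" value.
    static double ratio(double before, double after) {
        return before / after;
    }

    // Arithmetic mean of the given values.
    static double mean(double... values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        System.out.printf("client GC count:  %.2fx%n", ratio(547, 176));
        System.out.printf("client GC time:   %.2fx%n", ratio(1.74, 0.561));
        System.out.printf("server YGC count: %.2fx%n", ratio(10, 3));
        System.out.printf("network traffic:  %.2fx%n", ratio(90.62, 11.95));
        System.out.printf("mean latency before: %.2f ms%n", mean(1.17, 1.52, 1.23));
    }
}
```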

Conclusion

By analyzing data flow, eliminating invalid field transmission, and adopting on-demand queries with bit-mask field selection, Zhuanzhuan achieved substantial reductions in GC overhead, network usage, and response time, demonstrating a practical approach to high-performance backend service optimization.

