Optimizing Product Service Performance through Data Reduction and Field Selection
This article examines performance bottlenecks in a high‑traffic e‑commerce product service and proposes data‑centric optimizations—focusing on the dominant read path, field‑level selection via bit masks, and Redis hash storage—to reduce payload size, lower GC pressure, and improve latency while maintaining scalability.
1. Overview
The product system is a core component of an e‑commerce platform, demanding high performance, high concurrency, and high availability. Beyond distributed caching and database sharding, this article explores data‑level optimizations to boost concurrency and performance.
2. Current State of the Product Service
The architecture follows a large‑scale middle‑platform with small business modules, allowing rapid iteration. The database is sharded into 16 databases with 16 tables each, and vertical splitting of product tables reduces index depth and lock contention. A distributed cache using the Cache‑Aside pattern further improves concurrency.
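The 16‑database × 16‑table layout implies a deterministic mapping from a product ID to a physical shard. The article does not describe the actual routing function, so the sketch below assumes a simple modulo scheme purely for illustration:

```java
// Sketch of 16-database x 16-table shard routing.
// ASSUMPTION: the article does not specify the routing function;
// a plain modulo scheme on the product ID is used here for illustration.
public class ShardRouter {
    static final int DB_COUNT = 16;
    static final int TABLE_COUNT = 16;

    /** Database index (0..15) for a product ID. */
    public static int dbIndex(long productId) {
        return (int) ((productId / TABLE_COUNT) % DB_COUNT);
    }

    /** Table index (0..15) within that database. */
    public static int tableIndex(long productId) {
        return (int) (productId % TABLE_COUNT);
    }

    /** Physical location, e.g. product_db_03.product_07 (naming is hypothetical). */
    public static String locate(long productId) {
        return String.format("product_db_%02d.product_%02d",
                dbIndex(productId), tableIndex(productId));
    }
}
```

With 256 physical tables, consecutive IDs spread across tables first and databases second, which keeps hot ranges from piling onto one database.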
3. Background and Existing Problems
Growing business integration raises QPS, stressing the system. The platform supports multiple models (C2C, B2C, C2B, B2B, C2B2C), requiring a generic data model that often results in large product records. Three main contradictions arise:
Rising QPS versus high‑availability and performance.
GC pressure on callers versus large product data.
Cost of adding machines versus cost‑reduction goals.
The core challenge is delivering better, faster service at minimal cost.
4. Identifying Optimization Points
Guided by the contradictions, the optimization follows three principles:
Focus on the dominant read traffic (reads far exceed writes).
Analyze the complete read path for potential improvements.
Validate feasibility of each optimization.
The read path is a single RPC call that fetches product data from Redis, deserializes it, and returns it to the client.
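The Cache‑Aside read described above can be sketched as follows, with the cache and database simulated by in‑memory structures (names and types are illustrative, not the service's actual API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the Cache-Aside read path: try the cache first, fall back to
// the database on a miss, then backfill the cache. Stores are simulated
// with a Map and a Function; the real service uses Redis and sharded MySQL.
public class CacheAsideReader {
    private final Map<Long, String> cache = new HashMap<>();
    private final Function<Long, String> database;

    public CacheAsideReader(Function<Long, String> database) {
        this.database = database;
    }

    public String read(long productId) {
        String cached = cache.get(productId);
        if (cached != null) return cached;                 // cache hit
        String loaded = database.apply(productId);         // cache miss: query DB
        if (loaded != null) cache.put(productId, loaded);  // backfill the cache
        return loaded;
    }
}
```

Because reads dominate, nearly all traffic terminates at the cache branch; the optimizations that follow therefore target what the cache returns, not how often it is hit.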
5. Reducing Data Packet Size
Two key techniques are considered: data compression, which offers limited headroom, and eliminating unnecessary data transmission. Since many callers need only a subset of fields, returning just the required fields can shrink payloads substantially.
6. Feasibility Analysis
Analysis shows most callers request far fewer fields than the service returns, often receiving large strings (e.g., product description) that are unused, adding unnecessary load.
7. Optimization Proposals
Proposal 1: Dedicated Interface for Top‑5 Callers
Provide a separate query interface for the top five callers (over 50% of traffic) that filters out unused fields.
Pros: Simple implementation by wrapping the existing interface.
Cons: Higher maintenance cost and no full‑chain invalid‑data filtering.
Proposal 2: Field‑Marking Method (GraphQL‑like)
Callers specify required fields, which may span multiple tables. The service then queries Redis or MySQL accordingly and returns only those fields.
Advantages:
Reduces custom interface development cost.
On‑demand query and response, cutting invalid data transfer and GC overhead.
Cross‑table field routing eliminates the need for multiple calls.
Drawback: Marking fields adds some request size.
Field Marking Implementation
Fields are represented by bits in a 64‑bit long value: the first 2 bits select one of 4 groups, and the remaining 62 bits mark individual fields within that group, supporting up to 4 × 62 = 248 fields.

```java
long status = 1;
long title = 1L << 1; // use a long literal so shifts beyond bit 31 stay correct
```

When a caller requests both status and title, the combined mask is:

```java
long result = status | title;
```

A builder pattern simplifies mask construction:
```java
BitProductFieldRepresentation fieldRepresentation = new BitProductFieldBuilder()
        .actTypeId()
        .infoType()
        .brandIdNew()
        .build();
```

On‑Demand Query Implementation
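A minimal sketch of how such a builder could work, assuming hypothetical bit positions for actTypeId, infoType, and brandIdNew, and simplified to return the raw long mask rather than a BitProductFieldRepresentation wrapper:

```java
// Sketch of the mask-building builder. ASSUMPTION: the concrete bit
// positions are not given in the article; the positions below are
// illustrative only, and build() returns the raw mask for simplicity.
public class BitProductFieldBuilder {
    public static final long ACT_TYPE_ID  = 1L << 2; // hypothetical positions
    public static final long INFO_TYPE    = 1L << 3;
    public static final long BRAND_ID_NEW = 1L << 4;

    private long mask;

    public BitProductFieldBuilder actTypeId()  { mask |= ACT_TYPE_ID;  return this; }
    public BitProductFieldBuilder infoType()   { mask |= INFO_TYPE;    return this; }
    public BitProductFieldBuilder brandIdNew() { mask |= BRAND_ID_NEW; return this; }

    /** The combined 64-bit field mask. */
    public long build() { return mask; }

    /** True if the given field bit is set in the mask. */
    public static boolean contains(long mask, long fieldBit) {
        return (mask & fieldBit) != 0;
    }
}
```

On the server side, `contains` is all that is needed to decide whether a field should be fetched and serialized for a given request.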
Redis
Current storage uses Redis String (full object). Switching to Redis Hash allows fetching only required fields, aligning with the field‑marking approach.
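With a Redis hash, each product field becomes a hash entry, so a field mask translates directly into a selective fetch of just the named fields (an HMGET in Redis terms, issued through a client such as Jedis). The sketch below simulates the hash with a Map; the field names and bit positions are illustrative, not the service's actual schema:

```java
import java.util.*;

// Sketch: expand a field mask into hash field names, then fetch only
// those fields. The Redis hash is simulated with a Map; a real
// implementation would issue HMGET key f1 f2 ... via a client library.
public class FieldSelector {
    // Illustrative mask-bit -> hash-field-name mapping.
    static final Map<Long, String> FIELD_NAMES = Map.of(
            1L,      "status",
            1L << 1, "title",
            1L << 2, "price");

    /** Expands a mask into the hash field names it selects. */
    public static List<String> fieldsFor(long mask) {
        List<String> fields = new ArrayList<>();
        for (Map.Entry<Long, String> e : FIELD_NAMES.entrySet()) {
            if ((mask & e.getKey()) != 0) fields.add(e.getValue());
        }
        Collections.sort(fields); // deterministic order for the sketch
        return fields;
    }

    /** Simulated selective fetch: only the requested fields leave the store. */
    public static Map<String, String> hmget(Map<String, String> hash, long mask) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String f : fieldsFor(mask)) result.put(f, hash.get(f));
        return result;
    }
}
```

The payload win comes from fields like the product description never being read at all when their bit is absent, rather than being fetched and discarded.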
MySQL
MySQL hit rate is only ~1.5%; its latency is higher than Redis, so on‑demand MySQL queries are omitted to avoid added pressure.
Table Routing
When a field spans multiple tables, the system routes cache‑miss IDs to the appropriate table queues, retrieves data concurrently, and assembles the final response.
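The routing-and-assembly step above can be sketched with CompletableFuture: cache-miss IDs are grouped per table, each table is loaded concurrently, and the per-ID field maps are merged into one response. Table names and the loader function are illustrative:

```java
import java.util.*;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

// Sketch of cache-miss table routing: group missed IDs per table, load
// each table concurrently, then merge fields per product ID. The loader
// stands in for the real per-table query; names are hypothetical.
public class TableRouter {
    public static Map<Long, Map<String, String>> loadMisses(
            Map<String, List<Long>> idsByTable,
            BiFunction<String, List<Long>, Map<Long, Map<String, String>>> loader) {
        List<CompletableFuture<Map<Long, Map<String, String>>>> futures = new ArrayList<>();
        for (Map.Entry<String, List<Long>> e : idsByTable.entrySet()) {
            // One concurrent load per table queue.
            futures.add(CompletableFuture.supplyAsync(
                    () -> loader.apply(e.getKey(), e.getValue())));
        }
        Map<Long, Map<String, String>> merged = new HashMap<>();
        for (CompletableFuture<Map<Long, Map<String, String>>> f : futures) {
            for (Map.Entry<Long, Map<String, String>> row : f.join().entrySet()) {
                // Fields of the same product from different tables merge per ID.
                merged.computeIfAbsent(row.getKey(), k -> new HashMap<>())
                      .putAll(row.getValue());
            }
        }
        return merged;
    }
}
```

Because the table loads run in parallel, the assembly cost is bounded by the slowest table rather than the sum of all tables.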
Extensibility
Using Spring’s BeanFactoryPostProcessor and BeanPostProcessor, additional extensions (e.g., parameter validation, bit‑mask parsing) can be plugged in without altering the main flow.
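The plug-in idea can be illustrated without Spring: cross-cutting steps such as parameter validation and bit-mask parsing are registered as ordered processors that run before the main query. This plain-Java sketch only shows the shape; in the real service the wiring is done through Spring's BeanFactoryPostProcessor / BeanPostProcessor hooks:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of pluggable extension points around the query flow.
// Each processor sees the field mask and may validate or adjust it;
// new extensions are registered without touching the main flow.
public class QueryPipeline {
    /** One pluggable step; returns the (possibly adjusted) field mask. */
    public interface RequestProcessor {
        long process(long fieldMask);
    }

    private final List<RequestProcessor> processors = new ArrayList<>();

    public QueryPipeline register(RequestProcessor p) {
        processors.add(p);
        return this;
    }

    /** Runs every registered processor in order before the query executes. */
    public long prepare(long fieldMask) {
        long mask = fieldMask;
        for (RequestProcessor p : processors) mask = p.process(mask);
        return mask;
    }
}
```

A validation step can reject an empty mask, and a normalization step can force always-needed bits on, all without the core query code knowing either exists.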
8. Optimization Effects
Four metrics were measured under a promotional scenario (TPS = 3500):
Client‑side GC: 547 GC events (1.74 s) → 176 GC events (0.561 s) (~3× improvement).
Server‑side GC: 10 YGC → 3 YGC (≈3× improvement).
Network traffic: 90.62 MB → 11.95 MB (~8× reduction).
Interface latency: average 1.17‑1.52 ms → 1.30 ms after optimization.
9. Conclusion
The article presents a data‑centric optimization of the product system, from analysis to concrete implementation, demonstrating significant reductions in GC pressure, network usage, and latency while maintaining scalability.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.