
Designing a Scalable Persistence Layer: Sharding, Query Optimization, and Read/Write Splitting

This article shares a practical case study on how a coupon‑distribution system evolved from a single‑table design to a sharded, multi‑database architecture with query optimization, ElasticSearch integration, and read/write splitting to handle growing data volume, complex queries, and increasing request traffic.

Zhuanzhuan Tech

1 Introduction

In most development scenarios a single database and table can satisfy product requirements, but as the business matures and data volume and request load grow, a single‑table approach may no longer guarantee system stability. This article shares the author’s experience in building a stable persistence layer during project iterations.

2 Project Overview

The project allows users to obtain coupon information in various activity scenarios, claim and bind coupons to a relationship table, and later select an appropriate coupon when purchasing goods.

3 Problems Faced

3.1 Data Volume Is Growing

Initially a single table was sufficient, but low coupon issuance thresholds caused the number of coupons and the size of the user‑coupon relationship table to explode, prompting a decision to split the relationship table.

3.1.1 Technology Selection

Three approaches to sharding and table splitting are commonly considered:

1. JDBC‑based proxy – no DBA or ops involvement, fully developer‑controlled.

2. Database‑level proxy – requires DBA or ops, harder to maintain.

3. TiDB – unlimited horizontal scaling with strong consistency, but the author lacks deep familiarity.

Final Choice: The author chose the JDBC proxy approach for its internal manageability and lower development cost.

Framework: ShardingJDBC, selected because the community is active, the framework is mature, and it is widely used internally.

Sharding Design: Based on current growth, eight databases with eight tables each are sufficient; the user ID (uid) is used as the sharding key.
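To make the 8 × 8 layout concrete, here is a minimal routing sketch. The modulo scheme, table names, and the `ShardRouter` class are illustrative assumptions; in production the routing rules live in ShardingJDBC configuration rather than hand-rolled code like this.

```java
// Sketch of uid-based routing across 8 databases x 8 tables (64 physical tables).
// The exact modulo scheme is an assumption for illustration only.
public class ShardRouter {
    static final int DB_COUNT = 8;
    static final int TABLE_COUNT = 8;

    // Database index: low bits of uid pick one of 8 databases.
    static int dbIndex(long uid) {
        return (int) (uid % DB_COUNT);
    }

    // Table index within that database: the next bits pick one of 8 tables,
    // so consecutive uids spread across all 64 physical tables.
    static int tableIndex(long uid) {
        return (int) ((uid / DB_COUNT) % TABLE_COUNT);
    }

    public static void main(String[] args) {
        long uid = 1234567L;
        // Hypothetical physical names: coupon_db_N.user_coupon_M
        System.out.printf("coupon_db_%d.user_coupon_%d%n", dbIndex(uid), tableIndex(uid));
    }
}
```

A uid-keyed scheme like this keeps all of one user's coupon rows in a single shard, so the common "list my coupons" query never fans out across databases.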

Handling Faster‑Than‑Expected Growth: If growth exceeds expectations, coupons are archived after expiration to keep data volume stable.

Why Not TiDB? Limited knowledge of TiDB and concerns about troubleshooting critical business data led to its exclusion.

3.1.2 Data Migration Process

The migration consists of three steps:

1. Delayed Dual‑Write: Write to the old table first, then to the new table, and send a delayed message to verify consistency.

2. Data Cleansing: Extract primary keys before a cut‑over point and replay them into the new table via a script.

3. Asynchronous Error Correction: After migration, periodically compare old and new data and fix discrepancies.
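The dual‑write step can be sketched as follows. This is a simplified stand‑in, not the author's actual code: in‑memory maps replace the real old and new tables, and the delayed MQ message is replaced by an inline check so the sketch stays self‑contained.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of step 1 (delayed dual-write). The old table stays authoritative:
// every write hits it first, then the new sharded table, then a consistency
// check (a delayed message in production) repairs any divergence.
public class DualWrite {
    static final Map<Long, String> oldTable = new HashMap<>(); // stand-in for the legacy table
    static final Map<Long, String> newTable = new HashMap<>(); // stand-in for the sharded table

    static void saveCoupon(long id, String payload) {
        oldTable.put(id, payload);    // 1. write the old (authoritative) table first
        newTable.put(id, payload);    // 2. then the new sharded table
        verifyConsistency(id);        // 3. in production: a delayed MQ message triggers this
    }

    static void verifyConsistency(long id) {
        // If the two copies disagree, the old table wins during migration.
        if (!oldTable.get(id).equals(newTable.get(id))) {
            newTable.put(id, oldTable.get(id));
        }
    }
}
```

Writing the old table first means a crash between the two writes leaves the new table stale rather than the old one, which the asynchronous correction job can later repair safely.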

3.2 Queries Are Becoming More Complex

3.2.1 Initial Solution

Coupon query conditions are complex (array queries, fuzzy matches) and kept in a JSON field, which prevents direct MySQL filtering. Early on, the entire configuration table was loaded into memory for filtering, then matching IDs were used to query the relationship table.

3.2.2 Temporary Improvement

To reduce unnecessary database reads, coupons are cached in memory and filtered there, lowering GC pressure. A push‑pull cache pattern (real‑time broadcast + periodic pull) is used.
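A minimal sketch of the push‑pull pattern, assuming a broadcast message carries a single changed config and a scheduled job periodically reloads everything; the class and value types are illustrative, not the author's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Push-pull local cache: broadcasts apply single changes in real time ("push"),
// while a periodic full reload from the DB ("pull") corrects missed messages.
public class PushPullCache {
    // couponId -> serialized rule config (value type is illustrative)
    final Map<Long, String> cache = new ConcurrentHashMap<>();

    // Push path: invoked by an MQ broadcast when one coupon config changes.
    void onBroadcast(long couponId, String rulesJson) {
        cache.put(couponId, rulesJson);
    }

    // Pull path: a scheduled job reloads the full table as a safety net,
    // evicting deleted coupons and healing any lost broadcasts.
    void reloadAll(Map<Long, String> fromDb) {
        cache.clear();
        cache.putAll(fromDb);
    }
}
```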

This approach is not sustainable as coupon count grows, so a more robust middleware is considered.

3.2.3 Integrating ElasticSearch

ElasticSearch was chosen as the query middleware. Its write‑to‑search latency is mitigated by synchronizing the coupon ID to Redis (zset) with a 10‑second TTL. Queries combine results from ElasticSearch and Redis to select the best coupon.
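The merge at query time can be sketched like this. A timestamp map stands in for the Redis zset (where the write time would be the member's score), and all names are illustrative assumptions rather than the production code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Bridging ElasticSearch's write-to-search lag: freshly written coupon IDs are
// held for ~10 s (a Redis zset in production; a timestamp map here) and merged
// into the ES result set at query time.
public class FreshIdBuffer {
    static final long TTL_MS = 10_000;
    final Map<Long, Long> recentIds = new HashMap<>(); // couponId -> write time

    // Mirrors ZADD recent_coupons <timestamp> <couponId> on every write.
    void onWrite(long couponId, long nowMs) {
        recentIds.put(couponId, nowMs);
    }

    // Merge ES hits with IDs that ES may not have indexed yet.
    Set<Long> query(Set<Long> esHits, long nowMs) {
        recentIds.entrySet().removeIf(e -> nowMs - e.getValue() > TTL_MS); // expire old entries
        Set<Long> merged = new HashSet<>(esHits);
        merged.addAll(recentIds.keySet());
        return merged;
    }
}
```

The 10‑second window only needs to cover the ES refresh interval plus indexing delay; anything older is assumed to be searchable directly.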

Performance Optimizations:

1. Return only required fields.
2. Use appropriate fields for indexing.
3. Archive old data to limit total volume.
4. Route queries by uid to narrow index ranges.

3.3 Request Volume Is Increasing

3.3.1 Read‑Write Splitting

High QPS during promotions caused primary‑database overload, so read‑write splitting was introduced to relieve pressure.

Master‑Slave Lag Solution: A simple approach is to expose a master‑only query API, but this forces callers to change and may defeat the purpose of splitting.

Object getByInfoFromMaster(Long id);

Ideal Solution: A middleware that routes reads to the slave only after successful master‑slave sync, otherwise reads from the master.

Since such middleware was unavailable, a Redis‑based simplified solution was implemented:

– Methods annotated as writes always read from the master and set a ~2‑second tag in Redis.
– Methods annotated as reads check the tag; if present, they read from the master; otherwise they read from the slave.

This introduces a minor overhead (extra Redis lookup) and a tiny risk of inconsistency within the tag window, which is acceptable for the business.
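The routing rule reduces to a small decision function. In this sketch a map of expiry times stands in for the Redis tag (a SETEX with a ~2‑second TTL in production), and the class and method names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Redis-tag read/write routing: after a write, a short-lived per-user tag
// forces subsequent reads to the master until replication has likely caught up.
public class ReadWriteRouter {
    static final long TAG_TTL_MS = 2_000;
    final Map<Long, Long> writeTags = new HashMap<>(); // uid -> tag expiry time

    // Invoked around methods annotated as writes (SETEX tag:{uid} 2 "1" in production).
    void afterWrite(long uid, long nowMs) {
        writeTags.put(uid, nowMs + TAG_TTL_MS);
    }

    // Invoked around methods annotated as reads: tag present -> master, else slave.
    String chooseDataSource(long uid, long nowMs) {
        Long expiry = writeTags.get(uid);
        boolean recentlyWrote = expiry != null && nowMs < expiry;
        return recentlyWrote ? "master" : "slave";
    }
}
```

Keying the tag by uid means only the user who just wrote is pinned to the master; everyone else's reads still land on the slave, preserving most of the offload benefit.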

4 Conclusion

When building a project from 0 to 1, avoid over‑design; launch quickly and ensure basic stability. Address problems as they arise, but once traffic grows, proactively identify pain points and plan solutions before they become critical.

The author welcomes comments and alternative solutions.

About the Author

Wang Ruigang, backend development engineer for ZhiZhi’s online recycling business.

Tags: backend, scalability, Elasticsearch, Sharding, Redis, Read‑Write Splitting, databases
Written by Zhuanzhuan Tech

A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
