
Why Master‑Slave Sync Delays Break Your Order Cache and How to Fix It

An order system retrieved incorrect product size data because master‑slave database lag left a stale Redis cache. This article explains the original flawed flow, the revised design that writes to the cache during synchronization, and best practices for cache updates, read‑through fallbacks, and QA testing.

Vipshop Quality Engineering

Problem Overview

A week ago, the Order system fetched product size information from LPC and received abnormal data. Although the information had been synchronized from WMS, the root cause was master‑slave database inconsistency, which prevented the cache from updating promptly.

Original Design

When data is updated, the Master first syncs to the Slave, and the data is then written from the Slave to Redis. If Redis lacks the cache entry, a fallback read occurs: per our read‑write separation rule, the query goes to the Slave, retrieves the data, and caches it in Redis.

Read‑write separation: the primary database handles transactional INSERT/UPDATE/DELETE operations, while the replica handles SELECT queries. This improves performance but can cause inconsistency when replication lag exists.
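The routing rule described above can be sketched in a few lines. This is a minimal illustration, not the system's actual data-access layer; the class and method names are assumptions for the example.

```python
# Minimal sketch of read-write separation routing: writes go to the
# master, SELECTs go to the replica. FakeConn stands in for a real
# database connection; all names here are illustrative.
class FakeConn:
    def __init__(self, name):
        self.name = name

    def run(self, sql, *args):
        return self.name  # a real connection would execute the SQL


class RoutedDB:
    def __init__(self, master, replica):
        self.master = master    # handles INSERT/UPDATE/DELETE
        self.replica = replica  # handles SELECT

    def execute(self, sql, *args):
        verb = sql.strip().split()[0].upper()
        conn = self.replica if verb == "SELECT" else self.master
        return conn.run(sql, *args)


db = RoutedDB(FakeConn("master"), FakeConn("replica"))
print(db.execute("SELECT size FROM product"))   # routed to the replica
print(db.execute("UPDATE product SET size='XL'"))  # routed to the master
```

The performance benefit comes precisely from this split, and the inconsistency risk comes from the replication hop between the two connections.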

This design suffers from replication delay: if the product size data has not yet synced to the Slave, the Order system's cache miss falls back to a Slave that does not have the row, so the Order system fails to obtain the data.
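The failure mode can be reproduced with plain dicts standing in for the master, the lagging replica, and Redis; everything here is illustrative rather than the production code.

```python
# Simulating the flawed flow: the master already has the new size, the
# replica lags behind, and the cache-miss fallback reads only the
# replica, so the caller gets nothing. All names are illustrative.
master = {"sku-1": {"size": "XL"}}   # updated row lives here
replica = {}                          # replication has not caught up
cache = {}                            # Redis stand-in

def get_size(sku):
    if sku in cache:
        return cache[sku]
    row = replica.get(sku)            # fallback reads the replica only
    if row is not None:
        cache[sku] = row              # cache whatever the replica had
    return row

print(get_size("sku-1"))  # None: the update exists only on the master
```

Worse, once the replica catches up, a caller that hit the miss window has already acted on missing data; the cache itself never held anything wrong, but the read path did.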

Revised Design

In the improved flow, the Master writes the updated data to Redis concurrently while syncing to the Slave. For cache misses, the same fallback reads from the Slave and caches the result.

Two rules follow from this:

When modifying the DB, the cache must be updated at the same time.

If the cache cannot provide data, a fallback read must retrieve the information from the DB.

The original approach neglected the first rule, updating the cache only through the Slave, which led to the data anomalies.
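The revised write path can be sketched the same way. This is a simplified model, assuming the cache write happens in the same step as the master write; identifiers are illustrative.

```python
# Sketch of the revised flow: the write path updates the master and the
# cache together, so readers never depend on replication having
# completed. Dicts stand in for the DB nodes and Redis.
master, replica, cache = {}, {}, {}

def update_size(sku, size):
    row = {"size": size}
    master[sku] = row    # transactional write to the master
    cache[sku] = row     # write the cache at the same time

def get_size(sku):
    if sku in cache:
        return cache[sku]
    row = replica.get(sku)  # fallback only for cold entries
    if row is not None:
        cache[sku] = row
    return row

update_size("sku-1", "XL")
print(get_size("sku-1"))  # {'size': 'XL'} even though the replica is empty
```

The fallback path is unchanged; it now only serves entries that were never written through, rather than masking replication lag.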

Quality Assurance Considerations

QA should understand the caching mechanism, collaborate with architects and developers to review design, and ensure cache warm‑up, update, and fallback mechanisms are reliable.

From a project‑management perspective, include cache warm‑up plans in the release schedule, especially for large data synchronizations.

For the DB‑to‑cache core logic, QA must enforce unit tests for data consistency, conduct code reviews, and design comprehensive test cases.

Extended: Redis Usage Scenarios

Typical pattern: check Redis first; if data exists, return it directly. If not, read from the DB, then cache the result in Redis.
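This cache-aside read pattern is commonly paired with a TTL. The sketch below uses a dict as the Redis stand-in so it runs anywhere; with redis-py the two cache operations map to GET and SETEX. The function names and TTL value are assumptions for the example.

```python
import json
import time

# Cache-aside with a TTL: check the cache first, fall back to the DB on
# a miss, then populate the cache. A dict stands in for Redis.
cache = {}          # key -> (expires_at, serialized payload)
TTL_SECONDS = 300   # illustrative expiry

def db_query(sku):
    # Placeholder for the real SELECT against the replica.
    return {"sku": sku, "size": "M"}

def get_product(sku):
    hit = cache.get(sku)
    if hit is not None and hit[0] > time.time():
        return json.loads(hit[1])          # cache hit
    row = db_query(sku)                    # miss: read the DB
    cache[sku] = (time.time() + TTL_SECONDS, json.dumps(row))
    return row
```

The TTL bounds how long a stale entry can live, which is why this pattern suits data that is read often but updated rarely.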

Suitable when data volume is large but updates are infrequent.

Another scenario treats Redis as a primary data store to achieve faster reads than the DB.

Ideal for large, frequently changing datasets.

Drawback: heavy reliance on Redis; robust persistence strategies are required for outages.
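When Redis is the primary store, its own persistence settings become the durability story. The redis.conf excerpt below shows standard directives for enabling append-only-file persistence alongside periodic snapshots; the specific thresholds are illustrative, not a recommendation from the original article.

```conf
# redis.conf excerpt: enable AOF so writes survive a restart, with an
# fsync every second as a common durability/throughput trade-off.
appendonly yes
appendfsync everysec
# RDB snapshots as a secondary safety net
save 900 1
save 300 10
```

Tuning these is a trade-off: stricter fsync settings narrow the window of data loss on an outage at the cost of write throughput.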

Tags: backend architecture, Redis, Read-Write Separation, Master-Slave Replication, Cache Invalidation