Preventing Cache Penetration and Syncing Redis with MySQL at Massive Scale

This article explains why cache penetration can cripple ultra‑large systems, how storing full data in Redis eliminates the risk, and presents two reliable cache‑update strategies—message‑queue‑driven updates and real‑time MySQL Binlog subscription via Canal—complete with configuration steps and Java code examples.

dbaplus Community

Cache Penetration in Ultra‑Large Systems

When request volume is massive, even a tiny fraction of reads that bypass the cache can overload the database and cause a cascading outage. Scaling Redis clusters alone does not solve this problem if cache misses still reach the DB.
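To make the failure mode concrete, here is a minimal sketch of a classic cache-aside read path. The class name, the in-memory maps standing in for Redis and MySQL, and the `dbReads` counter are all hypothetical; the point is only that every lookup the cache cannot answer becomes a database query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: in-memory maps stand in for Redis and MySQL.
class CacheAsideDemo {
    final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    final Map<String, String> db = new HashMap<>();    // stand-in for MySQL
    int dbReads = 0;                                   // queries that reach the database

    // Cache-aside read: on a miss, fall through to the database.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        dbReads++;
        String value = db.get(key);
        if (value != null) {
            cache.put(key, value); // only rows that exist end up in the cache
        }
        return value;
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo();
        demo.db.put("888", "100");
        demo.read("888"); // miss: loaded from the database
        demo.read("888"); // hit: served from the cache
        for (int i = 0; i < 1000; i++) {
            demo.read("missing-" + i); // each of these reaches the database
        }
        System.out.println("DB reads: " + demo.dbReads); // DB reads: 1001
    }
}
```

Adding Redis nodes does not change that count, because the extra capacity only helps requests the cache can answer.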

Solution 1 – Cache the Full Dataset in Redis

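If the memory cost is acceptable, load the entire dataset into a Redis cluster and serve every read from it. Because Redis then holds every record, a key that is missing from Redis means the record does not exist, so reads never fall through to the database and the cache-penetration risk disappears. The remaining challenge is keeping Redis consistent with the source database.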
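A minimal sketch of the resulting read path, assuming a one-time bulk load at startup. The class and method names are hypothetical, and a `HashMap` stands in for the Redis cluster.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration: a HashMap stands in for the Redis cluster.
class FullCacheDemo {
    final Map<String, String> cache = new HashMap<>();

    // Bulk load at startup; a real loader would page through the table.
    void warmUp(Map<String, String> allRows) {
        cache.putAll(allRows);
    }

    // Read path: a miss is authoritative, so there is no database fallback.
    Optional<String> read(String key) {
        return Optional.ofNullable(cache.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("888", "100");
        FullCacheDemo demo = new FullCacheDemo();
        demo.warmUp(table);
        System.out.println(demo.read("888").isPresent());     // true
        System.out.println(demo.read("missing").isPresent()); // false
    }
}
```

The read method has no code path to the database at all, which is why correctness now depends entirely on how the cache is kept up to date.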

Solution 2 – Update Cache via Message Queue

For services that already emit change events (e.g., an order service), run a dedicated cache‑update service that subscribes to the MQ, consumes order‑change messages, and refreshes the corresponding Redis entries. This adds negligible development overhead and does not require changes to the core service.
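As a rough sketch of such a cache-update service: the `OrderChange` message shape, the `order:` key prefix, and the class names below are assumptions for illustration, a `BlockingQueue` stands in for the MQ subscription, and a map stands in for Redis.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical message shape; a real order service defines its own schema.
class OrderChange {
    final String orderId;
    final String payload;  // serialized order, e.g. JSON
    final boolean deleted;

    OrderChange(String orderId, String payload, boolean deleted) {
        this.orderId = orderId;
        this.payload = payload;
        this.deleted = deleted;
    }
}

// Hypothetical cache-update service: the map stands in for Redis.
class CacheUpdater {
    private final Map<String, String> cache;

    CacheUpdater(Map<String, String> cache) {
        this.cache = cache;
    }

    // Applies one change message to the cache.
    void apply(OrderChange change) {
        String key = "order:" + change.orderId;
        if (change.deleted) {
            cache.remove(key);
        } else {
            cache.put(key, change.payload);
        }
    }

    // Drains whatever is queued right now and returns how many messages were applied.
    // A real consumer would block on the MQ client and acknowledge each message.
    int drain(BlockingQueue<OrderChange> queue) {
        int applied = 0;
        OrderChange change;
        while ((change = queue.poll()) != null) {
            apply(change);
            applied++;
        }
        return applied;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new ConcurrentHashMap<>();
        BlockingQueue<OrderChange> queue = new LinkedBlockingQueue<>();
        queue.add(new OrderChange("1", "{\"status\":\"CREATED\"}", false));
        queue.add(new OrderChange("1", "{\"status\":\"PAID\"}", false));
        new CacheUpdater(cache).drain(queue);
        System.out.println(cache.get("order:1")); // {"status":"PAID"}
    }
}
```

Messages for the same order must be applied in the order they were produced, otherwise an older state can overwrite a newer one.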

Solution 3 – Real‑Time Cache Updates via MySQL Binlog (Canal)

If no change‑event stream exists, a cache‑update service can act as a MySQL slave, read the binary log (Binlog), parse entries, and update Redis asynchronously. This approach is more generic but requires Binlog parsing.

Setting Up Canal for Binlog Subscription

Download and extract Canal 1.1.4

wget https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.deployer-1.1.4.tar.gz
tar zxvf canal.deployer-1.1.4.tar.gz

Enable Binlog in MySQL

[mysqld]
log-bin=mysql-bin
binlog-format=ROW
server_id=1

Create a Canal user with replication privileges

CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;

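Record the current Binlog file and position (for example with SHOW MASTER STATUS), then configure instance.properties. The file name and position below are examples; use the values reported by your own server.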

canal.instance.gtidon=false
canal.instance.master.address=127.0.0.1:3306
canal.instance.master.journal.name=binlog.000009
canal.instance.master.position=155
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset=UTF-8
canal.instance.defaultDatabaseName=test
canal.instance.filter.regex=.*\\..*
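One detail worth checking in the filter line is backslash escaping. The sketch below assumes the file is read with standard `java.util.Properties` semantics, where a backslash is an escape character; it shows how a single and a doubled backslash in the file text end up as different regular expressions. The class name and the sample table names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting how a properties line is parsed.
class FilterRegexCheck {
    // Parses one "key=value" line with java.util.Properties and returns the value.
    static String parse(String line) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("canal.instance.filter.regex");
    }

    public static void main(String[] args) {
        // File text with one backslash:  .*\..*
        String single = parse("canal.instance.filter.regex=.*\\..*");
        // File text with two backslashes: .*\\..*
        String doubled = parse("canal.instance.filter.regex=.*\\\\..*");

        System.out.println(single);  // .*..*   (the backslash was dropped)
        System.out.println(doubled); // .*\..*  (a literal dot is required)

        System.out.println(Pattern.matches(single, "no_dot_here"));           // true
        System.out.println(Pattern.matches(doubled, "no_dot_here"));          // false
        System.out.println(Pattern.matches(doubled, "test.account_balance")); // true
    }
}
```

Both forms match every `schema.table` name, so the difference is harmless for a catch-all filter, but it matters as soon as the pattern is narrowed to specific schemas or tables.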

Start Canal

canal/bin/startup.sh

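Canal opens port 11111 for clients. A client pulls a batch of Binlog entries, applies it to Redis, and acknowledges the batch once the update succeeds. If processing fails, the client rolls the batch back so that Canal delivers it again, which preserves ordering and prevents entries from being lost. Because a batch can be delivered more than once, the Redis updates should be idempotent.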
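The pull, acknowledge, and roll back cycle can be illustrated without a running Canal server. The class below is a hypothetical in-memory stand-in, not Canal's client API: it keeps one cursor for what has been delivered and one for what has been acknowledged, and a rollback simply moves the delivery cursor back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for a source with get / ack / rollback semantics.
class BatchSource {
    private final List<String> log;
    private int acked = 0;     // entries before this index are confirmed
    private int delivered = 0; // entries before this index have been handed out

    BatchSource(List<String> log) {
        this.log = new ArrayList<>(log);
    }

    // Hands out the next batch without confirming it.
    List<String> getWithoutAck(int batchSize) {
        int end = Math.min(log.size(), delivered + batchSize);
        List<String> batch = new ArrayList<>(log.subList(delivered, end));
        delivered = end;
        return batch;
    }

    // Confirms everything delivered so far.
    void ack() {
        acked = delivered;
    }

    // Forgets unconfirmed deliveries so they are handed out again.
    void rollback() {
        delivered = acked;
    }

    public static void main(String[] args) {
        BatchSource source = new BatchSource(Arrays.asList("e1", "e2", "e3"));
        List<String> applied = new ArrayList<>();
        boolean failOnce = true;
        while (true) {
            List<String> batch = source.getWithoutAck(2);
            if (batch.isEmpty()) {
                break;
            }
            if (failOnce) {        // simulate a failed Redis write
                failOnce = false;
                source.rollback(); // the same batch is delivered again
                continue;
            }
            applied.addAll(batch);
            source.ack();
        }
        System.out.println(applied); // [e1, e2, e3]
    }
}
```

The simulated failure delays the first batch but neither skips nor reorders it, which is the property the real protocol relies on.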

Java Example: Updating an Account Balance Cache

The program continuously pulls messages from Canal, processes each entry, and updates Redis accordingly.

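while (true) {
    // Fetch up to batchSize entries without acknowledging them yet.
    Message message = connector.getWithoutAck(batchSize);
    long batchId = message.getId();
    try {
        int size = message.getEntries().size();
        if (batchId == -1 || size == 0) {
            // Nothing new in the Binlog; back off briefly before polling again.
            Thread.sleep(1000);
        } else {
            processEntries(message.getEntries(), jedis);
        }
        // Confirm the batch only after Redis has been updated.
        connector.ack(batchId);
    } catch (Throwable t) {
        // Log the failure, then roll back so the batch is delivered again.
        t.printStackTrace();
        connector.rollback(batchId);
    }
}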

Processing logic for INSERT, UPDATE, DELETE events:

for (CanalEntry.RowData rowData : rowChange.getRowDatasList()) {
    if (eventType == CanalEntry.EventType.DELETE) {
        // The row is gone, so only the "before" image carries the key.
        jedis.del(row2Key("user_id", rowData.getBeforeColumnsList()));
    } else {
        // INSERT and UPDATE both overwrite the entry with the "after" image.
        jedis.set(row2Key("user_id", rowData.getAfterColumnsList()),
                 row2Value(rowData.getAfterColumnsList()));
    }
}
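The helpers `row2Key` and `row2Value` are not shown in the article. The sketch below is one plausible shape for them, not the code from the linked repository: it assumes the key is the value of the named column and the value is a flat JSON object with every column rendered as a string. The `Col` class is a hypothetical stand-in for Canal's column type so that the example runs with the JDK alone.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Binlog column: just a name and a string value.
class Col {
    final String name;
    final String value;

    Col(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical helpers mirroring the names used in the processing loop.
class RowMapper {
    // Returns the value of the column called keyName, used as the Redis key.
    static String row2Key(String keyName, List<Col> columns) {
        for (Col c : columns) {
            if (c.name.equals(keyName)) {
                return c.value;
            }
        }
        throw new IllegalArgumentException("column not found: " + keyName);
    }

    // Serializes all columns as a flat JSON object of string values.
    static String row2Value(List<Col> columns) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            Col c = columns.get(i);
            sb.append('"').append(escape(c.name)).append("\":\"")
              .append(escape(c.value)).append('"');
        }
        return sb.append('}').toString();
    }

    // Escapes backslashes and double quotes; enough for this sketch, not a full JSON encoder.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        List<Col> row = Arrays.asList(new Col("user_id", "888"), new Col("balance", "100"));
        System.out.println(row2Key("user_id", row)); // 888
        System.out.println(row2Value(row));          // {"user_id":"888","balance":"100"}
    }
}
```

In production code a JSON library would be a safer choice than hand-built strings.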

Test the flow by inserting a record into account_balance and verifying the Redis entry.

INSERT INTO account_balance VALUES (888, 100, NOW(), 999);
127.0.0.1:6379> GET 888
{"log_id":"999","balance":"100","user_id":"888","timestamp":"2020-03-08 16:18:10"}

Full example code is available at https://github.com/liyue2008/canal-to-redis-example

Conclusion

For massive concurrent workloads, caching the entire dataset in Redis eliminates cache‑penetration‑induced DB overload. Cache consistency can be maintained either by consuming business‑level change messages or by masquerading as a MySQL slave and subscribing to Binlog via Canal. Both approaches require careful handling of reliability and latency, and production deployments should include fallback or compensation mechanisms for potential data loss or lag.
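One possible compensation mechanism is a periodic reconciliation pass that compares the source of truth with the cache and repairs any differences. The sketch below is only an illustration under simplifying assumptions: both sides fit in memory as maps, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical reconciliation pass: maps stand in for the MySQL table and Redis.
class Reconciler {
    // Makes the cache match the source and returns the number of keys repaired.
    static int reconcile(Map<String, String> source, Map<String, String> cache) {
        int repaired = 0;
        // Fix entries that are missing from the cache or hold an outdated value.
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (!e.getValue().equals(cache.get(e.getKey()))) {
                cache.put(e.getKey(), e.getValue());
                repaired++;
            }
        }
        // Remove entries whose rows no longer exist in the source.
        Set<String> stale = new HashSet<>(cache.keySet());
        stale.removeAll(source.keySet());
        for (String key : stale) {
            cache.remove(key);
            repaired++;
        }
        return repaired;
    }

    public static void main(String[] args) {
        Map<String, String> source = new HashMap<>();
        source.put("888", "100");
        source.put("889", "250");
        Map<String, String> cache = new HashMap<>();
        cache.put("888", "100"); // in sync
        cache.put("889", "200"); // outdated
        cache.put("890", "50");  // row was deleted in the source
        System.out.println(reconcile(source, cache)); // 2
        System.out.println(cache.equals(source));     // true
    }
}
```

A real job would scan the table in pages and tolerate rows that change while the scan is running.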
