Comprehensive Guide to Canal: Spring Boot Integration for Elegant Cache Consistency
This article explains Canal's background, core functions, deployment steps, and component architecture, then shows how to integrate it with Spring Boot to monitor MySQL binlog events and keep Redis cache synchronized with database changes, while also covering Canal‑admin management and limitations.
1. Background
Canal (pronounced kə'næl, meaning "waterway") parses MySQL binlogs incrementally and provides data subscription and consumption on top of them. Its core functions are:
Real‑time data backup
Incremental sync between heterogeneous sources (e.g., Elasticsearch, HBase) and databases
Business cache refresh to ensure cache consistency
Business‑logic‑driven incremental data processing
Canal relies on MySQL master‑slave replication. The replication process consists of three steps:
Master records changes to the binary log (binlog events).
Slave copies these events to its relay log.
Slave replays the relay log to reflect the changes.
Canal simulates a MySQL slave:
It pretends to be a MySQL slave and sends a dump request to the master.
The master pushes the binlog to Canal.
Canal parses the binary log byte stream.
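The handshake above can be pictured with a toy model. This sketch is not Canal's code and does not speak the MySQL replication protocol; `ToyMaster`, `dumpFrom`, and the string events are hypothetical names invented for illustration. It only shows the idea that a replica (or Canal posing as one) asks for events starting at a binlog position and receives whatever was recorded from that point on.

```java
import java.util.ArrayList;
import java.util.List;

final class BinlogDumpSketch {

    // Toy stand-in for a MySQL master; all names here are illustrative only.
    static final class ToyMaster {
        private final List<String> binlog = new ArrayList<>();

        // Every change is appended to the binlog in order.
        void write(String event) {
            binlog.add(event);
        }

        // Answer a "dump" request: return all events at or after the given position.
        List<String> dumpFrom(int position) {
            return new ArrayList<>(binlog.subList(position, binlog.size()));
        }
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        master.write("INSERT user id=1");
        master.write("UPDATE user id=1");

        // The subscriber remembers how far it has read, like a replica's position.
        int position = 0;
        List<String> batch = master.dumpFrom(position);
        position += batch.size();
        System.out.println(batch);

        // A later dump from the saved position returns only the new event.
        master.write("DELETE user id=1");
        System.out.println(master.dumpFrom(position));
    }
}
```

Tracking the position on the subscriber side is what lets consumption resume after a restart without replaying the whole log.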
Canal Component Architecture
A server represents a Canal JVM instance. An instance corresponds to a data queue; one server can host multiple instances.
eventParser : Handles source connection and slave‑protocol interaction.
eventSink : Links parser and store, performs filtering, processing, and distribution.
eventStore : Stores data.
metaManager : Manages subscription and consumption metadata.
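To make the division of labor concrete, here is a minimal sketch of how a parser, a sink, and a store could hand events to one another. It is a simplified model under stated assumptions, not Canal's implementation: the `Event` class, the `"table:TYPE"` input format, and the method names are all hypothetical, and the metaManager role (tracking what each client has consumed) is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

final class InstancePipelineSketch {

    // Hypothetical parsed event: which table changed and how.
    static final class Event {
        final String table;
        final String type;

        Event(String table, String type) {
            this.table = table;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + " on " + table;
        }
    }

    // eventParser role: turn a raw "table:TYPE" string into a structured event.
    static Event parse(String raw) {
        String[] parts = raw.split(":", 2);
        return new Event(parts[0], parts[1]);
    }

    // eventStore role: buffer events until a client fetches them.
    static final class Store {
        private final Queue<Event> queue = new ArrayDeque<>();

        void put(Event event) {
            queue.add(event);
        }

        List<Event> take(int max) {
            List<Event> out = new ArrayList<>();
            while (out.size() < max && !queue.isEmpty()) {
                out.add(queue.poll());
            }
            return out;
        }
    }

    // eventSink role: filter parsed events and pass the survivors to the store.
    static void sink(List<String> rawEvents, Predicate<Event> filter, Store store) {
        for (String raw : rawEvents) {
            Event event = parse(raw);
            if (filter.test(event)) {
                store.put(event);
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        sink(List.of("mall.orders:INSERT", "mall.audit_log:INSERT", "mall.orders:UPDATE"),
                event -> !event.table.endsWith("audit_log"),
                store);
        System.out.println(store.take(10));
    }
}
```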
2. Canal Deployment and Installation
MySQL must enable binlog:
[mysqld]
log-bin=mysql-bin # enable binlog
binlog-format=ROW # use ROW mode
server_id=1 # unique server ID, avoid conflict with Canal slaveId
Create a MySQL user with slave privileges:
CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;
Download Canal 1.1.5, extract it, and examine the conf directory (it contains canal.properties, example, logback.xml, etc.). Sample canal.properties:
canal.destinations = example
canal.conf.dir = ../conf
canal.auto.scan = true
canal.auto.scan.interval = 5
canal.auto.reset.latest.pos.mode = false
canal.instance.global.mode = spring
canal.instance.global.lazy = false
canal.instance.global.manager.address = ${canal.admin.manager}
canal.instance.global.spring.xml = classpath:spring/file-instance.xml
To listen to multiple databases, copy the example folder, rename it (e.g., product, warehouse), and adjust canal.destinations:
canal.destinations = product,warehouse
Each instance has its own instance.properties (e.g., set MySQL address, binlog name, position, credentials, and table filter regex):
canal.instance.master.address=127.0.0.1:3306
canal.instance.master.journal.name=mysql-bin.000001
canal.instance.master.position=154
canal.instance.dbUsername=canal
canal.instance.dbPassword=Canal@123456
canal.instance.filter.regex=.*\\..*
Start the server:
sh bin/startup.sh
Because each new instance requires configuration changes and a restart, Alibaba provides canal-admin for centralized management. canal-admin requirements:
MySQL for storing config and node data
Canal version >= 1.1.4 (provides admin APIs)
After extracting the admin package, configure application.yml (port, datasource, admin credentials) and run:
sh bin/startup.sh
Access the Web UI at host:8089 (default login admin/123456). The UI allows managing clusters, servers, and instances without restarting the server; new instances appear as directories under conf (e.g., mall, fast-api).
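Back in instance.properties, the canal.instance.filter.regex setting decides which tables an instance watches. Canal's documentation describes it as a comma-separated list of regular expressions matched against schema.table names. The sketch below approximates that behavior with java.util.regex; it is not Canal's own filter class, and `accepts` plus the schema and table names are hypothetical.

```java
import java.util.regex.Pattern;

final class TableFilterSketch {

    // Returns true if "schema.table" fully matches any of the comma-separated patterns.
    static boolean accepts(String filter, String schema, String table) {
        String fullName = schema + "." + table;
        for (String regex : filter.split(",")) {
            if (Pattern.matches(regex.trim(), fullName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // As Java string literals: every table in every schema, then a narrower filter.
        String all = ".*\\..*";
        String some = "mall\\..*,warehouse\\.stock";

        System.out.println(accepts(all, "mall", "orders"));        // true
        System.out.println(accepts(some, "mall", "orders"));       // true
        System.out.println(accepts(some, "warehouse", "stock"));   // true
        System.out.println(accepts(some, "warehouse", "invoice")); // false
    }
}
```

Narrowing the filter to the tables that actually back cached data keeps unrelated binlog traffic away from the client.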
3. Spring Boot Integration for Cache‑Database Consistency
3.1 Database‑Cache Consistency Problem
When both the database and Redis are written, the order of the two writes matters under high concurrency. If each request writes the DB first and then Redis, a stale value can end up cached: request A updates the DB, request B then updates both the DB and Redis, and A's delayed Redis write finally overwrites B's newer value. Writing Redis first and the DB second suffers from the same kind of interleaving. Other strategies exist (delete‑then‑write, delayed double‑delete), but this article focuses on using Canal to keep the cache in sync.
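The interleaving is easier to see when replayed step by step. The following sketch uses two in-memory maps as stand-ins for the database and Redis and executes one unlucky ordering of two "DB first, then cache" requests sequentially; the key name and values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleCacheSketch {

    // Replays one unlucky interleaving and returns {dbValue, cacheValue}.
    static String[] replay() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();

        db.put("price", "100");    // request A: write DB
        db.put("price", "120");    // request B: write DB (newer value)
        cache.put("price", "120"); // request B: write cache
        cache.put("price", "100"); // request A: delayed cache write lands last

        return new String[] { db.get("price"), cache.get("price") };
    }

    public static void main(String[] args) {
        String[] result = replay();
        // The database holds the newer value, but the cache is left with the older one.
        System.out.println("db=" + result[0] + ", cache=" + result[1]);
    }
}
```

Driving cache writes from the binlog avoids this particular race because the cache follows the order in which the database committed the changes, not the order in which application threads happened to finish.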
3.2 Spring Boot Integration
Canal does not provide an official Spring Boot starter, so the client connects directly to the Canal server.
Maven dependency:
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.client</artifactId>
    <version>1.1.4</version>
</dependency>
Example Java client (simplified):
package com.shepherd.common.canal;

import com.alibaba.fastjson.JSONObject;
import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;
import com.google.protobuf.ByteString;
import com.google.protobuf.InvalidProtocolBufferException;

import java.net.InetSocketAddress;
import java.util.List;

public class CanalClient {

    public static void main(String[] args) throws InterruptedException, InvalidProtocolBufferException {
        // Create a single-connector client for the "mall" instance (empty username and password)
        CanalConnector canalConnector = CanalConnectors.newSingleConnector(
                new InetSocketAddress("10.10.0.10", 11111), "mall", "", "");
        canalConnector.connect();
        // Subscribe before fetching (no argument keeps the server-side table filter)
        canalConnector.subscribe();
        try {
            while (true) {
                // Pull up to 100 entries; get() acknowledges the batch automatically
                Message message = canalConnector.get(100);
                List<CanalEntry.Entry> entries = message.getEntries();
                if (entries.isEmpty()) {
                    // Nothing new yet: wait a second before polling again
                    Thread.sleep(1000);
                    continue;
                }
                for (CanalEntry.Entry entry : entries) {
                    // Only ROWDATA entries carry column values; skip transaction markers
                    if (!CanalEntry.EntryType.ROWDATA.equals(entry.getEntryType())) {
                        continue;
                    }
                    String tableName = entry.getHeader().getTableName();
                    ByteString storeValue = entry.getStoreValue();
                    CanalEntry.RowChange rowChange = CanalEntry.RowChange.parseFrom(storeValue);
                    CanalEntry.EventType eventType = rowChange.getEventType();
                    for (CanalEntry.RowData rowData : rowChange.getRowDatasList()) {
                        // Column values before the change (empty for INSERT)
                        JSONObject beforeData = new JSONObject();
                        for (CanalEntry.Column column : rowData.getBeforeColumnsList()) {
                            beforeData.put(column.getName(), column.getValue());
                        }
                        // Column values after the change (empty for DELETE)
                        JSONObject afterData = new JSONObject();
                        for (CanalEntry.Column column : rowData.getAfterColumnsList()) {
                            afterData.put(column.getName(), column.getValue());
                        }
                        System.out.println("Table:" + tableName + ",EventType:" + eventType
                                + ",Before:" + beforeData + ",After:" + afterData);
                    }
                }
            }
        } finally {
            // Release the connection if the loop exits with an exception
            canalConnector.disconnect();
        }
    }
}
The console output shows the table name, event type, and before/after column values for each INSERT, UPDATE, and DELETE event.
By listening to MySQL binlog changes, developers can update or invalidate Redis entries in real time, achieving an elegant cache‑DB consistency solution.
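What "update or invalidate" might look like is sketched below. To stay self-contained, the example uses a plain map in place of Redis and its own `ChangeType` enum in place of CanalEntry.EventType; the `table:id` key scheme and the assumption that every table has an `id` primary-key column are illustrative choices, not something Canal prescribes.

```java
import java.util.HashMap;
import java.util.Map;

final class CacheSyncSketch {

    // Stand-in for CanalEntry.EventType, reduced to the three row events of interest.
    enum ChangeType { INSERT, UPDATE, DELETE }

    // Stand-in for Redis; a real client would issue SET and DEL commands instead.
    private final Map<String, Map<String, String>> cache = new HashMap<>();

    // Builds a cache key such as "user:42" (the key scheme is an assumption).
    static String key(String table, String id) {
        return table + ":" + id;
    }

    // Applies one row change: upsert the "after" image on INSERT/UPDATE, evict on DELETE.
    void apply(String table, ChangeType type, Map<String, String> before, Map<String, String> after) {
        switch (type) {
            case INSERT:
            case UPDATE:
                cache.put(key(table, after.get("id")), new HashMap<>(after));
                break;
            case DELETE:
                cache.remove(key(table, before.get("id")));
                break;
        }
    }

    Map<String, String> get(String cacheKey) {
        return cache.get(cacheKey);
    }

    public static void main(String[] args) {
        CacheSyncSketch sync = new CacheSyncSketch();
        sync.apply("user", ChangeType.INSERT, Map.of(), Map.of("id", "42", "name", "Ann"));
        sync.apply("user", ChangeType.UPDATE, Map.of("id", "42", "name", "Ann"),
                Map.of("id", "42", "name", "Anna"));
        System.out.println(sync.get("user:42").get("name"));
        sync.apply("user", ChangeType.DELETE, Map.of("id", "42", "name", "Anna"), Map.of());
        System.out.println(sync.get("user:42"));
    }
}
```

In the client loop, the before and after column values collected per row would feed a handler like `apply`. Evicting the key on UPDATE instead of rewriting it is a reasonable alternative when cached values are derived from more than one table.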
4. Summary
Canal is an incremental data‑sync component that works by masquerading as a MySQL slave to collect binlog events, offering non‑intrusive data synchronization. It supports only MySQL; other databases like Oracle are not covered. For large‑scale heterogeneous sync (e.g., to Elasticsearch or HBase), batch processing and middleware such as Kafka are recommended, forming common architectures like canal + kafka + elasticsearch.
Shepherd Advanced Notes
Dedicated to sharing advanced Java technical insights, daily work snippets, and the power of persistent effort.
