Installing and Configuring Alibaba Canal for MySQL Binlog Capture
This guide explains how to download, install, and configure Alibaba Canal for reliable MySQL binlog capture in big‑data pipelines. It covers extracting the package, setting up canal.properties, instance.properties, and the instance.xml variants, and tuning key parameters.
Installation
Download the Canal release package from the official GitHub repository and extract it to a fixed directory.
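# Extract into /usr/local/canal, the install path used by the configuration files below
mkdir -p /usr/local/canal
tar -zxvf canal.kafka-1.1.0.tar.gz -C /usr/local/canal
Configuration Files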
canal.properties defines global settings such as instance list, configuration directory, auto‑scan options, and network parameters.
Parameter                  | Description                                          | Default
canal.destinations         | List of instance names deployed on the server        | None
canal.conf.dir             | Path to the conf/ directory                          | ../conf
canal.auto.scan            | Enable automatic scanning of instance configurations | true
canal.auto.scan.interval   | Scanning interval in seconds                         | 5
canal.instance.global.mode | Global configuration loading mode                    | spring
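The defaults listed for canal.properties can be applied when reading the file from a script. The following is a minimal Python sketch, not part of Canal itself: the names load_canal_properties and destinations are hypothetical, the instance name "orders" is made up for the example, and the parser handles only simple key=value lines, not the continuation lines or escapes that Java's properties format allows.

```python
# Hypothetical helper, not part of Canal: reads simple key=value lines
# and fills in the defaults listed for canal.properties.
CANAL_DEFAULTS = {
    "canal.destinations": "",          # no default is listed
    "canal.conf.dir": "../conf",
    "canal.auto.scan": "true",
    "canal.auto.scan.interval": "5",   # seconds
    "canal.instance.global.mode": "spring",
}


def load_canal_properties(text):
    """Parse properties text and return settings merged over the defaults."""
    settings = dict(CANAL_DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def destinations(settings):
    """Split the comma-separated canal.destinations value into instance names."""
    return [d.strip() for d in settings["canal.destinations"].split(",") if d.strip()]


sample = """
# two instances, scan every 10 seconds
canal.destinations = example, orders
canal.auto.scan.interval = 10
"""
cfg = load_canal_properties(sample)
print(destinations(cfg))
print(cfg["canal.auto.scan.interval"], cfg["canal.conf.dir"])
```

Values are kept as strings, as they appear in the file; a caller that needs the scan interval as a number would convert it explicitly.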
instance.properties contains instance‑specific settings such as MySQL connection details, charset, and filtering rules.
# Database address
canal.instance.master.address=192.168.56.104:3306
# Database user and password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
# Connection charset
canal.instance.connectionCharset=UTF-8
# Default database
canal.instance.defaultDatabaseName=zgai_db
Additional global settings are defined in /usr/local/canal/conf/canal.properties (e.g., canal.id, canal.ip, canal.zkServers), and the Kafka sink configuration lives in /usr/local/canal/conf/kafka.yml:
# Kafka servers
servers: 192.168.56.101:9092
# Batch size (KB)
canalBatchSize: 50
# Topic
topic: mytopic
Instance Management
After defining canal.destinations in canal.properties, create a directory with the same name under conf/ for each instance and place an instance.properties file in it. If canal.auto.scan is enabled, Canal automatically detects added, removed, or modified instance directories at the interval set by canal.auto.scan.interval.
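The directory layout described here can be verified before starting the server. This is a small Python sketch under stated assumptions: check_instance_dirs and create_instance_dir are hypothetical helper names, not Canal tooling, and the demonstration runs against a temporary directory rather than a real conf/ path.

```python
import os
import tempfile


def check_instance_dirs(conf_dir, destinations):
    """Return the destinations that lack conf_dir/<name>/instance.properties."""
    missing = []
    for name in destinations:
        path = os.path.join(conf_dir, name, "instance.properties")
        if not os.path.isfile(path):
            missing.append(name)
    return missing


def create_instance_dir(conf_dir, name, properties_text=""):
    """Create conf_dir/<name>/ and write an instance.properties file into it."""
    instance_dir = os.path.join(conf_dir, name)
    os.makedirs(instance_dir, exist_ok=True)
    with open(os.path.join(instance_dir, "instance.properties"), "w") as fh:
        fh.write(properties_text)


# Demonstration against a throwaway directory instead of a real conf/ path.
with tempfile.TemporaryDirectory() as conf_dir:
    create_instance_dir(conf_dir, "example")
    result = check_instance_dirs(conf_dir, ["example", "orders"])
    print(result)
```

An empty result means every name in canal.destinations has a matching directory and file.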
instance.xml Variants
Canal supports several instance.xml templates that determine how the parser, sink, and store components persist state:
spring/memory-instance.xml: all components in memory (fast, no HA).
spring/file-instance.xml: file‑based persistence (single node, no HA).
spring/default-instance.xml: ZooKeeper + memory (supports HA).
spring/group-instance.xml: logical grouping of multiple physical instances for sharded databases.
Key Parameter Highlights
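canal.instance.mysql.slaveId: unique ID that Canal presents when it connects as a MySQL slave; it must not collide with the server ID of the master or of any other replica.
canal.instance.master.address: MySQL master address in host:port form.
canal.instance.filter.regex: Perl‑style regex that selects the tables to capture, matched against schema.table names.
canal.instance.memory.buffer.size: number of records the in‑memory buffer can hold (must be a power of two).
canal.instance.detecting.enable: enables heartbeat checks against MySQL.
canal.instance.fallbackIntervalInSeconds: number of seconds to look back in the binlog when Canal switches to a different MySQL master.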
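Two of these parameters have constraints that are easy to check before deployment. The sketch below is illustrative Python, not Canal code: is_power_of_two and matches_filter are hypothetical names, Python's re module stands in for Canal's Perl‑style matcher (the two agree on simple patterns but are not identical), and the sketch assumes each comma‑separated pattern must match the whole schema.table string.

```python
import re


def is_power_of_two(n):
    """True when n is a positive power of two (1, 2, 4, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def matches_filter(filter_regex, schema, table):
    """Check 'schema.table' against a comma-separated list of patterns.

    Assumption: each pattern must match the whole 'schema.table' string.
    Splitting on commas is naive and would break quantifiers such as {1,3}.
    """
    full_name = f"{schema}.{table}"
    patterns = [p.strip() for p in filter_regex.split(",") if p.strip()]
    return any(re.fullmatch(p, full_name) for p in patterns)


# Candidate values for canal.instance.memory.buffer.size
print(is_power_of_two(16384), is_power_of_two(10000))

# Patterns here are plain regex; in a .properties file each backslash is doubled.
print(matches_filter(r"zgai_db\..*", "zgai_db", "orders"))
print(matches_filter(r"zgai_db\..*", "other_db", "orders"))
```

Testing a filter this way against a list of real table names is a cheap way to catch a pattern that silently captures nothing.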
Understanding the relationship between parse position (tracked by CanalLogPositionManager) and consume position (tracked by CanalMetaManager) is essential for reliable data synchronization.
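One way to picture the two positions is a toy model. The Python sketch below is not Canal's implementation: ToyInstance is a hypothetical class, positions are plain integers rather than binlog file and offset pairs, and the recovery policy (resume parsing from the last acknowledged position) is an assumption made for illustration.

```python
class ToyInstance:
    """Toy model only; real position managers persist binlog file/offset pairs."""

    def __init__(self):
        self.parse_position = 0    # last event read from the binlog (parser side)
        self.consume_position = 0  # last event the client acknowledged (meta side)

    def parse(self, count):
        """Parser reads `count` more events from the binlog into the store."""
        self.parse_position += count

    def ack(self, position):
        """Client acknowledges everything up to `position`."""
        if position > self.parse_position:
            raise ValueError("cannot acknowledge events that were never parsed")
        self.consume_position = max(self.consume_position, position)

    def restart(self):
        """Assumed recovery policy: resume parsing from the consume position.

        Returns the number of already-parsed events that will be parsed again.
        """
        replayed = self.parse_position - self.consume_position
        self.parse_position = self.consume_position
        return replayed


inst = ToyInstance()
inst.parse(100)   # parser is 100 events into the binlog
inst.ack(60)      # client has confirmed the first 60
replayed = inst.restart()
print(replayed)
```

Under that assumed policy, events between the consume position and the parse position are delivered again after a restart, so downstream consumers benefit from idempotent handling.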
Choose the appropriate instance.xml based on deployment needs: memory for quick testing, file for simple production without HA, default (Zookeeper) for high‑availability clusters, and group for multi‑database aggregation.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.