How Vivo Leverages Alibaba Canal for Zero‑Downtime Data Migration and HA

This article explains how Vivo uses Alibaba's open‑source Canal to capture MySQL binlog changes, run sharding and cross‑region data migrations without downtime, and keep the pipeline highly available through ZooKeeper. It also shares practical lessons on serialization, consistency, and monitoring in large‑scale backend systems.

vivo Internet Technology

Introduction

When a single table grows to hundreds of millions of rows, it has to be sharded and its data migrated while the service stays stable. Canal, an Alibaba open‑source project, captures MySQL binlog changes so that incremental data can be subscribed to and consumed during such migrations.

Canal Overview

Canal parses MySQL binary logs (binlog) by acting as a simulated MySQL slave. It sends a dump request to the master, receives the binlog stream, parses it into a protobuf‑defined structure and pushes the parsed events to downstream consumers.

MySQL Replication Basics

Canal depends on MySQL master‑slave replication. The master writes data changes to the binary log; the slave copies these events to its relay log and replays them locally.

Architecture

Key components:

Server : a JVM process that hosts one or more Canal instances.

Instance : a logical data queue.

Inside an instance:

EventParser : implements the MySQL slave protocol, receives and parses binlog events, records the current binlog position and forwards the data to EventSink.

EventSink : filters, aggregates, transforms and routes the parsed rows.

EventStore : stores parsed binlog objects (an in‑memory buffer in Canal's current implementation) and tracks the put, get and ack positions used for consumption acknowledgment.

MetaManager : maintains subscription metadata similar to a message‑queue broker.
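
The acknowledgment behaviour of the EventStore can be illustrated with a toy model. The sketch below is not Canal's code; it is a minimal Python stand‑in, assuming a bounded buffer with three cursors, and the class name `ToyEventStore` and its methods are hypothetical. It shows why an unacknowledged batch can be replayed after a rollback and why a full buffer applies back pressure to the parser.

```python
from typing import List


class ToyEventStore:
    """Bounded in-memory store with put/get/ack cursors (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._events: List[str] = []  # events from the ack cursor onwards
        self._put = 0  # number of events ever stored
        self._get = 0  # next sequence to hand to the client
        self._ack = 0  # everything below this sequence is confirmed

    def put(self, event: str) -> bool:
        # Refuse new events while unacknowledged ones fill the buffer.
        if self._put - self._ack >= self.capacity:
            return False
        self._events.append(event)
        self._put += 1
        return True

    def get(self, batch_size: int) -> List[str]:
        # Hand out events without removing them; they stay until acked.
        end = min(self._get + batch_size, self._put)
        start = self._get - self._ack
        batch = self._events[start:start + (end - self._get)]
        self._get = end
        return batch

    def ack(self) -> None:
        # Confirm everything handed out so far and free the slots.
        del self._events[: self._get - self._ack]
        self._ack = self._get

    def rollback(self) -> None:
        # Forget unconfirmed deliveries so the next get() replays them.
        self._get = self._ack
```

A consumer that fails while applying a batch would call `rollback()` and fetch the same events again, which is why downstream writes should be idempotent.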

Data Format

Canal wraps each binlog event into a protobuf message defined in EntryProtocol.proto. The main fields are:

Entry
  Header
    logfileName      // binlog file name
    logfileOffset    // position in the binlog
    executeTime      // event timestamp
    schemaName       // database name
    tableName        // table name
    eventType        // INSERT / UPDATE / DELETE
  entryType        // TRANSACTIONBEGIN / TRANSACTIONEND / ROWDATA
  storeValue       // serialized RowChange

RowChange
  isDdl            // true for DDL statements
  sql              // DDL SQL text
  rowDatas         // list of row changes
    beforeColumns  // Column[] before the change
    afterColumns   // Column[] after the change

Column
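  index            // column position
  sqlType          // JDBC type code
  name             // column name
  isKey            // true if part of the primary key
  updated          // true if the value changed
  isNull           // true if the value is NULL
  value            // value as text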
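
To make the nesting concrete, here is a minimal Python sketch that mirrors the outline above with plain dataclasses and walks one entry. It does not use Canal's client library or decode protobuf; the snake_case field names, the `row_change` attribute (standing in for an already decoded `storeValue`) and the helper `changed_values` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Column:
    name: str
    value: str
    is_key: bool = False
    updated: bool = False
    is_null: bool = False


@dataclass
class RowData:
    before_columns: List[Column] = field(default_factory=list)
    after_columns: List[Column] = field(default_factory=list)


@dataclass
class RowChange:
    is_ddl: bool = False
    sql: str = ""
    row_datas: List[RowData] = field(default_factory=list)


@dataclass
class Entry:
    schema_name: str
    table_name: str
    event_type: str        # e.g. "INSERT", "UPDATE", "DELETE"
    entry_type: str        # "ROWDATA" for row changes
    row_change: RowChange  # stands in for the decoded storeValue


def changed_values(entry: Entry) -> List[Dict[str, str]]:
    """Return one dict per row: changed columns for UPDATE, all columns otherwise."""
    if entry.entry_type != "ROWDATA" or entry.row_change.is_ddl:
        return []
    rows = []
    for row in entry.row_change.row_datas:
        # DELETE events only carry the image from before the change.
        cols = row.before_columns if entry.event_type == "DELETE" else row.after_columns
        rows.append({c.name: c.value for c in cols
                     if entry.event_type != "UPDATE" or c.updated})
    return rows
```

For an UPDATE that changes only a phone number, `changed_values` would return a single dict containing just the `phone` column, which is usually what a downstream sync job needs.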

High Availability

Canal uses ZooKeeper to keep exactly one active server and one active client per instance. Each candidate tries to create an EPHEMERAL node; the one that succeeds becomes active, and the others stand by and watch that node. When the active server disappears, its node is removed, a standby server takes over, and clients reconnect to the new active server.
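
The election pattern can be sketched without a real ZooKeeper cluster. The Python below is an in‑memory simulation only: `ToyCoordinator`, `ToyServer` and the node path are hypothetical names, and real deployments would rely on Canal's own ZooKeeper integration rather than code like this. It shows the two ingredients named above, an ephemeral node that vanishes with its owner and a watch that triggers a new attempt.

```python
from typing import Callable, Dict, List, Optional


class ToyCoordinator:
    """In-memory stand-in for the coordination service (not real ZooKeeper)."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}  # node path -> session that created it
        self._watchers: Dict[str, List[Callable[[], None]]] = {}

    def create_ephemeral(self, path: str, session: str) -> bool:
        # Succeeds only if nobody currently holds the node.
        if path in self._owner:
            return False
        self._owner[path] = session
        return True

    def watch(self, path: str, callback: Callable[[], None]) -> None:
        # One-shot watch, fired when the node disappears.
        self._watchers.setdefault(path, []).append(callback)

    def owner(self, path: str) -> Optional[str]:
        return self._owner.get(path)

    def close_session(self, session: str) -> None:
        # Ephemeral nodes vanish with their session; then watchers are notified.
        for path in [p for p, s in self._owner.items() if s == session]:
            del self._owner[path]
            for callback in self._watchers.pop(path, []):
                callback()


class ToyServer:
    def __init__(self, name: str, coordinator: ToyCoordinator,
                 path: str = "/toy/instance/running") -> None:
        self.name = name
        self.coordinator = coordinator
        self.path = path
        self.active = False

    def start(self) -> None:
        self._campaign()

    def _campaign(self) -> None:
        if self.coordinator.create_ephemeral(self.path, self.name):
            self.active = True   # won the election
        else:
            self.active = False  # stand by and retry when the node goes away
            self.coordinator.watch(self.path, self._campaign)

    def crash(self) -> None:
        self.active = False
        self.coordinator.close_session(self.name)
```

Starting two servers and crashing the first one leaves the second as the owner of the node, which mirrors the failover described above.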

Typical Use Cases

Zero‑downtime migration : Incrementally sync source and target databases during sharding or region migration.

Cache refresh : Trigger asynchronous cache updates when underlying tables change.

Task dispatch : Convert row changes into MQ/Kafka messages for downstream processing.

Data heterogeneity : Aggregate data from multiple sharded tables into a unified view for complex queries.
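
As one illustration of the cache refresh case, the sketch below turns a batch of row‑change events into the cache keys that should be evicted. It assumes events have already been reduced to small dicts with `table`, `pk` and `event_type`; that shape, the key template and the function name are hypothetical and not part of Canal.

```python
from typing import Dict, Iterable, List

ROW_EVENTS = ("INSERT", "UPDATE", "DELETE")


def cache_keys_to_evict(events: Iterable[Dict[str, str]],
                        key_template: str = "{table}:{pk}") -> List[str]:
    """Return the distinct cache keys touched by a batch, in first-seen order."""
    keys: List[str] = []
    seen = set()
    for event in events:
        # DDL and transaction markers do not map to a cached row.
        if event.get("event_type") not in ROW_EVENTS:
            continue
        key = key_template.format(table=event["table"], pk=event["pk"])
        if key not in seen:
            seen.add(key)
            keys.append(key)
    return keys
```

Evicting rather than rewriting the cached value keeps the consumer simple: the next read repopulates the entry from the database.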

Vivo Account Practical Cases

Case 1 – Sharding Migration

Problem: An account table exceeded 300 million rows, making full‑table migration costly and requiring downtime.

Solution: Use Canal for incremental change capture while moving writes through three phases: single‑write, dual‑write, and sharding.

Key steps:

Analyze pain points (large table, many unique user IDs, poor business partitioning).

Design a sharding scheme (e.g., hash‑based or range‑based on user ID).

Perform full‑data migration with traditional scheduled jobs, then enable Canal to sync incremental changes.

Transition from single‑write → dual‑write → sharding mode, monitoring for issues.
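
The routing and the write‑mode transition in the steps above can be sketched as follows. This is an assumption‑laden illustration, not Vivo's implementation: the hash function (CRC32), the shard count, the table names `account` and `account_N`, and the `WriteMode` enum are all hypothetical.

```python
import zlib
from enum import Enum
from typing import List


class WriteMode(Enum):
    SINGLE = "single"    # write only the legacy table
    DUAL = "dual"        # write the legacy table and the shard
    SHARDED = "sharded"  # write only the shard


def shard_for(user_id: str, shard_count: int) -> int:
    # CRC32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(user_id.encode("utf-8")) % shard_count


def write_targets(user_id: str, mode: WriteMode, shard_count: int = 4) -> List[str]:
    """Return the tables a write for this user should go to in the given mode."""
    legacy = "account"
    shard = f"account_{shard_for(user_id, shard_count)}"
    if mode is WriteMode.SINGLE:
        return [legacy]
    if mode is WriteMode.DUAL:
        return [legacy, shard]
    return [shard]
```

Keeping the mode behind a single function means the rollout can move forward, or back, by changing one configuration value while Canal keeps the two sides in sync.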

Result: After two weeks of dual‑write, the system switched to sharding with no major incidents; minor issues were resolved quickly.

Case 2 – Cross‑Region Migration

Problem: GDPR compliance required moving Australian user data from a Singapore data center to an EU data center without service interruption.

Solution: Deploy a standby replica in Singapore, use Canal to capture binlog, encrypt the changes, transmit them to the EU region, then switch DNS after verification.

Steps:

Build a standby MySQL instance in Singapore and enable binlog replication.

Deploy Canal server and client to consume the binlog.

Parse and encrypt each change before sending it to the EU GDPR‑compliant zone.

Store the data in the EU MySQL instance and verify stability.

Redirect traffic by updating DNS to point to the EU instance.

Stop Canal services and clean up the Singapore data.
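
The parse, encrypt and transmit step can be sketched as a small envelope format. The article does not say which cipher or wire format was used, so the sketch below takes the cipher as a parameter and only shows the framing; `pack_change`, `unpack_change` and the JSON plus Base64 encoding are assumptions. In practice the `encrypt` and `decrypt` callables would come from a vetted cryptography library.

```python
import base64
import json
from typing import Callable, Dict

Cipher = Callable[[bytes], bytes]


def pack_change(change: Dict, encrypt: Cipher) -> str:
    """Serialize one row change, encrypt it, and make it safe to send as text."""
    # Sorted keys and compact separators give a stable byte representation.
    plaintext = json.dumps(change, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(encrypt(plaintext)).decode("ascii")


def unpack_change(payload: str, decrypt: Cipher) -> Dict:
    """Reverse pack_change on the receiving side."""
    plaintext = decrypt(base64.b64decode(payload))
    return json.loads(plaintext.decode("utf-8"))
```

Because the cipher is injected, the same framing code can be exercised in tests with a trivial reversible function and run in production with a real one.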

Lessons Learned

Data serialization : Canal uses protobuf; null values are converted to empty strings, which can cause mismatches in ORM updates.

Data consistency : A single‑node Canal client may process out‑of‑order updates, leading to overwrites (e.g., phone‑number change race).

Master‑slave lag : High write rates increase replica lag; applying rate‑limiting based on business load mitigates the issue.

Monitoring : Simple in‑memory counters were added to detect anomalies, but coarse granularity missed some problems, highlighting the need for richer metrics.
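
The serialization pitfall can be avoided by consulting the `isNull` flag instead of the text value. The sketch below assumes each column arrives as a dict with the `name`, `value`, `isNull` and `updated` fields from the Column structure; the dict shape and the function name `to_update_params` are hypothetical.

```python
from typing import Dict, List, Optional


def to_update_params(columns: List[Dict]) -> Dict[str, Optional[str]]:
    """Build update parameters from changed columns, restoring real NULLs."""
    params: Dict[str, Optional[str]] = {}
    for col in columns:
        if not col.get("updated", False):
            continue  # unchanged columns are left out of the update
        # An empty string with isNull set means NULL, not "".
        params[col["name"]] = None if col.get("isNull", False) else col["value"]
    return params
```

With this mapping a column that was set to NULL at the source is written as NULL at the target, while a genuinely empty string stays an empty string.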

References

Official Canal repository: https://github.com/alibaba/canal

Related Otter project: https://github.com/alibaba/otter
