
How We Scaled a High‑Traffic Messaging Service by Migrating MySQL to PolarDB

This article details the migration of a popular social app's private‑message service from a saturated MySQL cluster to PolarDB: the business challenges, the evaluation of storage optimization, vertical scaling, and horizontal scaling, the distributed‑database solution ultimately chosen, step‑by‑step offline and online migration procedures, and the resulting performance and cost benefits.

Inke Technology

Introduction

In‑app private messaging is a critical bridge for user interaction on the popular social app Yingke. Rapid user growth exhausted the original MySQL database, prompting a migration to PolarDB to eliminate storage bottlenecks and improve scalability.

Business Background

The private‑message service runs on a read‑ and write‑heavy MySQL deployment sharded into N databases and N tables, fronted by a Redis cache layer; CPU utilization is low while storage is near its limit.

Current Situation

High read/write volume

Database sharded across many tables

Standard SQL without special features

Redis cache sits in front of reads

Storage utilization at 85% and growing daily
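The read path described above, Redis in front of MySQL, follows the cache‑aside pattern. A minimal Go sketch of that flow, with in‑memory maps standing in for Redis and the MySQL shard (names and keys are illustrative, not from the service):

```go
package main

import "fmt"

var (
	cache = map[string]string{}                 // stands in for Redis
	store = map[string]string{"msg:1": "hello"} // stands in for a MySQL shard
)

// getMessage reads through the cache: hit returns immediately, miss falls
// through to the database and backfills the cache for later reads.
func getMessage(key string) (string, bool) {
	if v, ok := cache[key]; ok { // cache hit
		return v, true
	}
	v, ok := store[key] // cache miss: read the database
	if ok {
		cache[key] = v // populate the cache
	}
	return v, ok
}

func main() {
	v, _ := getMessage("msg:1") // first read misses, backfills the cache
	fmt.Println(v)
}
```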

Key challenges include continuous data growth and the need for a fast, business‑transparent migration that supports the MySQL protocol, massive storage, and dynamic scaling.

Migration Options Explored

Storage optimization

Vertical scaling

Horizontal scaling

Distributed database

Storage Optimization

1. Archive several years of data – frees roughly 20% of space, but introduces Redis‑DB consistency issues and requires lazy‑load logic for archived messages.

2. Compress the content field – frees roughly 30% of space, but needs code changes and a migration of historical data.

Pros: No additional hardware cost. Cons: Requires program changes and scripts; long‑term storage limits remain.

Vertical Scaling

Increasing hardware capacity (e.g., larger disks) faces two problems: the current RDS instance already hits maximum disk size, and storage expansion often forces simultaneous compute scaling, raising costs without addressing the real bottleneck.

Pros: Transparent to the business, no code changes. Cons: Increases monthly cost and wastes compute resources.

Horizontal Scaling

Because the service already shards its databases, half of the tables can be split off to a new cluster, reducing per‑node storage. This requires migrating data via DTS plus program changes to route queries and clean up the tables that moved.

Pros: Can halve or further reduce storage usage. Cons: Requires data‑cleaning scripts, increasing development effort.
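The split above can be expressed as a routing rule: since tables are already numbered, even‑numbered tables stay on the old cluster and odd‑numbered ones move to the new one. A minimal sketch (cluster names and the table count are illustrative):

```go
package main

import "fmt"

const numTables = 64 // assumed shard count, not from the article

// clusterFor routes a table index to a cluster after the horizontal split:
// even tables stay put, odd tables move to the new cluster.
func clusterFor(tableIdx int) string {
	if tableIdx%2 == 0 {
		return "mysql-old"
	}
	return "mysql-new"
}

func main() {
	fmt.Println(clusterFor(12), clusterFor(13))
}
```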

Distributed Database

After evaluating several products, PolarDB for MySQL was selected. Its compute‑storage separation, high availability, and horizontal scalability match our needs. Unlike traditional RDS MySQL, PolarDB stores a single data copy shared by all compute nodes, eliminating extra storage cost for replicas.


Migration Implementation Strategy

Overview

Both offline (stop‑service) and online migration strategies were considered; the online approach was chosen.

We create a new PolarDB instance, enable DTS data sync, let DTS catch up, then switch traffic during a low‑traffic window to minimize impact.

Offline Migration Steps

Create PolarDB for MySQL instance

Enable DTS data synchronization

When sync catches up, stop service pods and wait for full consistency

Update application to point to PolarDB

Redeploy

Migration complete

Online Migration Steps

Preparation:

DBA creates PolarDB instance

DBA enables DTS sync

Developers add dual‑write connection info (MySQL + PolarDB)

Implement write‑pause using Redis switch

Use Go channels to buffer Add operations and Sleep to block Update/Delete during pause

Define migration switch states (1‑read/write MySQL, 2‑pause writes, 3‑dual‑write, 4‑read/write PolarDB)

During Migration:

Set switch to state 2 during low‑traffic period to stop writes

DBA monitors DTS until MySQL data fully syncs to PolarDB (≈1‑2 min)

Set switch to state 3 to start dual‑write

Both DBA and developers verify row counts, error logs, and private‑message functionality

Post‑Migration:

If errors appear, revert to state 1 (MySQL)

After 1‑2 days of stable operation, switch to state 4 (full PolarDB), remove MySQL connection, delete switch logic, and redeploy

DBA monitors MySQL traffic; if none, decommission the instance

Migration complete

Key Mechanism

A Redis flag allows the program to toggle between MySQL and PolarDB and to pause writes. During the pause, Add operations are buffered in a Go channel, and Update/Delete are blocked using Sleep, ensuring data consistency while the switch occurs.

Capacity calculations for the buffer (e.g., 500 Add QPM across 10 pods → 50‑element channel) show negligible memory impact.
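A sketch of that pause mechanism: while the flag says "paused", Add operations are parked in a buffered channel sized from the QPM estimate (~50 slots per pod), and Update/Delete callers sleep and retry; once the switch flips, the buffer is drained. A plain variable stands in for the Redis flag, and all names are assumptions:

```go
package main

import (
	"fmt"
	"time"
)

var (
	paused  = true                  // stands in for the Redis switch flag
	pending = make(chan string, 50) // ~500 Add QPM / 10 pods → 50 slots
)

// addMessage parks the write in the channel while paused; otherwise it
// writes through (printing stands in for the database call).
func addMessage(msg string) {
	if paused {
		pending <- msg
		return
	}
	fmt.Println("write:", msg)
}

// updateMessage blocks with a short sleep until the pause ends, as the
// article describes for Update/Delete.
func updateMessage(msg string) {
	for paused {
		time.Sleep(10 * time.Millisecond)
	}
	fmt.Println("update:", msg)
}

// drain flushes buffered Adds once the switch advances past state 2.
func drain() {
	for {
		select {
		case msg := <-pending:
			fmt.Println("write:", msg)
		default:
			return
		}
	}
}

func main() {
	addMessage("hi") // buffered while paused
	paused = false   // switch advances (state 3 in the article)
	drain()
	updateMessage("edit")
}
```

In production the flag read and the channel would need to be concurrency‑safe (e.g. an atomic or mutex around the flag), and buffered writes should be persisted so a pod crash cannot lose them, which is the precaution the article raises below.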

Post‑Migration Metrics

Post‑migration monitoring shows the expected PolarDB metrics: cost is down roughly 18% versus the previous MySQL setup, P99 latency sits around 40 ms (meeting business expectations), and compute‑storage separation now supports up to 100 TB of storage.

Precautions

Ensure the write‑pause mechanism cannot lose data if a pod panics; test it thoroughly and persist logs so buffered writes can be replayed.

Online migration requires complete knowledge of all database operations to avoid incomplete data transfer.

Results

Minimal business impact compared with offline migration.

Cost reduction of 18% after migration.

P99 response time ~40 ms.

Storage and compute are decoupled, supporting up to 100 TB.

Tags: scalability, Distributed Database, MySQL, database migration, PolarDB, Online Migration
Written by Inke Technology, the official account of Inke Technology.