Databases 14 min read

How to Split a Percona XtraDB Cluster into Multiple Instances on a Single Node

This article details a real‑world engineering case where a Percona XtraDB Cluster (PXC) was re‑architected to run separate OP and CLM MySQL instances on the same physical server, covering background, problem analysis, design choices, port configuration, directory restructuring, configuration files, step‑by‑step commands, automation script updates, and post‑change verification.

dbaplus Community
dbaplus Community
dbaplus Community
How to Split a Percona XtraDB Cluster into Multiple Instances on a Single Node

Case Background

Over the past two years the author, a DBA at China Mobile, has been responsible for the "Mobile Cloud" database operations. When a new resource pool is added, a standardization transformation is performed to ensure consistent maintenance. The following case study describes a PXC cluster single‑node multi‑instance split.

Problem Summary

Normally each business database is deployed on a separate MySQL instance, but due to resource constraints and rapid rollout, two subsystems (OP and CLM) were deployed on the same physical machine and even the same MySQL instance, causing high coupling and difficult fault isolation. An incident where the CLM system locked the database instance prevented the OP portal from loading, highlighting the risk.

PXC Overview

Percona XtraDB Cluster (PXC) is a MySQL‑compatible cluster based on Galera, providing real‑time synchronization, write‑set broadcasting, and transaction conflict rollback, ensuring strong data consistency.

System Names

OP : Mobile Cloud portal system for user self‑service product ordering.

CLM : Resource pool management platform for performance data collection, monitoring, and alerting.

PXC Cluster Single‑Node Multi‑Instance Split

Transformation Plan

Two options were considered:

Deploy two virtual machines on the physical host, each running a separate PXC cluster – longer downtime.

Split instances on the same host, re‑plan ports, directories, PID and socket paths – short downtime (≤15 minutes) with minimal impact on OP.

The second option was chosen.

Current Architecture

The OP and CLM databases originally shared a single instance, as shown in the diagram below.

The PXC cluster runs in multi‑master mode, providing read/write on every node. Applications connect via HAProxy VIPs. Service ports: 3306 (MySQL), 4567 (cluster communication), 4444 (SST), 4568 (IST).

Splitting Principles

Vertical split: OP stays on the original instance; CLM moves to a new instance.

Use "expand then split" to keep OP online while CLM is being migrated.

Port Design

OP retains its original ports; CLM uses new ports to differentiate the two instances on the same host.

External MySQL ports: OP 3306, CLM 3307

Cluster communication ports: OP 4567, CLM 4667

SST ports: OP 4444, CLM 4445

IST ports: OP 4568, CLM 4668

HAProxy configuration for CLM is updated to use port 3307.

Directory Split

OP keeps its standard directories. CLM receives new data, log, program directories, and new PID/socket paths, as illustrated:

Configuration Files

OP continues using its original my.cnf. A new my_clm.cnf is created for CLM, adjusting ports, directories, logs, PID and socket paths.

Because two configuration files exist, CLM must be started with its specific file, e.g.:

mysql_safe --defaults-file=/apps/conf/bcrdb/my_clm.cnf --wsrep-new-cluster &

Split Steps

Step 1: On node RW3, create CLM directories, symlinks, and my_clm.cnf; re‑plan ports, PID, and socket paths; then stop the MySQL process.

Step 2: Start the new CLM instance with the original my.cnf to join the existing cluster and perform SST data sync.

Step 3: After CLM syncs, shut down the CLM instance (

mysqladmin --defaults-file=/apps/conf/bcrdb/my.cnf -uroot -p shutdown

) and restart the OP node to restore its cluster state.

Step 4: Update HAProxy to expose CLM on port 3307 and start the CLM cluster with my_clm.cnf:

mysql_safe --defaults-file=/apps/conf/bcrdb/my_clm.cnf --wsrep-new-cluster &

Step 5: Repeat steps 1‑4 for the remaining CLM nodes (RW1, RW2), adjusting my_clm.cnf as needed.

Step 6: Modify automation scripts to use the new configuration and paths. Example backup script excerpt:

full_backup() {
  echo "$(date +%F\ %T)  start full backup..."
  innobackupex --defaults-file=/apps/svr/bcrdb/conf/my_clm.cnf \
    --slave-info --galera-info --user=$bakuser --password=$bakpwd \
    --no-timestamp --kill-long-query-type=select \
    --kill-long-queries-timeout=20 \
    --rsync --history=${bakname} ${bakdir}/${bakname}_full
  [ $? -ne 0 ] && echo "$(date +%F\ %T)  backup failed!" && exit 1
  echo "$(date +%F\ %T)  backup complete!"
}

incr_backup() {
  lastname=`$mysqlcmd "select name from xtrabackup_history where is_success='Y' and name like 'BCRDB_BC_CLM%' order by end_time desc limit 1"`
  if [ "$lastname" != "NULL" ] && [ -n "$lastname" ]; then
    echo "$(date +%F\ %T)  start incremental backup since $lastname..."
    innobackupex --defaults-file=/apps/svr/bcrdb/conf/my_clm.cnf \
      --slave-info --galera-info --user=$bakuser --password=$bakpwd \
      --no-timestamp --kill-long-query-type=select \
      --kill-long-queries-timeout=20 --rsync --history=${bakname} \
      --incremental --incremental-history-name=$lastname ${bakdir}/${bakname}_incr
    [ $? -ne 0 ] && echo "$(date +%F\ %T)  backup failed!" && exit 1
    echo "$(date +%F\ %T)  backup complete!"
  else
    echo "$(date +%F\ %T)  there is no backup history, can't start incremental backup, start full backup instead..."
    full_backup
  fi
}

Step 7: Create crontab entries for backup and Zabbix monitoring.

Backup:

30 2 * * * /apps/sh/backup_clm.sh -d 60 -n bcclm -t auto>/apps/sharedstorage/logs/bcrdb_clm/backuplog/backup_$(date '+%Y%m%d_%H%M%S').log 2>&1

Zabbix:

*/1 * * * * /usr/zagt/zabbix_scripts/zbx_mysql_status.sh > /dev/null

Change Review

The split was completed successfully; OP services were unaffected, while CLM lost a small amount of performance data during the brief outage. Port changes caused no network issues.

Conclusion

Thorough pre‑change planning, detailed command collections, and rollback strategies are essential when performing engineering changes on production databases, especially when business services run concurrently.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

AutomationmysqlDatabase operationsHAProxyPercona XtraDB ClusterPXCInstance Splitting
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.