
Mastering Horizontal Scaling with TDSQL: Design, Practices, and Performance

This article explains the motivations, challenges, and design principles of horizontal scaling for databases, using TDSQL as a case study to illustrate architecture, scaling procedures, shard key selection, high‑availability mechanisms, distributed transactions, and performance optimization techniques.

ITFLY8 Architecture Home

Background and Challenges of Database Horizontal Scaling

Horizontal scaling becomes necessary when business traffic or data volume outgrows a single node: TPS or QPS falls short, latency rises, or resources such as disk space and network bandwidth run out. Vertical scaling upgrades a single machine, whereas horizontal scaling adds more machines to meet demand.

Horizontal Scaling vs Vertical Scaling

Vertical scaling adds CPU, memory, or storage to a single instance; in MySQL this often requires a master‑slave switchover. It is bounded by the resources of one machine, which eventually cannot keep up with rapid growth.

Horizontal scaling removes this ceiling: in theory, capacity can grow without limit by adding nodes. The cost is added complexity in data partitioning, hotspot handling, data migration, routing changes, rollback, consistency, and keeping performance growth linear.

TDSQL Horizontal Scaling Practice

TDSQL Architecture

TDSQL consists of three parts: a SQL engine layer that hides storage details from the application; a data storage layer made up of multiple SETs, each of which can be one primary with several replicas; and a Scheduler module that monitors and controls the cluster, handling scaling and failover without business impact.

TDSQL Horizontal Scaling Process

Data is initially split into 256 shards, all placed on a single node. Scaling moves some of these shards to additional nodes, so the number of SETs can grow from 1 to at most 256 while the shard count stays constant. The UI supports one‑click addition of SETs and routing adjustments.
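The fixed-shard idea can be sketched as a mapping from shard number to SET. This is a minimal illustration, not TDSQL's internal data structure: the names `initial_layout` and `split_set`, and the choice to move the upper half of a SET's shards, are assumptions made for the example.

```python
NUM_SHARDS = 256  # fixed shard count described in the article


def initial_layout():
    """All 256 shards start on a single SET (SET names are hypothetical)."""
    return {shard: "set_1" for shard in range(NUM_SHARDS)}


def split_set(layout, source, target):
    """Move the upper half of the shards owned by `source` to `target`.

    Returns a new layout and the list of moved shards. The shard count
    never changes; only the shard-to-SET ownership does.
    """
    owned = sorted(s for s, owner in layout.items() if owner == source)
    moved = owned[len(owned) // 2:]
    new_layout = dict(layout)
    for shard in moved:
        new_layout[shard] = target
    return new_layout, moved


layout = initial_layout()
layout, moved = split_set(layout, "set_1", "set_2")
```

Because rows are assigned to shards rather than directly to nodes, scaling only changes which SET owns a shard; no row ever needs to be re-hashed.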

Design Principles Behind TDSQL Horizontal Scaling

Shard Key Selection for Compatibility and Performance

When creating tables, users specify a shard key field. This enables balanced data distribution and keeps related data on the same node, reducing cross‑node traffic. If no shard key is provided, TDSQL chooses one randomly, which may degrade performance.
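To see why the shard key matters for routing, here is a small sketch. It assumes a CRC32 hash modulo 256 and a two-SET layout; TDSQL's actual hash function and routing tables are not described in this article, and `shard_of` and `route` are hypothetical helpers.

```python
import zlib

NUM_SHARDS = 256


def shard_of(shard_key_value):
    """Map a shard key value to a shard number (hash choice is an assumption)."""
    data = str(shard_key_value).encode("utf-8")
    return zlib.crc32(data) % NUM_SHARDS


def route(query_filter, shard_key, shard_to_set):
    """Return the set of SETs a query has to visit.

    If the filter pins the shard key, exactly one SET is involved.
    Otherwise the query must be sent to every SET.
    """
    if shard_key in query_filter:
        return {shard_to_set[shard_of(query_filter[shard_key])]}
    return set(shard_to_set.values())


# Hypothetical layout: shards 0-127 on set_1, shards 128-255 on set_2.
shard_to_set = {s: ("set_1" if s < 128 else "set_2") for s in range(NUM_SHARDS)}

single = route({"user_id": 42}, "user_id", shard_to_set)
fan_out = route({"status": "paid"}, "user_id", shard_to_set)
```

A query that filters on the shard key touches one SET, while a query without it fans out to all of them, which is the cross-node traffic a good shard key avoids.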

High Availability and Reliability During Scaling

Data synchronization – a new instance is created and synchronized in real time without affecting business.

Data verification – continuous sync and verification until lag is within a small threshold (e.g., 5 seconds).

Routing update – writes are briefly frozen (seconds) while the new node catches up, then routing is switched atomically.

Redundant data deletion – delayed deletion avoids I/O spikes and ensures consistency.
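The four steps above can be walked through with a toy in-memory model. This is a sketch under simplifying assumptions, not TDSQL's implementation: nodes are plain dictionaries, lag is counted as the number of unreplayed changes rather than seconds, and the class and method names are hypothetical.

```python
class Migration:
    """Toy model of moving a group of keys from an old node to a new one."""

    def __init__(self, source, moving_keys):
        self.source = source            # old node: key -> value
        self.target = {}                # new node, initially empty
        self.moving = set(moving_keys)  # keys that belong to the moved shards
        self.pending = []               # changes not yet replayed on target
        self.frozen = False
        self.route = "source"

    def write(self, key, value):
        if self.frozen:
            raise RuntimeError("writes are frozen during the routing switch")
        if key in self.moving and self.route == "target":
            self.target[key] = value
            return
        self.source[key] = value
        if key in self.moving:
            self.pending.append((key, value))  # to be synchronized later

    def initial_copy(self):
        """Step 1: synchronize existing data to the new instance."""
        for key in self.moving:
            if key in self.source:
                self.target[key] = self.source[key]

    def replay(self):
        while self.pending:
            key, value = self.pending.pop(0)
            self.target[key] = value

    def verified(self):
        """Step 2: check that both copies agree on every moved key."""
        return all(self.target.get(k) == self.source.get(k) for k in self.moving)

    def switch(self, lag_threshold=5):
        """Step 3: freeze writes, catch up, verify, then switch routing."""
        if len(self.pending) > lag_threshold:
            return False  # lag too large, keep synchronizing
        self.frozen = True
        try:
            self.replay()
            if not self.verified():
                return False
            self.route = "target"
            return True
        finally:
            self.frozen = False

    def delete_redundant(self):
        """Step 4: remove the old copies, only after routing has switched."""
        if self.route != "target":
            raise RuntimeError("cannot delete before routing has switched")
        for key in self.moving:
            self.source.pop(key, None)


m = Migration({"a": 1, "b": 2, "c": 3}, moving_keys=["b", "c"])
m.initial_copy()
m.write("b", 20)        # lands on the old node and is queued for replay
switched = m.switch()   # brief freeze, catch-up, routing switch
m.write("c", 30)        # now lands on the new node
m.delete_redundant()
```

Keeping the old copy until after the switch is what makes a rollback possible if verification fails.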

Distributed Transactions

After scaling, data spans multiple nodes. TDSQL uses two‑phase commit to guarantee atomicity across nodes, making distributed transactions transparent to applications. The system is decentralized, allowing linear performance growth.
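The two‑phase commit idea can be illustrated with a minimal in-memory sketch. It assumes each participant holds a single balance and omits everything a production system needs, such as durable logs, timeouts, and coordinator recovery. The `Participant` class and `two_phase_commit` function are hypothetical and do not reflect TDSQL's interfaces.

```python
class Participant:
    """One node taking part in a distributed transaction (toy model)."""

    def __init__(self, name, balance):
        self.name = name
        self.data = {"balance": balance}
        self.staged = None  # change prepared but not yet committed

    def prepare(self, delta):
        """Phase 1: vote yes only if the change can be applied."""
        new_balance = self.data["balance"] + delta
        if new_balance < 0:
            return False
        self.staged = new_balance
        return True

    def commit(self):
        """Phase 2: make the staged change visible."""
        if self.staged is not None:
            self.data["balance"] = self.staged
            self.staged = None

    def rollback(self):
        self.staged = None


def two_phase_commit(changes):
    """Apply all (participant, delta) changes or none of them."""
    prepared = []
    for participant, delta in changes:
        if participant.prepare(delta):
            prepared.append(participant)
        else:
            for p in prepared:  # any "no" vote aborts everyone
                p.rollback()
            return False
    for participant, _ in changes:
        participant.commit()
    return True


a = Participant("set_1", 100)
b = Participant("set_2", 0)
ok = two_phase_commit([(a, -30), (b, 30)])       # both nodes can apply it
failed = two_phase_commit([(b, 100), (a, -100)])  # a cannot cover the debit
```

In the second call, `b` has already prepared when `a` votes no, so `b` is rolled back and neither balance changes.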

Achieving Linear Performance Growth

Key techniques include keeping related data on the same node through shard key design, executing queries in parallel across nodes and aggregating the results as streams, pushing predicates down to the storage nodes to reduce data transfer, and storing redundant copies of data to minimize cross‑node access.
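Parallel execution, predicate pushdown, and aggregation of partial results can be sketched together. The example assumes each node is a list of rows and uses a thread pool as a stand-in for real network calls; `node_partial_sum` and `distributed_avg` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def node_partial_sum(rows, min_amount):
    """Runs on each node: filter locally, then return a partial (sum, count).

    Applying the filter here is the pushed-down predicate. Only two
    numbers leave the node instead of every matching row.
    """
    matching = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return sum(matching), len(matching)


def distributed_avg(nodes, min_amount):
    """Query all nodes in parallel and merge their partial aggregates."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(
            pool.map(lambda rows: node_partial_sum(rows, min_amount), nodes)
        )
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


nodes = [
    [{"amount": 10}, {"amount": 50}],  # rows stored on a first node
    [{"amount": 70}, {"amount": 5}],   # rows stored on a second node
]
avg_large = distributed_avg(nodes, min_amount=50)
```

An average cannot be merged directly from per-node averages, which is why each node returns a sum and a count.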

Practical Cases and Guidelines

Choosing a Shard Key

Typical choices are user ID for internet services, player ID for games, buyer/seller ID for e‑commerce, or device ID for IoT. A good shard key ensures balanced distribution and enables queries that include the key to be routed to a single node.
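One way to compare candidate shard keys is to measure how much of the data the fullest shard would hold. The sketch below uses synthetic rows and an assumed CRC32 hash over 256 shards; the `country` column and the 90/10 split are invented purely to show a low-cardinality key.

```python
import zlib
from collections import Counter

NUM_SHARDS = 256


def shard_of(value):
    """Assumed hash: CRC32 of the value's string form, modulo 256."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_SHARDS


def largest_shard_share(rows, key):
    """Fraction of all rows that land on the single fullest shard."""
    counts = Counter(shard_of(row[key]) for row in rows)
    return max(counts.values()) / len(rows)


# Synthetic data: 10,000 users, 90% of them in one country.
rows = [
    {"user_id": i, "country": "CN" if i % 10 else "US"}
    for i in range(10_000)
]

share_by_user = largest_shard_share(rows, "user_id")
share_by_country = largest_shard_share(rows, "country")
```

With only two distinct values, `country` puts at least 90% of the rows on one shard, while `user_id` spreads them across many shards.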

When to Scale

Scaling decisions are driven by monitoring metrics such as disk usage, CPU, or QPS. Thresholds (e.g., 80 % CPU) trigger scaling, and anticipated traffic spikes (e.g., upcoming promotions) can prompt proactive scaling.
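A threshold check of this kind might look like the following sketch. Only the 80 % CPU figure comes from the text above; the disk and QPS limits, the `Metrics` fields, and the traffic multiplier for planned events are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu_pct: float
    disk_pct: float
    qps: float


# Only the CPU threshold comes from the article; the others are assumed.
THRESHOLDS = {"cpu_pct": 80.0, "disk_pct": 80.0, "qps": 50_000.0}


def scaling_reasons(metrics, expected_traffic_multiplier=1.0):
    """Return the list of reasons to scale out; empty means no action."""
    reasons = []
    for name, limit in THRESHOLDS.items():
        if getattr(metrics, name) >= limit:
            reasons.append(name)
    # Proactive scaling: project QPS for an anticipated spike.
    projected = metrics.qps * expected_traffic_multiplier
    if projected >= THRESHOLDS["qps"] and "qps" not in reasons:
        reasons.append("projected_qps")
    return reasons


now = scaling_reasons(Metrics(cpu_pct=85, disk_pct=40, qps=10_000))
before_promo = scaling_reasons(
    Metrics(cpu_pct=50, disk_pct=40, qps=20_000),
    expected_traffic_multiplier=3.0,
)
```

The first call reports a reactive trigger on CPU, and the second reports a proactive one because tripled traffic would exceed the assumed QPS limit.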
