
Mastering Multi‑Tenant Load Balancing in Alibaba Cloud Table Store

This article explains the architecture, data model, and multi‑tenant load‑balancing strategies of Alibaba Cloud Table Store, detailing the challenges of distributed NoSQL systems and presenting practical solutions for resource quantification, fairness, trigger timing, and SLA‑driven automation.

dbaplus Community

Table Store Overview

Table Store is a distributed NoSQL storage service built on Alibaba Cloud’s Feitian platform. It provides single‑table read/write throughput at the ten‑million‑operations‑per‑second level and stores data at the tens‑of‑petabytes scale while offering 99.99% availability and eleven nines (99.999999999%) of durability. The service runs on a large shared resource pool and uses a load balancer to allocate resources among tenants, reducing cost and improving overall efficiency.

Motivation for NoSQL and Load Balancing

Massive and unpredictable data volume and traffic make traditional monolithic databases hard to scale.

Records have few relationships with one another, which favors a partitioned, range‑based model.

Frequent schema changes require flexible storage without costly re‑sharding.

These constraints drive the adoption of a distributed NoSQL system such as Table Store, where load balancing is essential to keep resource utilization even across a dynamic cluster.

Architecture

The system consists of shared infrastructure components and a Table Store engine layer.

Shared services: a distributed lock service (Nuwa), logging modules, security, and data‑center management, all of which are used by every role.

Distributed file system: Pangu, Alibaba’s HDFS‑like storage, provides a common data store accessible by any worker.

Engine layer: a master service, a load balancer, and multiple worker processes.

The master stores table metadata (name, partition count, compression, cache policies) and assigns each logical partition to a worker. Workers load the assigned partitions and serve read/write requests. The load‑balancer consumes statistics from the master and workers to make scheduling decisions.

Data Model and Partitioning

Table Store uses a range‑based partitioning scheme. Tables are ordered by primary key (PK) and split into contiguous partitions. Each partition consumes its own share of CPU, memory, network bandwidth, and disk IOPS, and is treated as a logical tenant. For example, a PK range AA‑FF could be divided into [AA, CB) and [CB, FF), each acting like an independent database instance.
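The range lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `RangePartitionedTable` and its methods are hypothetical, and string comparison stands in for whatever PK ordering Table Store actually uses.

```python
from bisect import bisect_right


class RangePartitionedTable:
    """Hypothetical model: maps a PK to the half-open range [start, end) that holds it."""

    def __init__(self, lower, upper, split_points):
        # Boundaries are kept sorted: lower < split_1 < ... < upper.
        self.bounds = [lower, *sorted(split_points), upper]

    def partition_for(self, pk):
        if not (self.bounds[0] <= pk < self.bounds[-1]):
            raise KeyError(f"{pk!r} is outside the table's key range")
        # bisect_right returns the index of the first boundary greater than pk,
        # so the partition starts at the boundary just before it.
        idx = bisect_right(self.bounds, pk) - 1
        return self.bounds[idx], self.bounds[idx + 1]


table = RangePartitionedTable("AA", "FF", ["CB"])
print(table.partition_for("BZ"))  # ('AA', 'CB')
print(table.partition_for("CB"))  # ('CB', 'FF')
```

Because the ranges are half-open, the split key CB itself belongs to the second partition, which keeps every key in exactly one partition.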

Multi‑Tenant Load Balancing

Resource Quantification

During request execution the system records three key metrics:

Data size – estimates inbound/outbound network bandwidth.

Latency – approximates CPU consumption once I/O latency has been subtracted from the total request latency.

Memory usage – measured directly.

These values are collected by a view‑statistics module and aggregated per tenant (i.e., per partition) in a circular queue.
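A minimal sketch of such per-partition accounting, assuming a fixed-size window per partition, might look like the following. The names `PartitionStats`, `RequestSample`, and the window size are hypothetical; `deque(maxlen=...)` is used here as a simple stand-in for the circular queue.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class RequestSample:
    bytes_transferred: int  # proxy for network bandwidth
    cpu_ms: float           # total latency minus I/O latency, proxy for CPU
    memory_bytes: int       # measured directly


class PartitionStats:
    def __init__(self, window=1000):
        # A bounded deque drops its oldest sample when full, like a circular queue.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, partition_id, size_bytes, total_latency_ms, io_latency_ms, memory_bytes):
        cpu_ms = max(total_latency_ms - io_latency_ms, 0.0)
        self.samples[partition_id].append(RequestSample(size_bytes, cpu_ms, memory_bytes))

    def totals(self, partition_id):
        window = self.samples[partition_id]
        return {
            "network_bytes": sum(s.bytes_transferred for s in window),
            "cpu_ms": sum(s.cpu_ms for s in window),
            "memory_bytes": max((s.memory_bytes for s in window), default=0),
        }


stats = PartitionStats(window=2)
stats.record("p1", 100, 10.0, 4.0, 1024)
stats.record("p1", 200, 8.0, 3.0, 2048)
stats.record("p1", 300, 5.0, 1.0, 512)  # evicts the first sample
print(stats.totals("p1"))  # {'network_bytes': 500, 'cpu_ms': 9.0, 'memory_bytes': 2048}
```

Summing bytes and CPU time while taking the peak of memory is a choice made for this sketch; the article does not say how the real module aggregates each metric.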

Water‑Level Monitoring and Fair Flow Control

Each worker runs a resource‑monitor thread that reads OS‑level counters (e.g., NIC transmit rate). When a resource exceeds a configurable threshold (commonly 95% of NIC capacity), the thread triggers flow control. The system then rejects a proportion of incoming requests based on each tenant’s contribution to the overloaded resource, using a token‑bucket style algorithm. This protects small tenants from being starved by large ones.
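One way to picture contribution-based rejection is the sketch below. It is an assumption-laden illustration, not the token-bucket mechanism the article refers to: it uses plain probabilistic shedding, where each tenant's rejection probability is proportional to its share of the saturated resource and scaled so that the expected shed load equals the excess over the threshold. The function names and the 95% default are hypothetical parameters.

```python
import random


def rejection_probabilities(usage_by_tenant, capacity, threshold=0.95):
    """Return a per-tenant rejection probability, weighted by resource usage."""
    total = sum(usage_by_tenant.values())
    excess = total - threshold * capacity
    if excess <= 0:
        # Below the water level: nobody is throttled.
        return {tenant: 0.0 for tenant in usage_by_tenant}
    # p_i = excess * u_i / sum(u_j^2) makes the expected shed load,
    # sum(p_i * u_i), equal to the excess (before clamping at 1.0).
    sum_sq = sum(u * u for u in usage_by_tenant.values())
    return {t: min(1.0, excess * u / sum_sq) for t, u in usage_by_tenant.items()}


def admit(tenant, probabilities, rng=random):
    """Admit a request unless a random draw falls inside the tenant's rejection band."""
    return rng.random() >= probabilities.get(tenant, 0.0)


usage = {"A": 900, "B": 100, "C": 50}  # e.g. MB/s on a NIC with capacity 1000
probs = rejection_probabilities(usage, capacity=1000)
print({t: round(p, 3) for t, p in probs.items()})  # {'A': 0.109, 'B': 0.012, 'C': 0.006}
```

In this example the heavy tenant A absorbs almost all of the throttling, while B and C lose roughly one request in a hundred or fewer.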

Trigger Conditions

A worker enters flow‑control state (resource water‑level breach).

A tenant’s SLA metric falls below a predefined threshold.

Resource utilization becomes significantly imbalanced across the cluster.

When any trigger fires, the load‑balancer decides among three actions: split a hot partition, migrate partitions to less‑loaded workers, or isolate a problematic tenant on a dedicated machine.
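The three trigger conditions could be evaluated along the following lines. Everything here is a hypothetical sketch: the `ClusterSnapshot` fields, the 0-to-1 SLA score, the thresholds, and the use of the coefficient of variation as an imbalance measure are assumptions made for illustration, since the article does not specify how each condition is computed.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean, pstdev


class Trigger(Enum):
    FLOW_CONTROL = "worker entered flow control"
    SLA_BREACH = "tenant SLA metric below threshold"
    IMBALANCE = "cluster utilization imbalance"


@dataclass
class ClusterSnapshot:
    workers_in_flow_control: set
    sla_score_by_tenant: dict    # assumed score in 0..1, higher is better
    utilization_by_worker: dict  # fraction of capacity in use, 0..1


def fired_triggers(snapshot, sla_floor=0.99, imbalance_limit=0.25):
    fired = []
    if snapshot.workers_in_flow_control:
        fired.append(Trigger.FLOW_CONTROL)
    if any(score < sla_floor for score in snapshot.sla_score_by_tenant.values()):
        fired.append(Trigger.SLA_BREACH)
    loads = list(snapshot.utilization_by_worker.values())
    # Coefficient of variation (stddev / mean) as a simple imbalance measure.
    if loads and mean(loads) > 0 and pstdev(loads) / mean(loads) > imbalance_limit:
        fired.append(Trigger.IMBALANCE)
    return fired


snapshot = ClusterSnapshot(
    workers_in_flow_control={"w1"},
    sla_score_by_tenant={"t1": 0.999, "t2": 0.995},
    utilization_by_worker={"w1": 0.9, "w2": 0.2, "w3": 0.1},
)
print([t.name for t in fired_triggers(snapshot)])  # ['FLOW_CONTROL', 'IMBALANCE']
```

A non-empty result would then be handed to whatever policy chooses among splitting, migrating, and isolating.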

Partition Splitting Strategy

Instead of naïvely cutting a range in half, the splitter considers per‑tenant resource weights collected by the statistics module. The split point is chosen so that the resulting partitions have balanced traffic for the dominant resource (network, CPU, etc.). This weighted split reduces the likelihood of creating a new hotspot.
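To make the weighted split concrete, the sketch below picks the split key that best balances cumulative weight on both sides. It assumes the statistics are available as a sorted list of (key, weight) samples for the dominant resource; that input shape and the function name `weighted_split_point` are hypothetical.

```python
from itertools import accumulate


def weighted_split_point(key_weights):
    """key_weights: list of (key, weight) sorted by key.

    Returns the key k at which to split, giving [start, k) and [k, end).
    """
    if len(key_weights) < 2:
        raise ValueError("need at least two keys to split")
    weights = [w for _, w in key_weights]
    total = sum(weights)
    prefix = list(accumulate(weights))  # prefix[i] = weight of keys[0..i]
    # Splitting before index i puts prefix[i - 1] on the left and the rest on
    # the right; minimize the difference between the two sides.
    best = min(range(1, len(key_weights)), key=lambda i: abs(total - 2 * prefix[i - 1]))
    return key_weights[best][0]


samples = [("AA", 6), ("AB", 3), ("BA", 1), ("CB", 1), ("DA", 1), ("EA", 2)]
print(weighted_split_point(samples))  # AB
```

Splitting at AB leaves weights of 6 and 8 on the two sides. Cutting the same range at its middle key CB would leave 10 and 4, so the left partition would remain the hotspot.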

Load‑Balancing Evolution (0→1, 1→N)

0→1 (manual, machine‑level): Early deployments relied on operators manually assigning whole machines to large tenants to avoid interference.

1→N (automated, partition‑level): The mature system treats every partition as an equal scheduling unit, regardless of tenant size. The load balancer can automatically:

Split a single overloaded partition into two balanced partitions.

Migrate selected partitions from a congested worker to a lightly loaded one.

Isolate a misbehaving tenant on a dedicated worker and raise an alarm for operators.
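The migration action can be illustrated with a simple greedy planner. This is a sketch under assumptions rather than the scheduler Table Store uses: loads are expressed as fractions of one worker's capacity, `high_water` is a hypothetical threshold, and the planner moves the heaviest partitions first to the currently least-loaded worker.

```python
def plan_migrations(load_by_worker, high_water=0.8):
    """load_by_worker: {worker: {partition: load_fraction}}.

    Returns a list of (partition, source_worker, destination_worker) moves.
    """
    totals = {w: sum(parts.values()) for w, parts in load_by_worker.items()}
    moves = []
    # Visit workers from most to least loaded; stop at the first healthy one.
    for src in sorted(totals, key=totals.get, reverse=True):
        if totals[src] <= high_water:
            break
        heaviest_first = sorted(load_by_worker[src].items(), key=lambda kv: kv[1], reverse=True)
        for part, load in heaviest_first:
            if totals[src] <= high_water:
                break
            dst = min(totals, key=totals.get)
            # Skip moves that would only push the destination over the limit.
            if dst == src or totals[dst] + load > high_water:
                continue
            moves.append((part, src, dst))
            totals[src] -= load
            totals[dst] += load
    return moves


cluster = {
    "w1": {"p1": 0.5, "p2": 0.3, "p3": 0.2},
    "w2": {"p4": 0.3},
    "w3": {"p5": 0.1},
}
print(plan_migrations(cluster))  # [('p1', 'w1', 'w3')]
```

Moving the heaviest partition first keeps the number of migrations low. A planner that prefers to move as little load as possible would iterate from the lightest partition instead.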

Key Challenges and Solutions

Accurate resource accounting: Direct measurement of CPU, memory, network, and I/O per request is infeasible; the system uses proxies (size → network, latency → CPU) and aggregates them asynchronously.

Fairness for rapidly growing partitions: The flow‑control module applies weighted rejection probabilities, ensuring that tenants consuming the most of a saturated resource are throttled proportionally.

Trigger design and SLA feedback loop: Three progressive triggers (flow‑control, SLA breach, imbalance) create a closed loop that continuously drives the system toward the SLA target.

SLA‑Driven Closed Loop

The effectiveness of the load‑balancing system is measured by:

Time to mitigate a sudden traffic surge (seconds to tens of seconds).

Reduction of SLA violations (e.g., latency percentiles staying within contract).

Improved overall resource utilization (higher average CPU/network usage without hotspots).

When the trigger mechanisms and evaluation metrics are tightly coupled, the load balancer becomes a self‑evolving component that automatically adjusts partition placement and size to keep the service within its SLA targets.
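As one example of feeding an evaluation metric back into the loop, the sketch below checks a latency sample against a p99 limit. The helper names and the idea of expressing the contract as a single p99 limit are assumptions for illustration; the article only says that latency percentiles should stay within contract.

```python
from statistics import quantiles


def latency_percentile(latencies_ms, pct=99):
    """Return the pct-th percentile (1..99) of a latency sample."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    return quantiles(latencies_ms, n=100)[pct - 1]


def within_sla(latencies_ms, p99_limit_ms):
    """True when the sample's p99 latency does not exceed the contracted limit."""
    return latency_percentile(latencies_ms, 99) <= p99_limit_ms


steady = [10] * 100
spiky = [10] * 98 + [500, 900]
print(within_sla(steady, p99_limit_ms=100))  # True
print(within_sla(spiky, p99_limit_ms=100))   # False
```

A result of False for a tenant would correspond to the SLA-breach trigger described earlier.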

Illustrative Diagrams

Table Store features
Architecture diagram
Data model partitioning
Load‑balancing capabilities
From 0 to 1
From 1 to N
Resource quantification
Water‑level flow control
Trigger timing
Split point selection