DBLE String‑Hash Sharding Algorithm and Configuration Guide
This article explains DBLE's string-hash sharding algorithm: how to configure partitionLength, partitionCount, and hashSlice in rule.xml, how a string key is converted to an integer, how the algorithm compares with MyCat's, and what to watch for in development and operations. It also includes example XML configurations and best-practice tips.
About the author
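Zhong Yue is a veteran DBLE user and a senior architect at a major bank, where DBLE is used in large, high-priority projects. Long entangled with MySQL, Zhong Yue often applies technology to the technical and non-technical problems that come with "big-company syndrome".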
When the sharding key is a non‑numeric string, the built‑in integer hash algorithm cannot be used directly. The stringhash algorithm extracts a user‑defined slice of the key, converts each character’s Unicode value into a long integer using a polynomial accumulation (multiply by 31 and add), then calls the built‑in hash to compute the shard routing: first modulo to obtain a logical shard number, then map the logical shard to a physical shard.
Users define the partitionLength[] and partitionCount[] arrays and the hashSlice tuple in rule.xml.
During DBLE startup, the dot product of the two arrays yields the modulus, i.e., the number of logical shards.
Expanding the two arrays in order (each of the partitionCount[i] physical shards receives partitionLength[i] consecutive logical shards) produces the mapping table from logical shards to physical shards. The total number of physical shards equals the sum of the elements in partitionCount[].
In the example used in this walkthrough, the hashSlice tuple selects characters 4 to 5 (0-based indices 3 to 4) of the sharding key for the "string->int" conversion.
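As a rough illustration of the startup step, the sketch below expands the two arrays into a logical-to-physical mapping table. The function name `build_mapping_table` is hypothetical, and the contiguous-range layout is an assumption based on the configuration semantics described in this article rather than a copy of DBLE's source. The sample values 512,256 and 1,2 come from the article's own configuration examples.

```python
def build_mapping_table(partition_length, partition_count):
    """Return a list where index = logical shard and value = physical shard.

    Hypothetical helper for illustration; not DBLE's actual API.
    """
    if len(partition_length) != len(partition_count):
        raise ValueError("both arrays must have the same number of elements")
    table = []
    physical = 0
    for length, count in zip(partition_length, partition_count):
        for _ in range(count):
            # Each physical shard owns `length` consecutive logical shards.
            table.extend([physical] * length)
            physical += 1
    return table


table = build_mapping_table([512, 256], [1, 2])
modulus = len(table)              # dot product: 512*1 + 256*2
physical_shards = max(table) + 1  # sum of partitionCount: 1 + 2
print(modulus, physical_shards)
```

Under these assumptions, logical shards 0 to 511 land on physical shard 0, 512 to 767 on shard 1, and 768 to 1023 on shard 2.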
At runtime, when a user queries a table that uses this algorithm, the WHERE clause’s sharding key is extracted, characters 4‑5 are taken, and fed into the conversion process.
A cumulative value starts at 0; for each extracted character, the cumulative value is multiplied by 31 and the character’s Unicode value (treated as a long) is added. After processing all characters, the cumulative value represents the integer form of the sharding key.
The cumulative value is then taken modulo the number of logical shards to obtain the logical shard number.
The logical shard number is looked up in the mapping table to directly obtain the physical shard number.
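The runtime steps above can be sketched as follows. This is a minimal illustration, not DBLE's code: the names `string_to_long` and `route` are hypothetical, the slice is written with a 0-based start and an exclusive end (an assumption about the notation), and Python integers do not overflow the way a 64-bit long would, so the sketch only matches for short slices of ordinary (BMP) characters.

```python
def string_to_long(text):
    """Polynomial accumulation: start at 0, multiply by 31, add the code point."""
    acc = 0
    for ch in text:
        acc = acc * 31 + ord(ch)
    return acc


def route(key, mapping_table, start=3, end=5):
    """Return (logical shard, physical shard) for a string sharding key.

    start=3, end=5 selects characters 4 to 5 (0-based indices 3 to 4),
    matching the walkthrough; the end index is assumed to be exclusive.
    """
    value = string_to_long(key[start:end])
    logical = value % len(mapping_table)
    return logical, mapping_table[logical]


# Sample table: 512 logical shards on shard 0, then 256 each on shards 1 and 2.
mapping_table = [0] * 512 + [1] * 256 + [2] * 256
print(route("abcDEf", mapping_table))
print(route("xyzzz9", mapping_table))
```

For the key "abcDEf" the slice is "DE", which accumulates to 68 × 31 + 69 = 2177, so the logical shard is 2177 mod 1024 = 129 and the physical shard is 0.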
Comparison with MyCat's similar sharding algorithm
Both algorithms behave the same once the string has been converted to an integer; the differences between them are inherited from the underlying hash algorithm.
Development notes
Sharding key must be a string.
Maximum number of physical shards: 2880, reached when the sum of partitionCount[] is 2880 and every element of partitionLength[] is 1.
Example configuration (single value): <property name="partitionLength">1</property> <property name="partitionCount">2880</property>
Minimum number of physical shards: 1, reached when the sum of partitionCount[] is 1.
Example configuration: <property name="partitionLength">2880</property> <property name="partitionCount">1</property>
partitionLength and partitionCount are one‑dimensional comma‑separated arrays; their dot product must be within [1, 2880].
The order of elements in partitionLength and partitionCount is significant.
For example, the following two configurations route data differently: <property name="partitionLength">512,256</property> <property name="partitionCount">1,2</property> versus <property name="partitionLength">256,512</property> <property name="partitionCount">2,1</property>
If the sharding key is shorter than the configured slice, the slice is safely shortened to fit the key.
Longer slice lengths improve data distribution; lower repetition of key content also helps uniform distribution.
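Two of the notes above, the [1, 2880] bound on the dot product and the significance of element order, can be checked with a small sketch. The helper names are hypothetical, and the contiguous-range mapping is an assumption drawn from the configuration semantics in this article, not from DBLE's source.

```python
MAX_LOGICAL_SHARDS = 2880  # upper bound on the dot product stated in the notes


def logical_shard_count(partition_length, partition_count):
    """Validate the two arrays and return their dot product."""
    if len(partition_length) != len(partition_count):
        raise ValueError("arrays must have the same number of elements")
    total = sum(l * c for l, c in zip(partition_length, partition_count))
    if not 1 <= total <= MAX_LOGICAL_SHARDS:
        raise ValueError(f"dot product {total} is outside [1, {MAX_LOGICAL_SHARDS}]")
    return total


def build_mapping_table(partition_length, partition_count):
    """Index = logical shard, value = physical shard (assumed layout)."""
    table, physical = [], 0
    for length, count in zip(partition_length, partition_count):
        for _ in range(count):
            table.extend([physical] * length)
            physical += 1
    return table


a = build_mapping_table([512, 256], [1, 2])
b = build_mapping_table([256, 512], [2, 1])
# Same dot product, same number of physical shards, different routing.
print(logical_shard_count([512, 256], [1, 2]), a[300], b[300])
```

Both configurations have 1024 logical shards on 3 physical shards, yet logical shard 300 maps to physical shard 0 in the first and to physical shard 1 in the second.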
Operational notes
Scaling out: if the table was over-sharded in advance, you can add physical shards without changing the dot product of partitionLength and partitionCount or the hashSlice setting. This avoids a full data rebalance; only the data in the logical shards that move to another physical shard has to be migrated.
If you need to change the dot product or hashSlice, a full data rebalance is required.
Scaling in: the same principle applies; keep the dot product and hashSlice unchanged to avoid a rebalance.
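To make "only migrate affected data" concrete, the sketch below compares the mapping tables before and after a scale-out that keeps the dot product at 1024. The configurations and helper names are hypothetical examples, and the contiguous-range mapping is an assumption based on the configuration semantics in this article.

```python
def build_mapping_table(partition_length, partition_count):
    """Index = logical shard, value = physical shard (assumed layout)."""
    table, physical = [], 0
    for length, count in zip(partition_length, partition_count):
        for _ in range(count):
            table.extend([physical] * length)
            physical += 1
    return table


def moved_logical_shards(old_table, new_table):
    """List the logical shards whose physical shard changes."""
    if len(old_table) != len(new_table):
        raise ValueError("dot product changed: a full rebalance is required")
    return [i for i, (o, n) in enumerate(zip(old_table, new_table)) if o != n]


old = build_mapping_table([512], [2])          # 2 physical shards
new = build_mapping_table([512, 256], [1, 2])  # 3 physical shards, same dot product
moved = moved_logical_shards(old, new)
print(len(moved), moved[0], moved[-1])
```

In this example only logical shards 768 to 1023 change owner (from physical shard 1 to the new shard 2), so a quarter of the logical shards are migrated and the rest stay in place.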
Configuration notes
In rule.xml the configurable items are <property name="partitionLength">, <property name="partitionCount"> and <property name="hashSlice">. The value format for partitionLength and partitionCount is a comma-separated list of integers, e.g.:
<property name="partitionLength">512,256</property> <property name="partitionCount">1,2</property>
The semantics: first comes 1 physical shard holding 512 logical shards, followed by 2 physical shards holding 256 logical shards each, for a total of 512×1 + 256×2 = 1024 logical shards on 3 physical shards.
If all elements of partitionLength[] are 1, the algorithm reduces to a simple modulo operation.
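The reduction to plain modulo can be checked with a short sketch. The helper is hypothetical and assumes the contiguous-range mapping described in this article: when every physical shard owns exactly one logical shard, the mapping table is the identity, so the physical shard is simply the value modulo the shard count.

```python
def build_mapping_table(partition_length, partition_count):
    """Index = logical shard, value = physical shard (assumed layout)."""
    table, physical = [], 0
    for length, count in zip(partition_length, partition_count):
        for _ in range(count):
            table.extend([physical] * length)
            physical += 1
    return table


table = build_mapping_table([1], [4])  # four physical shards, one logical shard each
print(table)

for value in (0, 5, 2177, 3904):
    # The lookup adds nothing: logical shard and physical shard coincide.
    print(value, table[value % len(table)], value % 4)
```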
The hashSlice tuple defines which characters of the sharding key are used. Examples:
To take the first k characters: 0:k, k or :k.
To take the last k characters: -k:0, -k or -k:.
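To take n characters starting at the m-th character (counting from 1): compute i=m-1 and j=i+n, then use i:j. The end index is exclusive, consistent with 0:k selecting the first k characters.
To take n characters starting at the m-th character from the end: compute i=-m+n and use -m:i. When i is 0, the slice runs to the end of the key.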
To disable slicing: 0:0, 0:, :0 or :.
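The equivalent spellings listed above can be reproduced with a small parser. This is a sketch under stated assumptions, not DBLE's implementation: the function names are hypothetical, the end index is treated as exclusive, an end of 0 (or a missing end) is treated as "to the end of the key", and negative indices count from the end, which is what the first-k, last-k and disable-slicing forms imply.

```python
def parse_hash_slice(text):
    """Parse a hashSlice value into a (start, end) pair."""
    text = text.strip()
    if ":" not in text:
        n = int(text)
        # A bare k means the first k characters, a bare -k the last k.
        return (0, n) if n >= 0 else (n, 0)
    left, right = text.split(":", 1)
    start = int(left) if left.strip() else 0
    end = int(right) if right.strip() else 0
    return start, end


def apply_hash_slice(key, start, end):
    """Cut the slice out of the key, shortening it if the key is too short."""
    if start < 0:
        start += len(key)
    if end <= 0:
        end += len(key)
    start = max(start, 0)
    end = max(min(end, len(key)), 0)
    return key[start:end]


key = "abcdefgh"
for spec in ("0:3", "3", ":3", "-3:0", "-3", "-3:", "0:0", "0:", ":0", ":"):
    print(spec, apply_hash_slice(key, *parse_hash_slice(spec)))
```

The last function also covers the note that a key shorter than the configured slice is handled safely: with the key "ab" and the slice 0:3, the whole key "ab" is used.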
DBLE sharding algorithm series recap
First article: hash sharding
Recent community activity
June 15 Shanghai Station
Distributed Middleware DBLE User Meetup – the first offline interactive sharing session since DBLE’s release in October 2017.
Location: Aikesheng R&D Center, 7th floor, Building A, 1905 Hongmei Road, Xuhui District, Shanghai.
Time: June 15, 2019, 13:00‑17:00.
Join us for face‑to‑face interaction with developers, testers, product and community teams, and enjoy exclusive merchandise.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases one high-quality open-source component every year on October 24 (1024), and continues to operate and maintain them.