Database Sharding and Partitioning: Strategies, Practices, and Considerations
This article compiles practical discussions on database sharding and partitioning for scalable backend systems. It covers sharding criteria (user ID, time, and hash), redundancy techniques, caching with Redis, query handling, pagination, consistency, and approaches to distributed transactions.
Various participants share their experiences and recommendations on how to split databases and tables (sharding) based on business needs, data volume, and access patterns, often using user IDs, time ranges, or hash functions.
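A minimal sketch of the three routing criteria mentioned above. The shard count, the function names, and the monthly table naming scheme are assumptions made for illustration and do not come from the discussion.

```python
import zlib
from datetime import date

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_by_user_id(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Route a numeric user ID to a shard with a simple modulo."""
    return user_id % num_shards


def shard_by_hash(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Route an arbitrary string key with a stable hash.

    zlib.crc32 gives the same value in every process, unlike Python's
    built-in hash() for strings, which is randomized per process.
    """
    return zlib.crc32(key.encode("utf-8")) % num_shards


def table_by_month(prefix: str, day: date) -> str:
    """Build a time-based table name, e.g. one table per calendar month."""
    return f"{prefix}_{day:%Y%m}"
```

Modulo routing is easy to reason about, but changing the shard count remaps most keys, so the count is usually chosen with future growth in mind.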
Horizontal sharding creates multiple tables with the same schema, while vertical sharding restructures a table by moving less‑used columns to separate tables, both aiming to improve storage capacity and query efficiency.
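To make the vertical case concrete, here is a small sketch that splits one wide row into a frequently read part and a rarely read part. The column names and the HOT_COLUMNS set are hypothetical; which columns count as "less-used" depends on the real access pattern.

```python
from typing import Dict, Tuple

# Hypothetical set of frequently accessed columns that stay in the main table.
HOT_COLUMNS = {"user_id", "name", "email"}


def split_vertically(row: Dict[str, object]) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Split a wide row into a 'hot' row and a 'cold' row.

    The primary key (user_id) is copied into the cold row so the two
    parts can be joined back together when needed.
    """
    hot = {k: v for k, v in row.items() if k in HOT_COLUMNS}
    cold = {k: v for k, v in row.items() if k not in HOT_COLUMNS}
    cold["user_id"] = row["user_id"]
    return hot, cold
```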
Redundant data stores (e.g., MySQL, HBase, Cassandra) and caching solutions like Redis with consistent hashing are suggested to handle queries that do not align with the primary sharding key, as illustrated by examples of user information and product inventory.
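The consistent hashing mentioned above can be sketched as a hash ring with virtual nodes. This is an illustrative in-memory version, not how any particular Redis client implements it. The class name, the node names, and the choice of 100 virtual nodes per server are assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """A hash ring that maps keys to nodes using virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node is placed on the ring `vnodes` times to even
        # out the key distribution.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's ring position."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

When a node is removed, only the keys that lived on that node move to a different one; every other key keeps its placement, which is what limits cache misses during resizing.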
When queries span multiple shards, approaches include scanning each shard and merging the results, routing through a proxy or business‑layer logic, or offloading the work to a search engine such as Elasticsearch or a batch analytics stack such as Hadoop/Hive for large‑scale aggregation.
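The scan-and-merge approach can be sketched with in-memory SQLite databases standing in for shards. The orders table, its columns, and the helper names are hypothetical; a real deployment would hold connections to separate database servers and would likely query them in parallel.

```python
import sqlite3
from collections import Counter


def make_shard(rows):
    """Create an in-memory SQLite database that plays the role of one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO orders (id, status) VALUES (?, ?)", rows)
    return conn


def count_by_status(shards):
    """Run the same aggregate on every shard and merge the partial counts."""
    totals = Counter()
    for conn in shards:
        query = "SELECT status, COUNT(*) FROM orders GROUP BY status"
        for status, n in conn.execute(query):
            totals[status] += n
    return dict(totals)
```

Counts and sums merge by simple addition. Aggregates such as averages or distinct counts need extra care, because partial results from each shard cannot be combined directly.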
Pagination across shards depends on the sharding strategy; time‑based shards allow straightforward pagination within a single shard, while cross‑shard listings may require additional mechanisms such as search indexes.
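One way to paginate a cross-shard listing without a search index is to ask every shard for its top offset + limit rows and merge them. The sketch below assumes each shard is a plain list of (created_at, id) tuples; the function name and data layout are illustrative.

```python
import heapq
from itertools import islice


def fetch_page(shards, offset, limit):
    """Return one page of rows ordered by created_at, newest first.

    Each shard contributes its own top (offset + limit) rows, because in
    the worst case the whole page could come from a single shard.
    """
    need = offset + limit
    per_shard = [
        sorted(rows, key=lambda r: r[0], reverse=True)[:need]
        for rows in shards
    ]
    # heapq.merge expects every input to already be sorted in the same order.
    merged = heapq.merge(*per_shard, key=lambda r: r[0], reverse=True)
    return list(islice(merged, offset, offset + limit))
```

The work per request grows with the offset, since every shard must return offset + limit rows. That cost is one reason deep cross-shard listings are often pushed to a search index instead.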
For high‑traffic scenarios, caching, consistent hashing, and careful selection of sharding fields are emphasized to avoid hot spots and maintain performance.
Distributed transactions are generally discouraged; instead, soft‑transaction techniques using atomic locks in Redis or in‑process mutexes are recommended, with full distributed transaction support reserved for complex, large‑scale systems.
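A minimal sketch of the soft-transaction idea using in-process mutexes: debit one shard, credit the other, and compensate if the second step fails. The Shard class and transfer function are hypothetical. The locks only protect callers inside a single process, so a multi-process deployment would need a shared lock (for example in Redis) instead.

```python
import threading


class Shard:
    """An in-memory stand-in for one shard holding a score per key."""

    def __init__(self, scores):
        self.scores = dict(scores)
        self.lock = threading.Lock()  # in-process mutex guarding this shard

    def add(self, key, delta):
        """Apply delta atomically; refuse changes that would go negative."""
        with self.lock:
            if key not in self.scores:
                raise KeyError(key)
            if self.scores[key] + delta < 0:
                return False
            self.scores[key] += delta
            return True


def transfer(src, src_key, dst, dst_key, x):
    """Move x points from src to dst without a distributed transaction."""
    if x <= 0:
        raise ValueError("x must be positive")
    if not src.add(src_key, -x):
        return False  # insufficient balance, nothing was changed
    try:
        dst.add(dst_key, x)
    except Exception:
        src.add(src_key, x)  # compensate: undo the debit
        raise
    return True
```

Between the debit and the credit, the two shards are briefly inconsistent. This design accepts that window in exchange for avoiding a two-phase commit.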
Example SQL transaction across shards:
start transaction;
select * from table1 where A = 1234;
update table1 set score = score + x where A = 1234 and score > x;
update table2 set score = score - x where B = 3456 and score > x;
commit; -- or rollback
Overall, the discussion highlights that sharding and partitioning decisions must be driven by concrete business queries, data growth patterns, and operational constraints, with careful planning for redundancy, indexing, and eventual consistency.
Nightwalker Tech
[Nightwalker Tech] is the tech sharing channel of "Nightwalker", focusing on AI and large model technologies, internet architecture design, high‑performance networking, and server‑side development (Golang, Python, Rust, PHP, C/C++).
