
Mastering Database Sharding: Top Interview Questions and Best Middleware Choices

This article breaks down common interview questions on database sharding, explains why and how to split databases horizontally and vertically, compares popular sharding middleware, and offers practical recommendations for choosing the right solution in high‑concurrency systems.

Java Backend Technology

Interview Questions

Why do we need database sharding?

Which sharding middleware have you used?

What are the advantages and disadvantages of each middleware?

How do you perform vertical or horizontal splitting of databases?

What the Interviewer Is Looking For

Sharding is a staple topic in high‑concurrency system design; interviewers expect candidates to understand it, and lacking this knowledge is a clear gap.

Question Analysis

The term sharding (分库分表, literally "split databases and split tables") is often used loosely: it can mean splitting only databases, only tables, or both. Consider a startup that launches with 200,000 registered users, 10,000 daily active users, and modest traffic. Once the business grows to millions of users and thousands of requests per second, a single database becomes a bottleneck, and sharding becomes necessary.

Table Sharding vs. Database Sharding

Table sharding (分表, "split tables"): Split a large table into multiple tables, each holding a subset of rows (e.g., by user ID), so that each table stays at a manageable size, typically a few million rows.

Database sharding (分库, "split databases"): Distribute data across multiple database instances so that each instance handles a portion of the overall load, keeping per-instance QPS at roughly 1,000 to 2,000 for stability.
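
The two levels can be combined in a single routing step. The following is a minimal sketch, assuming a numeric user ID as the shard key. The class name ShardRouter, the user_db_N / user_N naming scheme, and the shard counts are hypothetical and not taken from any particular middleware.

```java
/**
 * Hypothetical sketch: maps a user ID to a database and a table.
 * Class name, naming scheme, and shard counts are illustrative assumptions.
 */
final class ShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    ShardRouter(int dbCount, int tablesPerDb) {
        if (dbCount <= 0 || tablesPerDb <= 0) {
            throw new IllegalArgumentException("shard counts must be positive");
        }
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    /** Global slot in [0, dbCount * tablesPerDb); floorMod keeps it non-negative. */
    int slot(long userId) {
        return (int) Math.floorMod(userId, (long) dbCount * tablesPerDb);
    }

    /** Database sharding: which database instance holds this user. */
    int dbIndex(long userId) {
        return slot(userId) / tablesPerDb;
    }

    /** Table sharding: which table inside that database holds this user. */
    int tableIndex(long userId) {
        return slot(userId) % tablesPerDb;
    }

    /** Fully qualified target, e.g. "user_db_1.user_5". */
    String target(long userId) {
        return "user_db_" + dbIndex(userId) + ".user_" + tableIndex(userId);
    }
}
```

With 4 databases and 8 tables per database, new ShardRouter(4, 8).target(45L) yields user_db_1.user_5, because 45 mod 32 is slot 13.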

Diagram of sharding concepts

Sharding Middleware

Common middleware includes:

cobar: A proxy-layer solution from Alibaba's B2B team. It is now largely abandoned and lacks support for read/write splitting, stored procedures, cross-database joins, and pagination.

TDDL: A client-side solution developed by the Taobao team. It supports basic CRUD and read/write splitting but depends on Alibaba's Diamond configuration service.

atlas: An open-source proxy from 360. It is outdated, with no recent community activity.

sharding-jdbc: A client-side library from Dangdang, actively maintained (v2.0 at the time of writing; the project has since continued under Apache ShardingSphere). It supports sharding, read/write splitting, distributed ID generation, and flexible transactions.

mycat: A proxy built on cobar. It is widely used and offers comprehensive features, but it is newer and less battle-tested than sharding-jdbc.

Summary

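For most cases, consider sharding-jdbc (client-side) or mycat (proxy-side). As a client-side library, sharding-jdbc needs no separate deployment, which keeps operational cost low and avoids an extra network hop; the trade-off is that every application depends on the library and must be upgraded along with it. As a proxy, mycat is transparent to applications and can be managed centrally across many projects, but it has to be deployed and operated as a service of its own. Small-to-medium companies benefit from sharding-jdbc's simplicity, while large enterprises with many teams may prefer mycat's centralized management.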

Vertical and Horizontal Splitting

Horizontal splitting distributes rows of a single table across multiple databases/tables with identical schemas, balancing load and storage capacity.

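Vertical splitting separates a wide table into multiple tables/databases, each containing a subset of columns. Typically the few frequently accessed columns go into one table and the rarely accessed ones into another, so that more rows of the frequently accessed table fit in the database cache.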
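
As a rough illustration of vertical splitting, the sketch below divides one wide row into a "hot" part and a "cold" part that share the primary key. The class name VerticalSplitter and the column names used with it are hypothetical; a real split would be done in the schema, not in application maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch: splits a wide row into hot and cold column groups.
 * Both parts carry the key column so they can be joined back together.
 */
final class VerticalSplitter {
    private final String keyColumn;
    private final Set<String> hotColumns;

    VerticalSplitter(String keyColumn, Set<String> hotColumns) {
        this.keyColumn = keyColumn;
        this.hotColumns = hotColumns;
    }

    /** Key column plus the frequently accessed columns. */
    Map<String, Object> hotPart(Map<String, Object> row) {
        return project(row, true);
    }

    /** Key column plus every remaining column. */
    Map<String, Object> coldPart(Map<String, Object> row) {
        return project(row, false);
    }

    private Map<String, Object> project(Map<String, Object> row, boolean wantHot) {
        Map<String, Object> part = new LinkedHashMap<>();
        part.put(keyColumn, row.get(keyColumn));
        for (Map.Entry<String, Object> e : row.entrySet()) {
            String column = e.getKey();
            if (!column.equals(keyColumn) && hotColumns.contains(column) == wantHot) {
                part.put(column, e.getValue());
            }
        }
        return part;
    }
}
```

Given a row with columns id, name, bio, last_login, and address, and name and last_login marked as hot, the hot part holds id, name, and last_login while the cold part holds id, bio, and address.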

Horizontal splitting illustration
Vertical splitting illustration

Common Splitting Strategies

Range‑based: Allocate consecutive key ranges (often time‑based) to each shard; easy to add new shards but can cause hotspot traffic on recent data.
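
Range-based routing can be sketched with a sorted map of lower bounds. This is only an illustration under assumed names: RangeRouter and the orders_N shard names are hypothetical, and the key is assumed to be a numeric ID or timestamp.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch: routes a key to the shard whose range contains it.
 * Each shard is registered by the inclusive lower bound of its range.
 */
final class RangeRouter {
    private final TreeMap<Long, String> lowerBounds = new TreeMap<>();

    /** Adding a shard only extends the map; existing ranges are untouched. */
    void addShard(long inclusiveLowerBound, String shardName) {
        lowerBounds.put(inclusiveLowerBound, shardName);
    }

    /** Finds the shard with the greatest lower bound that is <= key. */
    String route(long key) {
        Map.Entry<Long, String> entry = lowerBounds.floorEntry(key);
        if (entry == null) {
            throw new IllegalArgumentException("no shard covers key " + key);
        }
        return entry.getValue();
    }
}
```

Registering a new lower bound is all it takes to add capacity, which is why this scheme scales out easily. The flip side is that all new keys land in the most recently added shard, which is where the hotspot comes from.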

Hash-based: Distribute rows by hashing a key (e.g., user ID). This balances load evenly, but scaling is more complex: changing the shard count changes which shard each key maps to, so existing data must be migrated.
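
The migration cost of hash-based sharding can be made concrete with a small sketch. It assumes numeric keys and uses plain modulo as a stand-in for the hash function; HashRouter and its methods are hypothetical names.

```java
/**
 * Hypothetical sketch: modulo-based shard selection, plus a helper that
 * counts how many keys change shards when the shard count changes.
 */
final class HashRouter {
    private HashRouter() {
    }

    /** Shard index in [0, shardCount); floorMod keeps it non-negative. */
    static int shardFor(long key, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        return (int) Math.floorMod(key, (long) shardCount);
    }

    /** Counts keys in [0, keyCount) whose shard differs between the two layouts. */
    static long movedKeys(long keyCount, int oldShards, int newShards) {
        long moved = 0;
        for (long key = 0; key < keyCount; key++) {
            if (shardFor(key, oldShards) != shardFor(key, newShards)) {
                moved++;
            }
        }
        return moved;
    }
}
```

For keys 0 to 39,999, HashRouter.movedKeys(40_000, 4, 8) returns 20000 and HashRouter.movedKeys(40_000, 4, 5) returns 32000, so even adding a single shard relocates most of the data under plain modulo routing.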

Choose the strategy that matches your workload characteristics and scaling requirements.
