Databases 14 min read

Database Optimization, Sharding, and Pagination Issues: Lessons from a Large‑Scale MySQL Deployment

The article shares a senior architect’s experience with large‑scale MySQL tables, discussing why sharding and Elasticsearch were introduced, the pagination bug caused by Sharding‑JDBC, and a comprehensive set of hardware, software, and SQL tuning recommendations, including a full MySQL configuration example.

Top Architect
Top Architect
Top Architect
Database Optimization, Sharding, and Pagination Issues: Lessons from a Large‑Scale MySQL Deployment

Story Background

The author, a senior architect, explains that traditional advice on database optimization often emphasizes indexing and sharding when tables exceed tens of millions of rows, but he argues that sharding can be a trap based on his recent bug experience.

System Complexity Increases

A device table has grown to 500‑600 million rows. Four years ago, the team split the data into four databases and five tables using sharding, but queries were routed to Elasticsearch (ES) for convenience. Logstash was used to sync data from MySQL to ES every 20 seconds, adding further complexity.

Pagination Issue

A mysterious bug caused pagination to always return the first page for a specific device table. Investigation revealed that the table’s data source was configured through Sharding‑JDBC, which does not fully support the pagination SQL used, leading to the failure.

Two possible solutions were proposed: upgrade Sharding‑JDBC to a version (≥4.1) that fixes the bug, or separate the sharded data source from the single‑table data source and use a regular datasource for the latter.

Database Optimization Overview

Optimization can be approached from hardware, software, and SQL layers.

1. Hardware Optimization

Use SSDs instead of HDDs for faster random I/O.

Adopt RAID 10 to increase storage bandwidth and redundancy.

Expand memory to allow larger buffer pools and reduce I/O.

2. Software Optimization

Adjust InnoDB buffer pool size, log buffer, and other MySQL parameters.

Enable semi‑synchronous replication and appropriate session tracking.

Consider distributed databases (TiDB, MongoDB) for high‑concurrency workloads.

3. SQL Layer Optimization

Avoid SELECT * and only query needed columns.

Create indexes on columns used in ORDER BY and GROUP BY .

Replace large OFFSET pagination with keyset pagination, e.g., WHERE id > last_seen_id .

Use MySQL EXPLAIN and other analysis tools to tune queries.

4. Comprehensive Optimization

Regularly review slow‑query logs and clean up unused indexes.

Archive historical data to cold storage.

Use a connection pool such as HikariCP to reduce connection overhead.

Configuration Example

[universe]
bakupdir = /data/mysql/backup/7360
iops = 0
mem_limit_mb = 0
cpu_quota_percentage = 0
quota_limit_mb = 0
scsi_pr_level = 0
mycnf = /opt/mysql/etc/7360/my.cnf
run_user = actiontech-mysql
umask_dir = 0750
umask = 0640
id = mysql-mt1cbg
group_id = mysql-test

[mysql]
no-auto-rehash
prompt = '\u@\h:\p\R:\m:\s[\d]> '
#default-character-set = utf8mb4
#tee = /data/mysql_tmp/mysql_operation.log

[mysqld]
super_read_only = 1
# DO NOT MODIFY, Universe will generate this part
port = 7360
server_id = 123
basedir = /opt/mysql/base/5.7.40
datadir = /data/mysql/data/7360
log_bin = /opt/mysql/log/binlog/7360/mysql-bin
tmpdir = /opt/mysql/tmp/7360
relay_log = /opt/mysql/log/relaylog/7360/mysql-relay
innodb_log_group_home_dir = /opt/mysql/log/redolog/7360
log_error = /data/mysql/data/7360/mysql-error.log
report_host = xxx

# BINLOG
binlog_error_action = ABORT_SERVER
binlog_format = row
binlog_rows_query_log_events = 1
log_slave_updates = 1
master_info_repository = TABLE
max_binlog_size = 250M
relay_log_info_repository = TABLE
relay_log_recovery = 1
sync_binlog = 1

# GTID
gtid_mode = ON
enforce_gtid_consistency = 1
binlog_gtid_simple_recovery = 1

# ENGINE
default_storage_engine = InnoDB
innodb_buffer_pool_size = 64G
innodb_data_file_path = ibdata1:1G:autoextend
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = O_DIRECT
innodb_io_capacity = 1000
innodb_log_buffer_size = 64M
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 60
innodb_print_all_deadlocks = 1
innodb_strict_mode = 1
innodb_max_undo_log_size = 4G
innodb_undo_log_truncate = 1
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_purge_threads = 4
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_dump_pct = 25
innodb_sort_buffer_size = 8M
innodb_buffer_pool_instances = 8
innodb_lock_wait_timeout = 10
innodb_io_capacity_max = 2000
innodb_flush_neighbors = 1
innodb_thread_concurrency = 64
innodb_stats_persistent_sample_pages = 64
innodb_autoinc_lock_mode = 2
innodb_online_alter_log_max_size = 1G
innodb_open_files = 4096
innodb_temp_data_file_path = ibtmp1:12M:autoextend:max:50G
innodb_rollback_segments = 128

# CACHE
key_buffer_size = 16M
tmp_table_size = 64M
max_heap_table_size = 64M
table_open_cache = 2000
query_cache_type = 0
query_cache_size = 0
max_connections = 3000
thread_cache_size = 200
open_files_limit = 65535
binlog_cache_size = 1M
join_buffer_size = 8M
sort_buffer_size = 2M
read_buffer_size = 8M
read_rnd_buffer_size = 8M
table_definition_cache = 2000
table_open_cache_instances = 8

# SLOW LOG
slow_query_log = 1
slow_query_log_file = /data/mysql/data/7360/mysql-slow.log
log_slow_admin_statements = 1
log_slow_slave_statements = 1
long_query_time = 1

# SEMISYNC
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled = 0
rpl_semi_sync_master_wait_for_slave_count = 1
rpl_semi_sync_master_wait_no_slave = 0
rpl_semi_sync_master_timeout = 30000

# CLIENT_DEPRECATE_EOF
session_track_schema = 1
session_track_state_change = 1
session_track_system_variables = '*'

# MISC
log_timestamps = SYSTEM
lower_case_table_names = 1
max_allowed_packet = 64M
read_only = 1
skip_external_locking = 1
skip_name_resolve = 1
skip_slave_start = 1
socket = /data/mysql/data/7360/mysqld.sock
pid_file = /data/mysql/data/7360/mysqld.pid
disabled_storage_engines = ARCHIVE,BLACKHOLE,EXAMPLE,FEDERATED,MEMORY,MERGE,NDB
log-output = TABLE,FILE
character_set_server = utf8mb4
secure_file_priv = ""
performance-schema-instrument = 'wait/lock/metadata/sql/mdl=ON'
performance-schema-instrument = 'memory/% =COUNTED'
expire_logs_days = 7
max_connect_errors = 1000000
interactive_timeout = 1800
wait_timeout = 1800
log_bin_trust_function_creators = 1

# MTS
slave-parallel-type = LOGICAL_CLOCK
slave_parallel_workers = 16
slave_preserve_commit_order = ON
slave_rows_search_algorithms = 'INDEX_SCAN,HASH_SCAN'

# BaseConfig
collation_server = utf8mb4_bin
explicit_defaults_for_timestamp = 1
transaction_isolation = READ-COMMITTED

Conclusion

The author advises that while sharding is a common interview answer, real‑world experience shows that other optimization techniques—such as hardware upgrades, proper indexing, query rewriting, and selective use of distributed databases—often yield better results.

optimizationdatabaseElasticsearchShardingMySQLpaginationLogstash
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.