Database Q&A: Book Recommendations, Engine Differences, Optimization, Sharding, and Operational Practices

The article compiles Meituan‑Dianping engineers' Q&A covering SQL book suggestions, relational versus NoSQL engine choices, performance tuning techniques, time‑series database uses, sharding versus partitioning strategies, query handling in distributed systems, proxy limitations, and practical advice on replication and pre‑database setups.

Meituan Technology Team

Sep 21, 2017

This article collects selected questions from the previous "You Ask I Answer" – Database special issue, organized by the Meituan-Dianping technical team. Readers can submit technical questions via the official WeChat account, and more than 5,000 engineers volunteer to answer them.

Q1: Could you recommend some books about SQL?

A: "SQL必知必会 (4th Edition)" from Turing Press is highly recommended. It is the Chinese edition of Ben Forta's "Sams Teach Yourself SQL in 10 Minutes", a long-time best-selling SQL book on Amazon, and it explains concepts clearly. For readers comfortable with English, online tutorials such as https://www.sqlteaching.com/ and http://www.w3schools.com/sql/default.asp are also useful.

Q2: What are the underlying implementations and use cases of relational databases like MySQL versus NoSQL engines such as Redis and MongoDB?

A: MySQL is a relational database that stores structured data and supports transactions, indexes (including secondary indexes), MVCC, and SQL joins. Writes are relatively expensive, and replication was traditionally applied by a single thread (parallel appliers only arrived in MySQL 5.6 and 5.7), so it is best suited to read-heavy OLTP workloads.

Redis and MongoDB are both NoSQL stores: Redis is an in-memory key-value store, while MongoDB is a document database. Redis offers rich data structures, in-memory queries, and Lua scripting, making it ideal for ultra-low-latency, hot-data scenarios (e.g., flash sales) but not for bulk data imports. MongoDB provides single-document transactions, automatic failover, and easy sharding, fitting large-scale data scenarios.

Q3: How does Meituan-Dianping optimize databases for high performance? What are the SQL optimization methods?

A: Refer to two posts on the Meituan technical blog: https://tech.meituan.com/mysql-index.html (index principles and slow-query optimization) and https://tech.meituan.com/sqladvisor_pr.html (the SQL optimization tool). The team also uses Box's open-source Anemometer for slow-query monitoring and the open-source SQLAdvisor for further tuning.

Q4: What are the application scenarios of time‑series databases and how do they differ from other NoSQL databases?

A: Time‑series databases store sequences of data points indexed by time, preserving full historical records for trend analysis, forecasting, and large‑scale analytics. They are typically built on LSM‑tree storage, a structure also common in many NoSQL systems, but focus on time‑ordered data rather than generic key‑value access.
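As a rough illustration of "data points indexed by time", the following sketch keeps points sorted by timestamp and answers range and downsampling queries. The class name TinySeries and its methods are hypothetical, and the in-memory sorted list is only a stand-in: it is not how any particular time-series database (or an LSM tree) stores data.

```python
import bisect
from collections import defaultdict


class TinySeries:
    """Hypothetical in-memory series of (timestamp, value) points, sorted by time."""

    def __init__(self):
        self._ts = []    # timestamps, always kept sorted
        self._vals = []  # values, aligned with self._ts

    def append(self, ts, value):
        # Fast path: points usually arrive in time order.
        if not self._ts or ts >= self._ts[-1]:
            self._ts.append(ts)
            self._vals.append(value)
        else:
            # Late point: insert at the position that keeps the order.
            i = bisect.bisect_right(self._ts, ts)
            self._ts.insert(i, ts)
            self._vals.insert(i, value)

    def range(self, start, end):
        """Return all points with start <= ts < end."""
        lo = bisect.bisect_left(self._ts, start)
        hi = bisect.bisect_left(self._ts, end)
        return list(zip(self._ts[lo:hi], self._vals[lo:hi]))

    def downsample(self, start, end, bucket):
        """Average the values in each bucket-wide window of [start, end)."""
        acc = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
        for ts, value in self.range(start, end):
            key = start + ((ts - start) // bucket) * bucket
            acc[key][0] += value
            acc[key][1] += 1
        return {key: total / count for key, (total, count) in sorted(acc.items())}


series = TinySeries()
for ts, value in [(0, 1.0), (10, 3.0), (5, 2.0), (20, 4.0)]:
    series.append(ts, value)

window = series.range(0, 20)
averages = series.downsample(0, 30, 10)
```

The point of the sketch is the access pattern: writes are mostly appends, and reads are time-range scans or aggregations rather than lookups by an arbitrary key.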

Q5: How do TiDB and Google Spanner implement queries like SELECT ... ORDER BY key LIMIT 100 OFFSET 100 in a sharded environment?

A: If the key has no index, the worst case is to push the query to all shards and merge the results. With an index, each shard can use the index’s min/max values to filter early, though the exact details need confirmation from TiDB engineers.
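The "push to all shards and merge" case can be sketched generically. This is not confirmed to be how TiDB or Spanner execute the query; it only shows why each shard has to return its first offset + limit rows before the coordinator can skip the offset. The functions query_shard and order_by_limit_offset are hypothetical, and each shard is modeled as a plain Python list.

```python
import heapq
from itertools import islice


def query_shard(rows, limit):
    """Stand-in for one shard running ORDER BY key LIMIT <limit> locally."""
    return sorted(rows)[:limit]


def order_by_limit_offset(shards, limit, offset):
    """Scatter the query to every shard, then merge the sorted partial results."""
    # The coordinator cannot know which shard owns the skipped rows, so every
    # shard must return its first offset + limit rows, not just limit rows.
    need = offset + limit
    partials = [query_shard(rows, need) for rows in shards]
    # heapq.merge lazily merges inputs that are already sorted.
    merged = heapq.merge(*partials)
    # Apply OFFSET and LIMIT once, on the globally ordered stream.
    return list(islice(merged, offset, offset + limit))


shards = [[5, 1, 9], [2, 8, 3], [7, 4, 6]]
page = order_by_limit_offset(shards, limit=3, offset=2)
```

One consequence of this approach is that deep pagination gets more expensive on every shard as the offset grows, which is one reason cross-shard LIMIT/OFFSET is often restricted or replaced by key-based pagination.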

Q6: What are the common database and SQL optimization strategies?

1. System level – vertically scale the database server hardware.

2. Server configuration – tune parameters such as connection buffer sizes.

3. Architecture level – move from single‑instance to multi‑instance (e.g., master‑slave) setups as needed.

4. Schema level – consider horizontal or vertical sharding (split databases/tables) based on business characteristics.

5. SQL level – design proper tables and indexes, use EXPLAIN to analyze execution plans, and adjust queries accordingly.
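The SQL-level step can be illustrated with a small before/after comparison of an execution plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN from Python's standard library purely as a self-contained stand-in; MySQL's EXPLAIN output has a different format, and the orders table and idx_orders_user index are hypothetical names.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT id, amount FROM orders WHERE user_id = 42"

# Without an index on user_id, the plan is a full table scan.
before = plan(conn, query)

# After adding the index, the same query becomes an index search.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
after = plan(conn, query)

print("before:", before)
print("after: ", after)
```

The workflow is the same regardless of engine: read the plan, add or adjust an index (or rewrite the query), and read the plan again to confirm the optimizer actually uses it.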

Q7: What are the pros and cons of MySQL partitioning vs. sharding, and suitable scenarios for each?

A: Partitioned tables only benefit from partition pruning when queries include the partition key, and they lack global indexes; they are best for log-type data with clear time-based partitions and archival needs. Sharding (splitting tables across multiple databases) keeps individual tables and indexes small, which improves performance at the cost of added complexity; it suits workloads where query patterns are simple and high throughput is the priority.
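A minimal sketch of the routing step behind sharding, assuming an integer sharding key and a simple modulo scheme. The database and table names (order_db_N, orders_NN) and the shard counts are hypothetical, and real middleware may use a different mapping.

```python
def route(user_id, db_count=4, tables_per_db=8):
    """Map an integer sharding key to a hypothetical (database, table) pair."""
    if not isinstance(user_id, int) or user_id < 0:
        raise ValueError("sharding key must be a non-negative integer")
    slot = user_id % (db_count * tables_per_db)
    db_index = slot // tables_per_db    # which database holds the slot
    table_index = slot % tables_per_db  # which table inside that database
    return f"order_db_{db_index}", f"orders_{table_index:02d}"


def all_shards(db_count=4, tables_per_db=8):
    """Every (database, table) pair: the fan-out when no sharding key is given."""
    return [
        (f"order_db_{d}", f"orders_{t:02d}")
        for d in range(db_count)
        for t in range(tables_per_db)
    ]


single = route(37)      # a query carrying the sharding key hits one table
fan_out = all_shards()  # a query without it must visit all of them
```

The contrast between single and fan_out is the cost trade-off in the answer above: queries that carry the sharding key stay cheap, while everything else pays for the added complexity.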

Related articles: "Sharding Practice in the Dianping Order System" (大众点评订单系统分库分表实践) and "MTDDL – Meituan-Dianping Distributed Data Access Middleware".

Q8: How is table‑structure consistency verified after horizontal sharding, and is query merging handled at the application layer?

A: The consistency check is lightweight: Meituan-Dianping compares information_schema.tables across nodes. Query merging is performed by Zebra (a JDBC proxy), though it has limitations.
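One way such a cross-node comparison could look is sketched below. It is an assumption, not the team's actual tooling: each node's schema is represented as column rows of the kind a catalog query (for example against MySQL's information_schema.columns) might return, and the helper names fingerprint and find_drift are hypothetical.

```python
import hashlib
from collections import Counter


def fingerprint(columns):
    """Hash a node's column definitions, independent of row order."""
    canon = "\n".join("|".join(map(str, col)) for col in sorted(columns))
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


def find_drift(schemas):
    """Return the nodes whose schema fingerprint differs from the majority.

    schemas maps a node name to rows of (table, column, ordinal, column_type).
    """
    prints = {node: fingerprint(cols) for node, cols in schemas.items()}
    majority, _count = Counter(prints.values()).most_common(1)[0]
    return sorted(node for node, fp in prints.items() if fp != majority)


base = [
    ("orders", "id", 1, "bigint"),
    ("orders", "user_id", 2, "bigint"),
    ("orders", "amount", 3, "decimal(10,2)"),
]
schemas = {
    "shard0": list(base),
    "shard1": list(reversed(base)),  # same schema, different row order
    "shard2": base[:2] + [("orders", "amount", 3, "decimal(12,2)")],  # drifted
}
drifted = find_drift(schemas)
```

Comparing one hash per node keeps the check cheap; only when a node is flagged does the full column list need to be diffed.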

Q9: Are there open‑source Paxos‑based MySQL binlog synchronization implementations? Can Meituan‑Dianping’s MySQL Proxy (Atlas) flexibly support prepared statements or complex multi‑table joins?

A: MySQL Group Replication (MGR) uses a Paxos-based protocol for binlog synchronization. Atlas, the internal proxy, supports simple joins for non-sharded tables but has many restrictions in sharded environments: sharding keys must be integers; only SELECT, INSERT, REPLACE, DELETE, and UPDATE statements are supported; the INSERT … SET syntax is not supported; LIMIT, ORDER BY, GROUP BY, and UNION do not work across shards; sharded tables cannot use auto-increment columns; and many other SQL features are limited.

Q10: What is a "pre‑database" and how should it be configured?

A: The term is not standard in database terminology. It may refer to a source database that feeds data pipelines. In MySQL, a common approach is to parse the binlog (using open-source tools such as Canal, Databus, or Puma) to achieve low-impact incremental synchronization.
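The idea of incremental synchronization from a change log can be sketched without any real binlog parser. The event format below (dicts with pos, op, pk, and row) is entirely hypothetical and is not the API of Canal or any other tool; the sketch only shows applying row changes in log order and skipping events that were already applied.

```python
def apply_events(target, events, last_pos=0):
    """Apply hypothetical row-change events to target in log order.

    target   -- dict mapping primary key to row, standing in for the downstream table
    events   -- iterable of dicts with keys: pos, op, pk, and (for writes) row
    last_pos -- highest log position already applied; older events are skipped
    Returns the new highest applied position.
    """
    for ev in sorted(events, key=lambda e: e["pos"]):
        if ev["pos"] <= last_pos:
            continue  # already applied, so a replay is harmless
        if ev["op"] in ("insert", "update"):
            target[ev["pk"]] = ev["row"]
        elif ev["op"] == "delete":
            target.pop(ev["pk"], None)
        else:
            raise ValueError(f"unknown operation: {ev['op']}")
        last_pos = ev["pos"]
    return last_pos


events = [
    {"pos": 1, "op": "insert", "pk": 1, "row": {"name": "a"}},
    {"pos": 2, "op": "insert", "pk": 2, "row": {"name": "b"}},
    {"pos": 3, "op": "update", "pk": 1, "row": {"name": "a2"}},
    {"pos": 4, "op": "delete", "pk": 2},
]
replica = {}
checkpoint = apply_events(replica, events)
# Replaying the same batch from the saved checkpoint changes nothing.
checkpoint_again = apply_events(replica, events, last_pos=checkpoint)
```

Because the consumer only reads the log, the source database sees no extra query load, which is the "low-impact" property mentioned above.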

