Databases 12 min read

How Quora Optimizes MySQL Reads, Writes, and Storage with Caching and MyRocks

This article explains how Quora reduces database load by optimizing read patterns, improving high‑QPS queries with smarter caching, shrinking table size using MyRocks, and boosting write throughput through parallel replication settings, providing concrete steps, code snippets, and real‑world results.

JavaEdge
JavaEdge
JavaEdge
How Quora Optimizes MySQL Reads, Writes, and Storage with Caching and MyRocks

0 Database Load Main Components

Read

Data volume

Write

1 Optimize Reads

1.1 Different Read Types Need Different Optimizations

Complex queries (joins, aggregations)

When counting becomes a bottleneck, pre‑compute the counts in a separate table so that the application can read the count directly instead of recalculating it on each request.

Large scans

Apply LIMIT or pagination to restrict the number of rows scanned.

Schema‑query mismatches

Typical problems include missing indexes, indexes that do not cover enough columns, or column order in the index that is sub‑optimal for the query. Common fixes are:

Remove unnecessary columns from the SELECT list, especially those not covered by an index.

Eliminate ORDER BY clauses and perform sorting on the client side (MySQL CPU is usually more valuable than client CPU).

Delete queries that no longer provide needed functionality.

1.2 High‑QPS Queries

Even with optimized SQL and a good schema, queries that receive very high QPS can overload the database, often indicating ineffective caching.

Inefficient cache keys

Overly specific or narrow keys cause many cache misses, leading to high QPS.

Overly broad keys pull large data sets from the DB on each request.

User‑language table

Original cache key (uid, language_id) was too granular. Switching to a key based only on uid reduced QPS by over 90% while only slightly increasing the amount of data fetched per query.

A2A (ask‑to‑answer) table

The table links users, questions, and answerers, resulting in very high QPS. Two cache keys were used: (question_id, user_id) – users who have already requested answers. (question_id, answerer_id) – users who have requested the same answerer.

Because the key space is sparse (product of question count and user count), a supplemental cache keyed only by question_id was added, limiting entries to N per question. This cache reduced A2A QPS by 50‑66%.

1.3 Sparse One‑Dimensional Datasets

When caching individual question_id values, most entries are “no redirect,” wasting space. Quora switched to caching ranges (e.g., 123–127) as a single “no redirect” entry, cutting related QPS by about 90%.

2 Optimize Table Space

Table size impacts storage cost, I/O pressure, backup time, and overall performance. Quora migrated older shards to MyRocks (a RocksDB‑based storage engine) using a binlog‑replay tool and shadow‑read validation.

Create an empty table on the MyRocks master with the same schema.

Copy data and replay the binlog from the MySQL master to the MyRocks master.

Run shadow‑read tests to ensure identical results.

Switch traffic to the MyRocks shard.

Space savings: >80% for one shard, 50‑60% for another.

3 Optimize Writes

MySQL replication replays writes serially on replicas, limiting write scalability on multi‑core machines. Parallel replication can be enabled by configuring slave_parallel_workers and one of the following types: slave_parallel_type=LOGICAL_CLOCK (MySQL 8.0.26+; formerly replica_parallel_type) – works even when all tables reside in the same logical database. slave_parallel_type=DATABASE – requires tables to be spread across multiple logical databases. Tables can be moved with:

ALTER TABLE logical_db1.table RENAME logical_db2.mytable;

The logical‑database mapping is stored in ZooKeeper, allowing dynamic re‑assignment without code changes.

4 Conclusion

Quora combines read, write, and storage optimizations—caching strategies, MyRocks migration, and parallel replication—to dramatically reduce database load. Similar techniques can be applied to improve performance in other systems.

Reference: https://www.percona.com/blog/scaling-mysql-a-good-problem-to-have/

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlDatabase OptimizationMyRocksRead/Write Performance
JavaEdge
Written by

JavaEdge

First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.