Databases 8 min read

How Facebook Scales 2B Users with MySQL and the New Apollo NoSQL Engine

Since its inception, Facebook has relied on MySQL to handle data from over two billion users, but recent shifts toward NoSQL have led to the development of Apollo—a layered storage system inspired by Paxos, Raft, RocksDB, and custom APIs, aiming to improve scalability, latency, and fault tolerance.

21CTO
21CTO
21CTO
How Facebook Scales 2B Users with MySQL and the New Apollo NoSQL Engine

Since its founding, Facebook has been using MySQL. Most people are curious about the optimization methods that allow this social network to support data from over two billion users and run smoothly without obstacles.

In fact, when people talk about scalable databases, Facebook is almost always mentioned. In recent years, database management and growth has no more significant success story than Facebook.

Facebook may still use MySQL as its core engine, but as times change, the team led by Mark Zuckerberg is gradually favoring NoSQL databases, like most well‑known enterprises.

Shift Towards NoSQL Databases

Back in 2014, Facebook representative Jeff Johnson announced at the QCon conference in New York the launch of Apollo, Facebook’s own version of Paxos, a NoSQL‑like database. This new model is a layered storage system where data is stored in fragments, similar to HBase on regional servers.

The distinctive feature of this new NoSQL model is its online low‑latency storage system. It is not a document‑oriented data system. Apollo focuses on modifying data structures, allowing users to change structures without extensive preambles. Individual shards are tiny, ranging from 1 byte to 1 megabyte, while total data size can exceed 10 petabytes. The system can scale from a few servers to nearly a thousand.

Understanding Apollo

Data stored in the Apollo database exists as small fragments, each consisting of four different parts.

The first part is based on Raft, a quorum consensus protocol derived from Stanford’s mature leader election algorithm, giving Apollo a unique edge.

The second part is the storage system, inspired by RocksDB and built on Google’s LevelDB key/value store. Facebook can easily handle this storage, simulating other data structures, including legacy MySQL schemas. However, Apollo’s storage customization is not very friendly, so the database team is working to add MySQL‑like features.

The third part is Facebook’s native API. For Apollo this is a client API with read() and write() components; users must satisfy preconditions, after which Apollo returns the appropriate value.

The fourth part is the Fault‑Tolerant State Machine (FTSM). Each shard runs its own FTSM; if a shard spans three machines, all execute the same code. If one instance crashes, the others continue in an acceptable order, ensuring continuity.

Although Apollo has caused a stir, it is not yet used in production. Facebook hopes to replace some Memcached use cases. Rumors suggest Apollo could become an outbound message queue for iOS devices and carriers, suitable for data analysis and improving extraction speed and accuracy.

Why Does Apollo Exist?

Facebook uses state machines for load balancing, shard generation and management, cross‑machine transaction coordination, and data migration. These machines send RPC requests and, when a data state change is needed, the change must pass through Raft and gain agreement from all servers.

What Is Facebook Currently Using?

Apollo is still “under construction” and far from being Facebook’s daily database. The social‑network giant continues to rely on a heavily modified MySQL installation, enhanced with features such as:

innodb_read_io_threads and innodb_write_io_threads – configuring background I/O threads.

innodb_io_capacity – setting per‑server I/O capacity limits.

innodb_max_merged_io – defining the maximum number of adjacent I/O requests that can be merged.

Unlike Apollo, Facebook currently uses MySQL as a key/value store, distributing data across many logical instances on physical nodes, with load balancing performed at the physical‑node level.

Rumor has it that Facebook operates about 1,800 MySQL servers managed by only three database administrators, a seemingly impossible task, yet the platform continues to run.

MySQL remains simple and powerful; Facebook and similar social‑network sites are unlikely to replace their always‑on MySQL infrastructure with a new NoSQL server for many years.

21CTO Community Comprehensive Report
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlNoSQLFacebookRocksDBApolloRaftdatabase scalability
21CTO
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.