
How Facebook Scaled Continuous Delivery for Web and Mobile at Massive Scale

This article explains how Facebook evolved its release engineering process from manual cherry‑picking to a quasi‑continuous push‑from‑master system, enabling thousands of daily code changes, reducing hot‑patches, supporting global engineers, and improving user experience across web and mobile platforms.

21CTO

The software industry has introduced many methods for delivering code faster, more safely, and more reliably: continuous integration, continuous delivery, agile development, DevOps, and test‑driven development. They share a common goal: letting developers ship small, incremental changes to users quickly and correctly.

Facebook’s development and deployment process grew organically to cover most of these rapid‑iteration techniques without relying on any single one of them, allowing the company to ship web and mobile products quickly.

For years Facebook used a simple master‑and‑release‑branch strategy that deployed code three times a day. Engineers cherry‑picked changes that had passed automated tests from the master branch into a daily release branch, typically 500‑700 changes per day, while the remaining changes went out with a weekly release branch.

As the team grew from a few engineers in 2007 to thousands, the speed of code delivery scaled proportionally, but manual effort from release engineers became a bottleneck. By 2016 the branch/cherry‑pick model hit its limits, handling over 1,000 daily changes and up to 10,000 weekly changes, requiring unsustainable manual coordination.

In April 2016 Facebook began moving facebook.com to a quasi‑continuous “push‑from‑master” system. Over the next year the rollout progressed in steps: first to 50 % of engineers, then to 0.1 %, 1 %, and 10 % of production. In April 2017, after a year of planning and development, the final step took three days, after which 100 % of production ran code deployed directly from master.

Large‑Scale Continuous Delivery

A true continuous delivery system would release every code change as soon as it lands, but Facebook’s commit rate called for a system that pushes dozens to hundreds of changes every few hours. In this quasi‑continuous mode, each release is small and incremental and rarely has a visible effect on the user experience. It is rolled out in tiers over a few hours until it reaches 100 % of production, so a problematic release can be stopped or rolled back quickly.

Changes must first pass a suite of automated internal tests before they land on the master branch. If a regression is detected once a push is under way, a push‑blocking alert is raised, and an emergency stop button can keep the release from going any further. Healthy releases are pushed to 2 % of production for additional monitoring and then to 100 % of production, where a tool called Flytrap aggregates user reports and alerts on anomalies.
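The staged push described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Facebook's tooling: the names Stage, run_rollout, is_healthy, and stop_requested are hypothetical, and the stage list only mirrors the 2 % and 100 % figures from the text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    percent: float  # share of production that receives the release at this stage


def run_rollout(
    stages: List[Stage],
    is_healthy: Callable[[Stage], bool],
    stop_requested: Callable[[], bool] = lambda: False,
) -> List[str]:
    """Advance a release stage by stage; halt on a regression or a manual stop.

    Returns the names of the stages the release actually reached.
    """
    reached: List[str] = []
    for stage in stages:
        if stop_requested():  # stands in for the emergency stop button
            break
        reached.append(stage.name)  # the release is now live at this stage
        if not is_healthy(stage):  # stands in for a push-blocking alert
            break
    return reached


# Hypothetical stage list mirroring the percentages mentioned in the text.
STAGES = [Stage("canary", 2.0), Stage("full", 100.0)]

if __name__ == "__main__":
    print(run_rollout(STAGES, is_healthy=lambda s: True))                # ['canary', 'full']
    print(run_rollout(STAGES, is_healthy=lambda s: s.name != "canary"))  # ['canary']
```

In this sketch a failed health check stops the release at the stage where the problem appeared, so the wider audience never receives it. A real system would also need to undo the partial push, which is omitted here.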

Many changes initially ship behind the Gatekeeper system, which lets mobile and web code be released independently of the new features it contains and so reduces the risk of any single release. If a gated feature causes a problem, its gate can be switched off, with no need to roll back to a previous version or fix forward.
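The idea of shipping code behind a runtime gate can be illustrated with a toy feature gate. This is not Gatekeeper's API, which the article does not describe; the class FeatureGate, its parameters, and the hash‑based bucketing are all assumptions made for the example.

```python
import hashlib


class FeatureGate:
    """Toy feature gate: code ships dark and is switched on per user at runtime."""

    def __init__(self, name: str, rollout_percent: float = 0.0, enabled: bool = True):
        self.name = name
        self.rollout_percent = rollout_percent  # share of users who see the feature
        self.enabled = enabled                  # kill switch for the whole gate

    def _bucket(self, user_id: str) -> float:
        # Hash gate name + user id so each user lands in a stable bucket in [0, 100).
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return (int(digest[:8], 16) % 10000) / 100.0

    def allows(self, user_id: str) -> bool:
        if not self.enabled:
            return False
        return self._bucket(user_id) < self.rollout_percent


def render_composer(gate: FeatureGate, user_id: str) -> str:
    # Both code paths are deployed; the gate decides which one runs.
    return "new composer" if gate.allows(user_id) else "old composer"


if __name__ == "__main__":
    gate = FeatureGate("new_composer", rollout_percent=100.0)
    print(render_composer(gate, "user-1"))  # new composer
    gate.enabled = False                    # switch the gate off, no redeploy
    print(render_composer(gate, "user-1"))  # old composer
```

Because the decision is made at request time, turning the gate off takes effect without a new release, which is the property the paragraph above relies on.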

The quasi‑continuous release cycle offers several benefits:

1. Eliminates the need for hot‑patches

Under the old schedule of three deployments a day, an urgent change that missed a scheduled push required a disruptive hot‑patch. With releases every few hours, most such changes can simply be merged to master and go out in the next release.

2. Better support for a global engineering team

Instead of forcing all engineers to focus on a single weekly release time that may be inconvenient across time zones, the new system lets engineers worldwide develop and deliver code as needed.

3. Drives development of next‑generation tools, automation, and processes

The project served as a pressure test across multiple teams, leading to improvements in push tools, diff‑review tools, testing infrastructure, capacity management, traffic routing, and more, preparing the company for future scaling.

4. Improves user experience and speed

Engineers receive feedback on their changes within hours rather than weeks, enabling faster bug fixes and enhancements, and allowing the infrastructure to better handle rare events that affect users.

Continuous Delivery to Mobile

A quasi‑continuous system was feasible for the web stack, but mobile platforms posed additional challenges because of the existing tooling. Facebook addressed this by building and open‑sourcing a suite of mobile‑focused tools, including Nuclide, Buck, Phabricator, various iOS libraries, React Native, and Infer, that help produce high‑quality code for rapid mobile deployment.

The mobile CI stack consists of three layers: build, static analysis, and testing.

When developers commit to the mobile master branch, the code is built for all affected products (Facebook, Messenger, Pages Manager, Instagram, etc.) across multiple architectures and simulators.

During the build, Infer runs static analysis to catch null‑pointer exceptions, resource and memory leaks, unused variables, risky system calls, and violations of Facebook’s coding standards.

The third layer runs mobile automation tests, including thousands of unit, integration, and end‑to‑end tests driven by Robolectric, XCTest, JUnit, and WebDriver.

Each commit triggers the build and test stack multiple times throughout the code’s lifecycle; on Android alone, 50,000–60,000 builds are performed daily.

Applying traditional continuous delivery techniques to the mobile stack has shortened the release cycle from four weeks to two weeks and now to one week. Although only one version is released per week, early testing still gives engineers rapid feedback: approximately one million Android beta testers receive new mobile candidate builds daily.

Although the mobile engineering team grew 15‑fold from 2012 to 2016 and code was delivered faster, productivity per engineer, measured by lines of code or push count, remained stable. The number of critical issues per mobile release did not increase either, indicating that code quality was maintained.

Compiled by: wrm Link: https://code.facebook.com