
Facebook's Scalable Continuous Delivery System

This article explains how Facebook built a highly scalable continuous delivery pipeline for both web and mobile products, evolving from simple trunk‑based releases with cherry‑picks to a quasi‑continuous deployment model that supports thousands of engineers, rapid rollouts, and robust quality monitoring.

Continuous Delivery 2.0

The software industry has adopted many practices—continuous integration, continuous delivery, agile development, DevOps, and test‑driven development—to enable developers to release code quickly, safely, and in small incremental steps.

Facebook’s development and deployment workflow has evolved into a flexible, pragmatic system that does not strictly follow any single methodology, allowing rapid releases of both web and mobile products.

Before 2017, Facebook used a simple trunk‑based branching strategy with cherry‑picks, pushing the web site three times a day. Engineers cherry‑picked tested changes from the main branch onto a release branch, amounting to 500–700 cherry‑picks daily, and a new release branch was cut each week to carry the changes that had not yet been picked.

This approach scaled as the engineering team grew from a few hundred in 2007 to thousands, but the manual effort required to coordinate the daily and weekly deployments became unsustainable. By 2016 the trunk‑based model had hit its limits, handling over 1,000 diffs per day and up to 10,000 diffs per week.

In April 2016 Facebook introduced a “quasi‑continuous deployment” mechanism, rolling it out gradually—first to 50% of employees, then to 0.1%, 1%, and finally 10% of production traffic—to validate the tools and processes and to ensure no degradation of the user experience.

After nearly a year of planning and development, by April 2017 all production servers for the web site were running code directly from the main trunk.

The deployment workflow proceeds in stages:

First, a change must pass a suite of internal automated tests before it can be merged to the trunk.

Next, the diff is pushed to internal users (Facebook employees).

If everything is healthy, the change is rolled out to 2% of production, where quality signals are collected and alerts are monitored.

Finally, the change is rolled out to 100% of production, with the Flytrap tool aggregating user reports and alerting on anomalies.
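The staged workflow above can be sketched as a simple gated loop: each stage widens exposure only if health signals stay clean. This is a minimal illustration, not Facebook's actual tooling—the stage names follow the article, but `healthy`, `roll_out`, and the metrics format are all hypothetical.

```python
# Hedged sketch of the staged rollout described above. Stages and the
# "halt on bad signal" behavior come from the article; all names and the
# metrics shape are illustrative assumptions.

ROLLOUT_STAGES = ["employees", "2% production", "100% production"]

def healthy(metrics):
    """Toy health check: no alert fired and error rate under a threshold."""
    return not metrics["alerts"] and metrics["error_rate"] < 0.01

def roll_out(change, collect_metrics):
    """Advance `change` through the stages, stopping at the first bad signal.

    `collect_metrics(stage)` is assumed to return a dict such as
    {"alerts": False, "error_rate": 0.002} for that stage.
    Returns the list of stages the change actually reached.
    """
    reached = []
    for stage in ROLLOUT_STAGES:
        reached.append(stage)
        if not healthy(collect_metrics(stage)):
            break  # halt the push; the change never reaches wider traffic
    return reached
```

In practice the health check would aggregate the quality signals, alerts, and Flytrap reports the article mentions; the point of the sketch is only that promotion to the next stage is conditional, never automatic.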

The Gatekeeper feature‑flag system separates feature rollout from code version, allowing quick disabling of problematic features without rolling back code.
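The core idea behind Gatekeeper—decoupling feature exposure from code deployment—can be shown with a minimal feature-flag sketch. The real Gatekeeper supports rich per-user targeting; this toy version, with invented names like `FeatureFlags` and `render_feed`, illustrates only that new code ships dark and a flag flip, not a code rollback, disables a misbehaving feature.

```python
# Minimal feature-flag sketch, assuming a Gatekeeper-like model where
# unknown features default to off. All class and feature names here are
# illustrative, not Facebook's actual API.

class FeatureFlags:
    def __init__(self):
        self._enabled = {}

    def set(self, feature, on):
        self._enabled[feature] = on

    def check(self, feature):
        # Unknown features default to off, so newly deployed code is dark.
        return self._enabled.get(feature, False)

def render_feed(flags):
    """Code path for a new feature ships alongside the old one; the flag
    decides which runs, so turning it off needs no redeploy."""
    if flags.check("new_ranking"):
        return "feed:new-ranking"
    return "feed:classic"

flags = FeatureFlags()
flags.set("new_ranking", True)    # gradual enable after deploy
flags.set("new_ranking", False)   # kill switch: feature off, code stays deployed
```

The design choice this captures is that the deployed binary always contains both paths; operational control moves from the release pipeline to a configuration system that can react in seconds.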

Benefits of this quasi‑continuous deployment include eliminating the need for hot‑fixes, providing better support for globally distributed engineering teams, driving improvements in tooling and infrastructure, and delivering a faster, better user experience.

For mobile, Facebook faced additional challenges because many existing tools hinder rapid iteration. They built a dedicated stack—including Buck, Phabricator, React Native, and the static analysis tool Infer—organized into three layers: build, static analysis, and testing.

Each commit triggers builds for multiple products (Facebook, Messenger, Instagram, etc.) across all supported architectures, runs linters and Infer to catch null pointers, leaks, and policy violations, and executes thousands of unit, integration, and end‑to‑end tests (Robolectric, XCTest, JUnit, WebDriver). Android builds alone number 50,000–60,000 per day.
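The three-layer, per-commit pipeline described above can be sketched as a fan-out over products and architectures followed by analysis and test gates. This is an illustrative outline under stated assumptions: every function here is a hypothetical stand-in for Buck builds, Infer and the linters, and the test runners.

```python
# Illustrative sketch of the per-commit mobile pipeline (build, static
# analysis, test) described in the article. Function names are invented
# stand-ins, not real Buck/Infer/test-harness APIs.

def run_pipeline(commit, products, architectures):
    """Fan a commit out across products x architectures, then run the
    static-analysis and test layers. Returns a per-layer report."""
    report = {"build": [], "static_analysis": None, "tests": None}

    # Layer 1: build every product for every supported architecture.
    for product in products:
        for arch in architectures:
            report["build"].append((product, arch, build(commit, product, arch)))

    # Layer 2: linters plus Infer-style checks (null derefs, leaks, policy).
    report["static_analysis"] = analyze(commit)

    # Layer 3: unit, integration, and end-to-end suites.
    report["tests"] = run_tests(commit)
    return report

# Toy stand-ins so the sketch runs; real implementations would invoke
# Buck, Infer, and runners such as Robolectric or XCTest.
def build(commit, product, arch): return "ok"
def analyze(commit): return {"null_derefs": 0, "leaks": 0}
def run_tests(commit): return {"passed": True}
```

Even this toy fan-out shows why the volume is so high: two products times two architectures already quadruples the build count per commit, which is how Android alone reaches 50,000–60,000 builds a day.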

Applying these practices reduced the mobile release cycle from four weeks to one week, using the same branch/cherry‑pick model as the web team, with canary releases to 2% of users and a beta pool of about one million Android testers.

Despite a fifteen‑fold growth in the mobile engineering team and a higher deployment frequency, productivity and code quality remained stable, as measured by consistent issue rates per commit.

Facebook’s release engineering team continues to improve the process, sharing tools and best practices to help other organizations achieve scalable, reliable continuous delivery.

Tags: Deployment, DevOps, Continuous Delivery, Facebook, release engineering, scalable systems
Written by Continuous Delivery 2.0

Tech and case studies on organizational management, team management, and engineering efficiency