
Why Google and Facebook Skip Docker: Lessons from Monolithic Repos and Layered Packaging

The article explains how Google and Facebook’s monolithic repositories and unified build systems let them avoid Docker images by using direct module transfer, tarballs, XAR files, and overlay filesystems, while highlighting the technical trade‑offs and challenges of layered caching in large‑scale clusters.

MaGe Linux Operations

Background

All technical details mentioned can be found in open‑source projects and research papers. The author wrote this after trying to speed up a modified distributed PyTorch program on Facebook’s clusters, illustrating the knowledge required for industrial machine learning.

After graduating in 2007, the author worked at Google for three years, admiring the Borg distributed OS. Leaving Google in 2010, they awaited an open‑source version until Kubernetes appeared.

Kubernetes vs Borg Terminology

Kubernetes schedules containers (the word literally means “shipping containers”) that run images, much as an operating system schedules processes that run programs. Borg never exposed containers or images, which raises the question of why Kubernetes introduced them.

Monolithic Repository Insight

Both Google and Facebook keep their code in a monolithic repository with a unified build system (Blaze/Bazel at Google, Buck at Facebook). Because all code lives in a single repo, the build system knows which modules have changed and can sync them directly to cluster nodes, without first creating Docker images or ZIP, tarball, RPM, or DEB packages.
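A minimal sketch of the "sync only what changed" idea, assuming each side can produce a map from file path to content hash. The function names `digest_tree` and `files_to_sync` are hypothetical and are not part of Bazel or Buck; the real build systems track changes through their own build graphs.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def files_to_sync(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Return files that are missing on the node or differ from the node's copy."""
    return sorted(
        name for name, digest in local.items() if remote.get(name) != digest
    )
```

Only the paths returned by `files_to_sync` would need to travel to the node; unchanged build outputs stay where they are.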

Packaging Options

Tarball: The simplest option is to archive the files (e.g., {A,B,C}.py, {D,E,F}.so) into A.zip or A.tar.gz. Putting a version identifier in the file name (e.g., A-953bc.zip) lets a node reuse an archive it has already cached.

XAR: Facebook’s XAR format wraps a SquashFS image with a header so that it can be mounted as a loopback filesystem. After building with Buck, the XAR file (e.g., A-953bc.xar) contains the files, and mounting it yields a temporary mount point:

xarexec -m A-953bc.xar

Multiple XAR files can form a chain of layers (e.g., A-953bc.xar → B-953bc.xar → D-953bc.xar …), but they cannot be mounted one after another on the same mount point, because each new mount hides whatever was mounted there before.
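The versioned archive names in the tarball option above can be sketched as follows. This assumes the version suffix is derived from a hash of the archive's contents, which the article does not state; `build_versioned_tarball` and `needs_upload` are hypothetical helper names.

```python
import hashlib
import io
import tarfile


def build_versioned_tarball(name: str, files: dict[str, bytes]) -> tuple[str, bytes]:
    """Pack files into a .tar.gz whose file name carries a content-derived version."""
    hasher = hashlib.sha256()
    for path in sorted(files):
        data = files[path]
        # Hash the path, the length, and the bytes so the digest is unambiguous.
        hasher.update(f"{path}\0{len(data)}\0".encode())
        hasher.update(data)
    version = hasher.hexdigest()[:5]  # short suffix, in the style of "953bc"

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(files):
            info = tarfile.TarInfo(name=path)
            info.size = len(files[path])
            archive.addfile(info, io.BytesIO(files[path]))
    return f"{name}-{version}.tar.gz", buffer.getvalue()


def needs_upload(archive_name: str, node_cache: set[str]) -> bool:
    """A node that already holds an archive with this name can skip the transfer."""
    return archive_name not in node_cache
```

Because the name changes whenever the contents change, a node can decide from the name alone whether its cached copy is still valid.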

Overlay Filesystem

Using fuse-overlayfs, several directories can be overlaid into one view:

fuse-overlayfs -o lowerdir="/tmp/A-953bc:/tmp/B-953bc:..." /packages/A-953bc

The lower directories are the mount points of the XAR files, effectively making each XAR a layer.
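To illustrate how the merged view picks files, here is a small sketch of the lookup rule for stacked lower directories: the leftmost directory in `lowerdir` is the top layer and wins when several layers contain the same path. It is a simplification that ignores whiteouts and directory merging, and both function names are hypothetical.

```python
import os
from typing import Optional


def lowerdir_option(lower_dirs: list[str]) -> str:
    """Build the lowerdir=... mount option; the first entry is the top layer."""
    return "lowerdir=" + ":".join(lower_dirs)


def resolve_in_overlay(lower_dirs: list[str], relative_path: str) -> Optional[str]:
    """Return the file a merged view would expose for relative_path, if any."""
    for layer in lower_dirs:  # search from the top layer down
        candidate = os.path.join(layer, relative_path)
        if os.path.exists(candidate):
            return candidate
    return None
```

With the XAR mount points from the command above as `lower_dirs`, a file present in both /tmp/A-953bc and /tmp/B-953bc would resolve to the copy under /tmp/A-953bc.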

Docker Image and Layers

Docker images consist of multiple layers, which an overlay filesystem combines into a single view. When pulling an image, layers that are already cached locally are skipped, saving bandwidth. Docker’s overlay2 storage driver (and the older overlay driver) relies on the kernel’s OverlayFS, and mounting it requires root privileges, which raises security concerns.

A FUSE‑based implementation such as fuse-overlayfs runs in user space and can be used as an alternative, though with lower performance.
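The bandwidth saving from cached layers can be sketched as a set difference over layer digests. This is only an illustration of the idea, not Docker's actual pull logic; the digests, sizes, and function names are made up for the example.

```python
def layers_to_pull(manifest: list[str], cached: set[str]) -> list[str]:
    """Return the layers of an image that are not yet in the local cache."""
    return [digest for digest in manifest if digest not in cached]


def download_size(manifest: list[str], sizes: dict[str, int], cached: set[str]) -> int:
    """Total bytes that still have to be transferred for this image."""
    return sum(sizes[digest] for digest in layers_to_pull(manifest, cached))


# Hypothetical image: a large base layer, a dependency layer, a small app layer.
manifest = ["sha256:base", "sha256:deps", "sha256:app"]
sizes = {"sha256:base": 800, "sha256:deps": 150, "sha256:app": 5}
```

If a node already holds the base and dependency layers, only the small application layer needs to be fetched after a code change.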

Why Google and Facebook Do Not Use Docker

Because their monolithic repos and build systems can directly transfer compiled modules, they do not need packaging concepts like Docker images. Historically, they built fully static binaries, eliminating the need for .so libraries or containers.

Languages such as Java (using JAR), Python (using PAR/subpar), and Go (static linking) fit this model. However, static linking leads to large binaries and longer rebuild times, so most other companies prefer layered Docker images for cache efficiency.

Technical Challenge of a Perfect Solution

A perfect solution should support layered or chunked caching while handling the granularity mismatch between build‑system modules and higher‑level projects. For C/C++, many .so files would create too many layers, slowing startup due to symbol resolution.

One approach is graph partitioning: group the modules into sub‑graphs, compile the modules of each sub‑graph into static archives (.a), link those into a single shared library (.so) per sub‑graph, and use the shared libraries as cache units.
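As a rough sketch of the partitioning step, the following groups modules from a dependency graph into fixed-size chunks, visiting dependencies first. This is a deliberately naive heuristic chosen for illustration, not the partitioning algorithm used by Google or Facebook; it assumes the graph is acyclic, and the function names are hypothetical.

```python
def partition_modules(deps: dict[str, list[str]], max_size: int) -> list[list[str]]:
    """Greedily group modules, dependencies first, into chunks of at most max_size."""
    order: list[str] = []
    visited: set[str] = set()

    def visit(module: str) -> None:
        if module in visited:
            return
        visited.add(module)
        for dependency in deps.get(module, []):
            visit(dependency)
        order.append(module)  # appended only after all of its dependencies

    for module in sorted(deps):
        visit(module)
    return [order[i:i + max_size] for i in range(0, len(order), max_size)]


def stale_groups(groups: list[list[str]], changed: set[str]) -> list[int]:
    """Indices of groups (one .so each) that contain at least one changed module."""
    return [i for i, group in enumerate(groups) if changed.intersection(group)]
```

Each group would be linked into one shared library. A change to a single module then invalidates only the group that contains it, while `max_size` bounds how many libraries have to be loaded at startup. A real partitioner would also weigh module sizes and how often modules change together.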

References

https://engineering.fb.com/2019/06/06/data-center-engineering/twine/

https://zhuanlan.zhihu.com/p/55452964

https://bazel.build/

https://buck.build/

https://github.com/facebookincubator/xar

https://tldp.org/HOWTO/SquashFS-HOWTO/creatingandusing.html

https://docs.docker.com/storage/storagedriver/select-storage-driver/

https://github.com/google/subpar

Original source: https://zhuanlan.zhihu.com/p/368676698
