How Google Automates the Hunt for Dead Code in Its Billion‑Line Monorepo

Google’s monorepo contains billions of lines of code. The Sensenmann project automatically identifies and removes dead code by analyzing build dependencies, runtime logs, and test coverage, cutting maintenance costs while navigating both technical and cultural challenges.

21CTO

The Problem of Dead Code

Large software projects inevitably accumulate "dead code": modules that are no longer used or have not run for years. Maintaining such code incurs ongoing costs because automated test systems cannot tell which code is unused and so keep building and testing it, while manual cleanup consumes valuable engineering time.

Google’s Unique Monorepo

Google stores all of its code in a single, massive repository managed by the Piper system, containing billions of lines contributed by tens of thousands of engineers over decades. This openness enables powerful search and cross‑project updates but also makes large‑scale code removal risky.

Sensenmann: Automatic Code Deletion

The Sensenmann project (German for “grim reaper”) automatically detects unused code, creates change lists for code review, and deletes the code once a change is approved. It submits over 1,000 change lists per week and has already removed about 5% of Google’s C++ code.

Identifying Deletable Code

Google’s internal build system Blaze (the internal counterpart of the open-source Bazel) represents dependencies between binaries, libraries, tests, and source files as a graph, which makes it possible to find libraries that are not linked into any binary. Runtime logs record when internal binaries are executed; a binary with no recent runs, together with anything only it depends on, becomes a deletion candidate.
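The two signals described above can be combined in a few lines. The following is a minimal sketch, not Google's implementation: the target names, the `DEPS` and `LAST_RUN` tables, and the 365-day idle threshold are all hypothetical. It treats recently executed binaries as live roots and flags every target that no live root reaches.

```python
from datetime import date, timedelta

# Hypothetical build graph: target -> direct dependencies.
DEPS = {
    "//app:server": ["//lib:auth", "//lib:net"],
    "//app:old_tool": ["//lib:legacy", "//lib:net"],
    "//lib:auth": ["//lib:net"],
    "//lib:net": [],
    "//lib:legacy": [],
    "//lib:orphan": [],  # not linked into any binary
}
BINARIES = {"//app:server", "//app:old_tool"}

# Hypothetical runtime log summary: binary -> date of last recorded execution.
LAST_RUN = {
    "//app:server": date(2024, 5, 30),
    "//app:old_tool": date(2021, 1, 4),
}


def deletion_candidates(deps, binaries, last_run, today, max_idle_days=365):
    """Return targets not reachable from any recently executed binary."""
    cutoff = today - timedelta(days=max_idle_days)
    # A binary with no log entry at all is treated as never run.
    live_roots = [b for b in binaries if last_run.get(b, date.min) >= cutoff]

    alive = set()
    stack = list(live_roots)
    while stack:  # iterative depth-first traversal of the dependency graph
        target = stack.pop()
        if target in alive:
            continue
        alive.add(target)
        stack.extend(deps.get(target, []))

    return sorted(set(deps) - alive)


candidates = deletion_candidates(DEPS, BINARIES, LAST_RUN, today=date(2024, 6, 1))
print(candidates)
```

In this toy graph the stale binary, the library only it uses, and the unlinked library are flagged, while `//lib:net` survives because a live binary still depends on it.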

Exceptions exist, such as API examples, test‑only libraries, or code without logging signals, so a “shield list” is used to prevent accidental removal of critical components.
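One plausible shape for such a shield list is a set of path patterns checked before any change list is created. The sketch below assumes glob-style patterns and made-up target names; the source does not describe the actual format Google uses.

```python
from fnmatch import fnmatchcase

# Hypothetical shield list: glob patterns for targets that must never be auto-deleted.
SHIELD_PATTERNS = [
    "//examples/*",       # API examples that are never linked into a binary
    "//lib:payments_*",   # code whose usage leaves no logging signal
]


def apply_shield(candidates, patterns):
    """Split deletion candidates into (still deletable, shielded)."""
    deletable, shielded = [], []
    for target in candidates:
        if any(fnmatchcase(target, pattern) for pattern in patterns):
            shielded.append(target)
        else:
            deletable.append(target)
    return deletable, shielded


deletable, shielded = apply_shield(
    ["//examples/api:demo", "//lib:legacy", "//lib:payments_core"],
    SHIELD_PATTERNS,
)
print(deletable)
print(shielded)
```

Keeping the shielded targets in a separate list, rather than silently dropping them, makes it easy to audit what the shield is protecting.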

Handling Dependency Cycles

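A library and its tests effectively form a cycle: the test depends on the library, and the test should live or die with the library. To deal with such cycles, Google uses Tarjan’s algorithm to find strongly connected components and treats each component as a single unit. Components reachable from live code are marked “alive”, and the remaining “dead” components are isolated for removal, though matching tests to libraries can be non‑trivial in practice.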
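A compact way to see how this works is to run Tarjan's algorithm on a small graph and propagate liveness across the resulting components. This is an illustrative sketch under stated assumptions: the node names are invented, and the synthetic edge from each library to its test is one way to express "the test lives or dies with the library", not a description of Google's internal data model.

```python
def tarjan_scc(graph):
    """Return strongly connected components in reverse topological order."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:  # node is the root of a component
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def dead_components(graph, live_roots):
    """Return components that no live root reaches."""
    alive = set(live_roots)
    dead = []
    # Reversing Tarjan's output gives a topological order: dependents first.
    for component in reversed(tarjan_scc(graph)):
        if component & alive:
            alive |= component
            for node in component:
                alive.update(graph.get(node, []))
        else:
            dead.append(component)
    return dead


# Hypothetical graph; each library also points at its test (synthetic edge).
GRAPH = {
    "server": ["auth"],
    "auth": ["auth_test"],
    "auth_test": ["auth"],
    "legacy": ["legacy_test", "util"],
    "legacy_test": ["legacy"],
    "util": [],
}

dead = dead_components(GRAPH, live_roots={"server"})
print(sorted(sorted(component) for component in dead))
```

Here `legacy` and `legacy_test` are reported together as one dead component, so the library and its test can be deleted in the same change. The recursive `visit` is fine for a small example; a graph with very deep dependency chains would need an iterative variant to stay within Python's recursion limit.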

Overcoming Cultural Resistance

Engineers initially resist automated deletions, similar to early skepticism toward unit testing. Google addresses this by crafting clear change descriptions, providing concise supporting documentation, and actively managing feedback to improve acceptance.

Impact and Takeaways

According to Google’s engineering blog, the automated deletion effort yields a return on investment on the order of tens of times its cost: it has removed about 5% of Google’s C++ code, and with it the effort of building, testing, and maintaining that code. Organizations with large monolithic codebases should consider similar automated cleanup strategies.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
