How Google Automates the Hunt for Dead Code in Its Billion‑Line Monorepo
Google’s monorepo contains billions of lines of code. The Sensenmann project automatically identifies and removes dead code from it by analyzing build dependencies, runtime logs, and test coverage, cutting maintenance costs while navigating both technical and cultural challenges.
The Problem of Dead Code
Large software projects inevitably accumulate "dead code": modules that are no longer used or have not run for years. Such code still incurs ongoing costs. Automated test systems cannot easily tell which code is unused, so it continues to be built and tested, and manual cleanup consumes valuable engineering time.
Google’s Unique Monorepo
Google stores all of its code in a single, massive repository managed by the Piper system, containing billions of lines contributed by tens of thousands of engineers over decades. This openness enables powerful search and cross‑project updates but also makes large‑scale code removal risky.
Sensenmann: Automatic Code Deletion
The Sensenmann project (“Sensenmann” is German for “grim reaper”) automatically detects unused code, creates change lists for code review, and deletes the code once the change is approved. It submits over 1,000 change lists per week and has already removed about 5% of Google’s C++ code.
Identifying Deletable Code
Google’s internal build system, Blaze (the internal counterpart of the open-source Bazel), represents the dependencies between binaries, libraries, tests, and source files as a graph, which makes it possible to find libraries that are not linked into any binary. Runtime logs record when internal binaries are executed; a binary with no recent activity is flagged as a deletion candidate.
Exceptions exist, such as API examples, test‑only libraries, or code without logging signals, so a “shield list” is used to prevent accidental removal of critical components.
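The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sensenmann’s actual implementation: the target names, the `DEPS` and `LAST_RUN` tables, the one-year idle threshold, and the function `deletion_candidates` are all hypothetical. Binaries that ran recently and shielded targets are treated as live roots, everything they depend on is kept, and whatever remains is proposed for deletion.

```python
from datetime import date, timedelta

# Hypothetical build graph: each target maps to the targets it depends on.
DEPS = {
    "//app:server": ["//lib:auth", "//lib:util"],
    "//app:old_tool": ["//lib:legacy", "//lib:util"],
    "//lib:auth": ["//lib:util"],
    "//lib:legacy": [],
    "//lib:util": [],
    "//lib:orphan": [],
    "//examples:api_demo": [],
}

# Hypothetical runtime log summary: binary -> date it was last executed.
LAST_RUN = {
    "//app:server": date(2024, 6, 1),
    "//app:old_tool": date(2019, 3, 15),
}

# Targets that must never be proposed for deletion (e.g. API examples).
SHIELD_LIST = {"//examples:api_demo"}


def deletion_candidates(deps, last_run, shield, today, max_idle_days=365):
    """Return targets not needed by any recently run binary or shielded target."""
    cutoff = today - timedelta(days=max_idle_days)
    roots = [binary for binary, ran in last_run.items() if ran >= cutoff]
    roots.extend(shield)

    # Depth-first walk: everything reachable from a live root stays alive.
    alive = set()
    stack = list(roots)
    while stack:
        target = stack.pop()
        if target in alive:
            continue
        alive.add(target)
        stack.extend(deps.get(target, []))

    return sorted(set(deps) - alive)


candidates = deletion_candidates(DEPS, LAST_RUN, SHIELD_LIST, today=date(2024, 7, 1))
print(candidates)
```

In this toy data, `//lib:util` survives because the recently run `//app:server` still depends on it, while the stale binary `//app:old_tool` and the libraries only it uses become candidates.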
Handling Dependency Cycles
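A library’s tests depend on the library, and libraries can depend on each other in cycles, so a simple search for targets with no dependents would never flag them. To deal with this, Google groups libraries and their tests into strongly connected components, which it finds with Tarjan’s algorithm, and evaluates each component as a unit. Components that are reachable from code known to be in use are marked “alive”; the remaining “dead” components are isolated for removal. Matching tests to their libraries can be non‑trivial in practice.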
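A minimal sketch of this idea in Python follows. It is a textbook recursive Tarjan implementation applied to a made-up graph, not Google’s code. The target names, the `TEST_OF` mapping, and the helper functions are hypothetical, and adding an edge from each library back to its test is one assumed way of making the pair fall into the same component.

```python
from collections import defaultdict


def tarjan_scc(graph):
    """Return the strongly connected components of graph (node -> successors)."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):  # recursive; very deep graphs would need an iterative form
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for nxt in graph.get(node, []):
            if nxt not in index:
                visit(nxt)
                lowlink[node] = min(lowlink[node], lowlink[nxt])
            elif nxt in on_stack:
                lowlink[node] = min(lowlink[node], index[nxt])
        if lowlink[node] == index[node]:  # node is the root of a component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def with_test_cycles(deps, test_of):
    """Copy deps and add a library -> test edge so each pair forms a cycle."""
    graph = {node: list(targets) for node, targets in deps.items()}
    for test, library in test_of.items():
        graph.setdefault(library, []).append(test)
    return graph


def dead_components(graph, alive_roots):
    """Return components that no alive root can reach, each as a sorted list."""
    components = tarjan_scc(graph)
    comp_of = {n: i for i, comp in enumerate(components) for n in comp}
    comp_edges = defaultdict(set)
    for node, targets in graph.items():
        for target in targets:
            if comp_of[node] != comp_of[target]:
                comp_edges[comp_of[node]].add(comp_of[target])
    alive, stack = set(), [comp_of[root] for root in alive_roots]
    while stack:
        comp = stack.pop()
        if comp not in alive:
            alive.add(comp)
            stack.extend(comp_edges[comp])
    return sorted(sorted(c) for i, c in enumerate(components) if i not in alive)


# Hypothetical targets: old_a and old_b depend on each other and nothing uses them.
DEPS = {
    "//app:server": ["//lib:auth"],
    "//lib:auth": [],
    "//lib:auth_test": ["//lib:auth"],
    "//lib:old_a": ["//lib:old_b"],
    "//lib:old_b": ["//lib:old_a"],
    "//lib:old_a_test": ["//lib:old_a"],
}
TEST_OF = {"//lib:auth_test": "//lib:auth", "//lib:old_a_test": "//lib:old_a"}

dead = dead_components(with_test_cycles(DEPS, TEST_OF), ["//app:server"])
print(dead)
```

Grouping by component means a dead library, its cyclic partners, and its tests come out as a single unit, so they can be removed together in one change rather than leaving a broken test behind.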
Overcoming Cultural Resistance
Engineers often resist automated deletions at first, much like the early skepticism toward unit testing. Google addresses this by crafting clear change descriptions, providing concise supporting documentation, and actively managing feedback to improve acceptance.
Impact and Takeaways
According to Google’s engineering blog, the automated deletion effort yields a roughly tens‑to‑one return on investment: it has removed about 5% of Google’s C++ code and saved substantial engineering effort. Organizations with large monolithic codebases should consider similar automated cleanup strategies.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
