Why a Million‑Line Monorepo Works: Lessons from Alibaba’s Quick BI

This article shares how Alibaba’s Quick BI team successfully manages a monorepo with over a million lines of TypeScript, achieving fast cold‑starts, efficient code reviews, and scalable architecture through strict standards, automated tooling, and data‑driven process improvements.

Alibaba Terminal Technology

In recent years, Alibaba’s data‑center product Quick BI has grown rapidly and is the only BI product from China to be listed in Gartner’s Magic Quadrant for two consecutive years. Its single repository holds more than one million lines of source: about 820,000 lines of TypeScript and 180,000 lines of Sass/Less/CSS, excluding generated code.

Key metrics:

Code: 820k TypeScript, 180k styles.

Collaboration: 12,111 code reviews, 53,026 commits.

Despite the large codebase, the team chose a monorepo (single repository) rather than splitting the code into many repos or adopting micro‑frontend/Serverless approaches. Cold‑start time was a few seconds at first, grew to 5‑10 minutes as the code accumulated, and was eventually brought back down to about 5 seconds through engineering optimizations.

Why Monorepo?

The team found that a large codebase can be an asset when it is backed by a simple architecture, clear standards, close collaboration, and efficient execution. Problems that tooling can solve should not be pushed onto developers as conventions, and problems that call for conventions should not be papered over with tooling.

Core Monorepo Questions

1. Does a single repository become too large?

Repository size is the sum of three parts: source size, .git size, and resource files. Assuming 100 characters per line at one byte per character, 1,000,000 lines come to roughly 100 MB of source code. In practice the source in this repository is about 85 MB.

The .git directory stores commit history efficiently; 10,000 commits add only 1–3 MB. Resource files are what usually inflate a repository, and they can be cleaned out of history: using a tool such as BFG, a 22 GB repo was reduced to 200 MB.

Thus a million‑line codebase typically occupies 200–400 MB, and ten million lines would be around 2–4 GB, comparable to a large node_modules folder.
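The arithmetic above can be written down as a small estimator. This is a rough sketch, not the team's tooling: the function name, the field names, and the 100 MB resource figure are assumptions made for illustration, while the line count, characters per line, commit count, and per‑commit growth come from the figures quoted in this article.

```typescript
// Rough repository-size estimate. All field names are hypothetical.
interface RepoSizeInput {
  lines: number;           // lines of source code
  avgCharsPerLine: number; // assumed average, one byte per character
  commits: number;         // commits in the history
  mbPer10kCommits: number; // assumed .git growth (the article cites 1-3 MB)
  resourcesMb: number;     // images, fonts and other binary assets
}

// size = source + .git history + resource files, in decimal megabytes
function estimateRepoSizeMb(input: RepoSizeInput): number {
  const sourceMb = (input.lines * input.avgCharsPerLine) / 1_000_000;
  const gitMb = (input.commits / 10_000) * input.mbPer10kCommits;
  return sourceMb + gitMb + input.resourcesMb;
}

const estimateMb = estimateRepoSizeMb({
  lines: 1_000_000,
  avgCharsPerLine: 100,
  commits: 53_026,
  mbPer10kCommits: 3,
  resourcesMb: 100, // hypothetical value, not taken from the article
});

console.log(`Estimated repository size: ${estimateMb.toFixed(1)} MB`);
```

With these inputs the source accounts for 100 MB and the history for about 16 MB, so the total is dominated by source and resources rather than by commit history.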

2. Is startup slow?

Three tactics were applied:

Split the application into multiple entry points, loading only one at a time.

Refine inter‑package dependencies and maximize lazy loading and tree‑shaking.

Replace Webpack with Vite for local development.

After switching to Vite, cold‑start time dropped from 2–5 minutes to under 5 seconds, and hot‑compile time fell from 5 seconds to about 1 second (often <500 ms on Apple M1).
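The first tactic, loading only one entry point at a time, might look like the sketch below. The entry names, file paths, and the `selectEntries` helper are hypothetical and are not taken from the Quick BI repository. The resulting map has the name‑to‑file shape that a bundler's multi‑entry option (for example Webpack's `entry`) accepts.

```typescript
// Hypothetical entry map: entry name -> entry file. Names and paths are illustrative.
type EntryMap = Record<string, string>;

const allEntries: EntryMap = {
  dashboard: "src/entries/dashboard/index.tsx",
  designer: "src/entries/designer/index.tsx",
  admin: "src/entries/admin/index.tsx",
};

// Return only the requested entry, or every entry when none is requested.
function selectEntries(all: EntryMap, requested?: string): EntryMap {
  if (requested === undefined || requested === "") {
    return { ...all };
  }
  if (!Object.prototype.hasOwnProperty.call(all, requested)) {
    const known = Object.keys(all).join(", ");
    throw new Error(`Unknown entry "${requested}". Known entries: ${known}`);
  }
  return { [requested]: all[requested] };
}

// During development only the entry being worked on is compiled.
const devEntries = selectEntries(allEntries, "designer");
console.log(Object.keys(devEntries));
```

A production build would call `selectEntries(allEntries)` with no second argument so that every entry is built.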

3. How to handle code reuse?

The team does not pursue DRY for its own sake; maintainability comes first. Reusable modules are packaged as separate npm packages (e.g., @alife/bi-designer); consumers import only what they need, and tree‑shaking drops the unused code.

Current Development Experience

Cold start ~5 seconds, hot compile ~1 second.

A change is made in one place and released once.

New developers can set up the environment in ~10 minutes.

Version alignment issues are eliminated.

Engineering upgrades are applied once for the whole repository, using a Lerna‑based Pri Monorepo solution.

Problems Still to Solve

Beyond simply merging code, challenges remain in collaboration, technical solutions, and stability (preventing a single commit from breaking the whole product).

1. Package Dependency Management

Packages are arranged with a left‑to‑right, single‑direction dependency graph to avoid cycles. Automated checks enforce this rule.
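An automated check of this kind could be sketched as follows. The article does not describe the team's actual checker, so everything here is an assumption: the package names, the `checkDependencyDirection` function, and the reading of the rule as "a package may depend only on packages to its left in the ordered list". Under that reading every dependency edge points strictly leftward, so a graph that passes cannot contain a cycle.

```typescript
// package name -> names of the packages it depends on
type DependencyGraph = Record<string, string[]>;

interface Violation {
  from: string;
  to: string;
  reason: string;
}

// `order` lists packages from left (lowest layer) to right (highest layer).
function checkDependencyDirection(order: string[], graph: DependencyGraph): Violation[] {
  const rank = new Map<string, number>();
  order.forEach((name, index) => rank.set(name, index));

  const violations: Violation[] = [];
  for (const [from, deps] of Object.entries(graph)) {
    const fromRank = rank.get(from);
    for (const to of deps) {
      const toRank = rank.get(to);
      if (fromRank === undefined || toRank === undefined) {
        violations.push({ from, to, reason: "package missing from the layer order" });
      } else if (toRank >= fromRank) {
        violations.push({ from, to, reason: "dependency must point to a package on the left" });
      }
    }
  }
  return violations;
}

// Hypothetical packages; "utils" wrongly depends on the higher-level "designer".
const layerOrder = ["utils", "components", "designer", "app"];
const dependencyGraph: DependencyGraph = {
  utils: ["designer"],
  components: ["utils"],
  designer: ["components", "utils"],
  app: ["designer"],
};

const violations = checkDependencyDirection(layerOrder, dependencyGraph);
for (const v of violations) {
  console.log(`${v.from} -> ${v.to}: ${v.reason}`);
}
```

Run in CI, a non‑empty result would fail the build before a reversed dependency reaches the main branch.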

Open‑source npm packages are introduced only after a three‑person review, due to concerns about long‑term maintenance.

2. Code Review Culture

The team enforces 100 % code review, encouraging small, frequent merge requests and clear, human‑readable code. Reviews are categorized into:

Online MR review (1‑to‑1).

Thematic review (3‑5 participants).

Pre‑release collective review (all).

Best practices include reviewing promptly, writing plain and readable code, keeping directory structures standardized, and avoiding flashy, hard‑to‑maintain techniques.

3. Engineering Automation

Tools such as ESLint, TypeScript type checking, and Prettier enforce syntax and style rules. Webpack is used for production builds, while Vite provides fast development feedback.

4. Performance Optimization

Three focus areas:

Resource loading: fine‑grained tree‑shaking and lazy loading of heavy components.

View rendering: minimize re‑renders, use virtual scrolling for tables.

Data fetching: local caching and PWA techniques for mobile.
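The core of virtual scrolling is working out which rows need to exist in the DOM for the current scroll position. The sketch below shows one common way to do that, assuming a fixed row height; the function name, the parameters, and the overscan default are illustrative and are not taken from Quick BI.

```typescript
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number;   // index one past the last row to render (exclusive)
}

// Compute the slice of rows to render, plus `overscan` extra rows on each side
// so that fast scrolling does not show blank gaps.
function visibleRows(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan: number = 2
): VisibleRange {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const endVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, firstVisible - overscan),
    end: Math.min(totalRows, endVisible + overscan),
  };
}

// A 100,000-row table scrolled to 3000 px in a 600 px viewport with 30 px rows.
const range = visibleRows(3000, 600, 30, 100_000);
console.log(`render rows ${range.start} to ${range.end - 1}`);
```

In this example only 24 of the 100,000 rows are rendered, which is why the technique keeps large tables responsive.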

Performance monitoring tools alert developers when package size grows.
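A size alert of this kind can be reduced to comparing two size reports. The following is a minimal sketch under stated assumptions: the report format, the package names, the byte counts, and the 10% threshold are all hypothetical, and the article does not say how the team's monitoring is implemented.

```typescript
// package name -> bundle size in bytes
type SizeReport = Record<string, number>;

interface SizeAlert {
  name: string;
  before: number;
  after: number;
  growthPct: number;
}

// Report every package whose size grew by more than `maxGrowthPct` percent.
function findSizeRegressions(
  baseline: SizeReport,
  current: SizeReport,
  maxGrowthPct: number
): SizeAlert[] {
  const alerts: SizeAlert[] = [];
  for (const name of Object.keys(current)) {
    const before: number | undefined = baseline[name];
    const after: number | undefined = current[name];
    // Packages without a baseline are new and have nothing to compare against.
    if (before === undefined || after === undefined || before <= 0) continue;
    const growthPct = ((after - before) / before) * 100;
    if (growthPct > maxGrowthPct) {
      alerts.push({ name, before, after, growthPct });
    }
  }
  return alerts;
}

// Hypothetical sizes from the previous release and the current build.
const baselineSizes: SizeReport = { designer: 1_000_000, charts: 500_000 };
const currentSizes: SizeReport = { designer: 1_200_000, charts: 510_000, "new-pkg": 1_000 };

const sizeAlerts = findSizeRegressions(baselineSizes, currentSizes, 10);
for (const a of sizeAlerts) {
  console.log(`${a.name} grew ${a.growthPct.toFixed(1)}% (${a.before} -> ${a.after} bytes)`);
}
```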

5. Data‑Driven Architecture Optimization

Metrics on startup time and environment configuration are collected to identify bottlenecks. For example, the need to unify Node.js versions across the team became visible only after these metrics were reported.
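One way such reports could be aggregated is to group cold‑start timings by Node.js version, which makes version drift across the team visible at a glance. This is only a sketch: the report shape, the function name, and every sample value below are invented for illustration and are not measurements from the Quick BI team.

```typescript
interface StartupReport {
  developer: string;
  nodeVersion: string;
  coldStartMs: number;
}

interface VersionSummary {
  nodeVersion: string;
  count: number;
  avgColdStartMs: number;
}

// Group reports by Node.js version, slowest average first.
function summarizeByNodeVersion(reports: StartupReport[]): VersionSummary[] {
  const totals = new Map<string, { count: number; totalMs: number }>();
  for (const report of reports) {
    const entry = totals.get(report.nodeVersion) ?? { count: 0, totalMs: 0 };
    entry.count += 1;
    entry.totalMs += report.coldStartMs;
    totals.set(report.nodeVersion, entry);
  }
  return Array.from(totals.entries())
    .map(([nodeVersion, t]) => ({
      nodeVersion,
      count: t.count,
      avgColdStartMs: t.totalMs / t.count,
    }))
    .sort((a, b) => b.avgColdStartMs - a.avgColdStartMs);
}

// Made-up sample data, for illustration only.
const sampleReports: StartupReport[] = [
  { developer: "a", nodeVersion: "v14.17.0", coldStartMs: 9000 },
  { developer: "b", nodeVersion: "v16.13.0", coldStartMs: 5000 },
  { developer: "c", nodeVersion: "v16.13.0", coldStartMs: 4000 },
  { developer: "d", nodeVersion: "v14.17.0", coldStartMs: 11000 },
];

const versionSummary = summarizeByNodeVersion(sampleReports);
for (const s of versionSummary) {
  console.log(`${s.nodeVersion}: ${s.count} developers, avg ${s.avgColdStartMs} ms`);
}
```

More than one row in the output is itself the signal: the team is not on a single Node.js version.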

Summary and Outlook

A million‑line codebase is nothing to fear; with proper processes it remains agile. Quick BI is now approaching ten million lines, aiming to become a world‑class BI platform. Future work includes deeper data‑analysis integration, cross‑device support, and further architectural refinements such as introducing Redux Toolkit for data flow.

Figure: Package dependency diagram
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
