
How Ant Group Scaled R&D Efficiency with a Data‑Driven Insight Platform

Ant Group built an R&D Insight system that combines measurement infrastructure, a unified metric framework, and a comprehensive evaluation model, turning massive development data into actionable diagnostics and driving efficiency improvements at the company, team, and outsourcing levels across thousands of engineers.


1. Overview

Ant Group (referred to below as "Ant") employs over ten thousand R&D engineers, so improving overall R&D efficiency comprehensively, precisely, and effectively is a major challenge. As the CEO put it, what cannot be measured cannot be improved: sustained efficiency gains require a continuous metric system, data collection, problem identification, and automation tooling that let engineers work with high quality and high efficiency.

Based on this goal and years of accumulated big-data technology, Ant built the "Ant R&D Insight System", which comprises three parts:

R&D measurement infrastructure (Insight platform)

R&D metric system

R&D comprehensive evaluation model

The system was refined over three years of practice (2019–2021) and delivered data-driven efficiency gains:

R&D problem datafication: expert experience is captured online, enabling automatic diagnosis.

Insight service scaling: weekly automatic health checks for all teams.

Intelligent R&D decision‑making: data‑and‑model assisted decisions for efficiency, outsourcing, performance, promotion, etc.

2. Problems and Challenges

Ant's business combines finance and the internet, requiring both stability and speed: a small failure can escalate into a crisis, yet rapid innovation is essential. By 2018 the toolchain was standardized, online, and service-oriented, but the engineering organization had grown past ten thousand people, creating new challenges in scaling efficiency.

Key challenges at large scale include:

Difficulty understanding and guiding team‑wide efficiency to meet business expectations.

Team leads (TL) can only get a rough sense of R&D status from morning meetings; problems are often discovered only after business complaints.

Front‑line engineers may not fully follow complex quality policies, missing valuable practices like code review and automated testing.

Traditional solutions rely on external experts (SQA, QA, PM), whose interventions tend to be reactive and subjective.

3. Solution and Implementation

3.1 R&D Measurement Infrastructure

The goal is to provide automated, scalable, scenario‑driven services for R&D data.

Three service roles:

Domain experts: analysts (similar to doctors) who diagnose team problems.

CTO and TL: receive concise diagnostic reports like a health‑check summary.

Front‑line engineers: get alerts via messaging about critical issues.

Core product provides different views for each role, with a unified “indicator detail” module that shows analysis conclusions and visualizations.

3.2 Ant R&D Metric System

The metric system aims to unify standards across the company, supporting multiple perspectives, goals, and uses.

R&D process modeling – standardize the development process.

Design indicators – map to process models and define problem domains.

Design models – aggregate complex issues into comprehensive evaluation models.

Apply indicators to concrete scenarios – generate reports for specific use cases.

Key design principles:

Follow basic measurement principles.

Layered measurement (business, delivery, capability layers).

Distinguish indicator attributes (quality, efficiency, input, output) and types (result vs. process).

Typical result indicators include:

[Quality] Responsibility faults per ten thousand changed lines: the number of online faults attributed to a team, relative to its changed code volume.

[Quality] Fault detection rate via monitoring: proportion of responsibility faults discovered by monitoring.

[Quality] Application rollback rate: proportion of releases that required rollback.

[Quality] Objective quality score for active applications: composite score covering security, test pass rate, documentation, code duplication, etc.

[Efficiency] Average delivery cycle for requirements: time from requirement intake to production.

[Efficiency] Average iteration delivery cycle: time from iteration creation to release.

[Efficiency] Average lead time before release: time from code commit to successful production run.
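To make the quality indicators above concrete, here is a minimal Python sketch of how a few of them could be computed from per-release records. The `Release` fields and sample data are illustrative assumptions, not Ant's actual schema.

```python
from dataclasses import dataclass

# Hypothetical release record; field names are illustrative, not Ant's schema.
@dataclass
class Release:
    changed_lines: int          # lines of code changed in this release
    faults: int                 # online responsibility faults attributed to it
    faults_via_monitoring: int  # of those faults, how many monitoring caught
    rolled_back: bool           # whether the release had to be rolled back

def faults_per_10k_lines(releases):
    """[Quality] Responsibility faults per ten thousand changed lines."""
    lines = sum(r.changed_lines for r in releases)
    faults = sum(r.faults for r in releases)
    return 10_000 * faults / lines if lines else 0.0

def monitoring_detection_rate(releases):
    """[Quality] Share of responsibility faults discovered by monitoring."""
    faults = sum(r.faults for r in releases)
    detected = sum(r.faults_via_monitoring for r in releases)
    return detected / faults if faults else 0.0

def rollback_rate(releases):
    """[Quality] Proportion of releases that required a rollback."""
    return sum(r.rolled_back for r in releases) / len(releases)

releases = [
    Release(changed_lines=12_000, faults=2, faults_via_monitoring=1, rolled_back=False),
    Release(changed_lines=8_000, faults=1, faults_via_monitoring=1, rolled_back=True),
]
print(faults_per_10k_lines(releases))       # 1.5 faults per 10k changed lines
print(monitoring_detection_rate(releases))  # 2 of 3 faults caught by monitoring
print(rollback_rate(releases))              # 0.5
```

The efficiency indicators (delivery cycle, lead time) would be computed the same way, as averages over timestamp differences between process events.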

3.3 Comprehensive Evaluation Model System

The system quantifies expert observations using Multi‑Criteria Decision Analysis (MCDA) to produce a single score for complex problems, providing clear, explainable conclusions.

Four steps: define evaluation indicators, score each indicator, determine weights (e.g., via Delphi or AHP), and compute the aggregate score.
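The aggregation step can be sketched as a simple weighted sum, the most common MCDA scheme. The indicator names, scores, and weights below are illustrative assumptions, not Ant's actual model; in practice the weights would come from a Delphi round or an AHP pairwise-comparison matrix.

```python
def mcda_score(scores, weights):
    """Weighted-sum MCDA: aggregate per-indicator scores (0-100) into one score.
    Weights are normalized so they sum to 1 before aggregation."""
    total = sum(weights.values())
    return sum(scores[k] * weights[k] / total for k in scores)

# Illustrative indicators and expert-elicited weights.
scores = {"quality": 85, "efficiency": 70, "security": 90}
weights = {"quality": 0.5, "efficiency": 0.3, "security": 0.2}
print(mcda_score(scores, weights))  # 81.5
```

Because the final number is a transparent weighted sum, each conclusion remains explainable: a low aggregate score can be traced back to the specific indicators and weights that produced it.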

4. Practice and Effects

Two main practice categories:

Data‑assisted R&D improvement – embed data reports into daily workflows.

Data‑enabled domain decisions – provide reports for cross‑functional governance such as outsourcing management.

Four concrete practices:

4.1 Company‑wide efficiency

Defined a company‑level R&D efficiency index (14 result indicators) and published analysis reports, leading to a 14% annual efficiency increase, 20% release quality improvement, and 15% faster iteration.

4.2 Team‑level improvement

Weekly team health checks and issue tracking achieved >95% expert problem discovery and >80% problem resolution rates.
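An automated health check of this kind can be sketched as a small rule engine: each rule encodes an expert heuristic as a threshold on a team-level indicator. The rule names and thresholds below are illustrative assumptions, not Ant's actual rules.

```python
def health_check(team_metrics, rules):
    """Return a diagnostic finding for every rule the team violates."""
    findings = []
    for metric, threshold, message in rules:
        if team_metrics.get(metric, 0.0) > threshold:
            findings.append(message)
    return findings

# Each rule: (metric name, threshold, diagnostic message). Illustrative only.
RULES = [
    ("rollback_rate", 0.05, "Release rollback rate above 5%"),
    ("avg_delivery_days", 14.0, "Requirement delivery cycle above 14 days"),
]

team = {"rollback_rate": 0.08, "avg_delivery_days": 10.0}
print(health_check(team, RULES))  # ['Release rollback rate above 5%']
```

Running such rules weekly over every team's metrics yields the automatic "health-check summary" reports described above, with findings tracked until resolution.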

4.3 R&D activity insight

Analyzed 54 core R&D activities and 222 engineering tasks, identifying bottlenecks to guide tool improvements.

4.4 Outsourcing efficiency

Provided a metric-tree dashboard for outsourcing evaluation, supporting supplier assessment and team performance evaluation.

5. System Panorama

The full ecosystem combines multi‑role collaboration, expert‑system‑driven data products, and a multi‑side platform business model, enabling self‑operating, self‑organizing, data‑driven R&D improvement.

DevOps · R&D metrics · data-driven · Performance engineering · Software measurement
Written by

DevOpsClub

Personal account of Mr. Zhang Le (Le Shen @ DevOpsClub). Shares DevOps frameworks, methods, technologies, practices, tools, and success stories from internet and large traditional enterprises, aiming to disseminate advanced software engineering practices, drive industry adoption, and boost enterprise IT efficiency and organizational performance.
