
Can AI Safely Write Code for High‑Risk Backend Systems? Lessons from Tencent’s CDN

This article analyses how Tencent applied AI coding to its massive, high‑risk CDN LEGO backend, built a Rust‑based Nonstop proxy to probe AI limits, designed a five‑layer Harness Engineering framework with multi‑model adversarial review, identified concrete failure modes, and quantified efficiency gains while redefining developer roles.

Tencent Architect

Apr 22, 2026


Background and Challenge

While AI‑coding hype focuses on front‑end page generation, the question of whether AI can safely write code for "mission‑critical" backend systems is often ignored. Tencent’s CDN LEGO project is such a system: over 1 million lines of core C++ code, more than 3 million lines of third‑party libraries (OpenSSL, QUIC, Lua, JavaScript, etc.), and billions of requests served daily. Uncontrolled clients, diverse origin servers, multiple protocols, and a full‑stack, non‑blocking asynchronous architecture combine into a theoretical configuration space of 13,824 × N, in which a single mistake could cause a global outage.

Industry Landscape

Recent AI‑coding case studies show impressive results on smaller services, but their applicability to ultra‑large, highly variable backends remains unproven. Tencent therefore set out to explore two complementary paths.

20‑Day Rust Nonstop Proxy Project

To probe AI’s coding boundaries, a team of one developer plus AI assistants built a Rust‑based proxy, "Nonstop", in 20 days. The proxy supports full L4/L7 proxying, HTTP/3 + QUIC, an integrated WAF, V8 JavaScript workers for edge computing, single‑binary zero‑downtime hot‑loading, and six layers of defense. Benchmarks showed 42,052 QPS at 5,000 concurrent connections, with 0 errors and a P50 latency of 1.1 ms.

Key Findings from the Nonstop Experiment

AI can generate functional, high‑performance code quickly.

However, AI often refuses to admit uncertainty, fabricates function signatures, RFC sections, or percentages, and makes partial edits that miss global side‑effects.

Root cause: AI lacks "uncertainty awareness" and a global view of the system.

Harness Engineering for the LEGO Project

To move from "AI can write" to "AI writes safely", Tencent designed a five‑layer Harness Engineering architecture centred on three pillars: context, constraints, and feedback.

Core Idea: constrain AI to a single module, file, or function, providing explicit context and rules, then enforce feedback loops that validate every generated line before it reaches production.
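
One way to picture this scoping rule is a gate that rejects any AI‑generated change touching files outside the task's declared scope. The sketch below is only an illustration under that assumption: `TaskScope`, `out_of_scope`, and the file paths are hypothetical names, not part of Tencent's tooling.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class TaskScope:
    """Hypothetical description of what one AI task is allowed to touch."""
    name: str
    allowed_patterns: tuple[str, ...]


def out_of_scope(scope: TaskScope, changed_files: list[str]) -> list[str]:
    """Return the changed files that match none of the allowed patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in scope.allowed_patterns)
    ]


# Hypothetical task: the AI may only edit the cache module.
scope = TaskScope(
    name="fix-cache-key",
    allowed_patterns=("src/cache/*.cc", "src/cache/*.h"),
)
violations = out_of_scope(scope, ["src/cache/key.cc", "src/tls/handshake.cc"])
print(violations)  # ['src/tls/handshake.cc']
```

A non‑empty result would block the change before any further validation runs, which keeps the blast radius of a single AI task small.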

Five‑Layer Architecture

Layer 1 – Permission & Security Base.

Layer 2 – Code Rules as a Compiler (static analysis, style, safety policies).

Layer 3 – Process Constraints (functional implementation → unit test → code review, each step blocks the next).

Layer 4 – Knowledge Context (project constitution, domain‑expert knowledge, RFC corpus).

Layer 5 – Continuous Feedback (hooks, pitfall journal, inline CLAUDE.md comments).

Concrete Constraints (Derived from Real Pitfalls)

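Task: Feature implementation
└─ blocks: [Unit tests]
   ← the test Task is blocked by the feature Task
Task: Unit tests
├─ blockedBy: [Feature implementation]
│  ← tests can be written only after the feature is complete
└─ blocks: [Code review]
   ← review can start only after the tests are complete
Task: Code review
└─ blockedBy: [Feature implementation, Unit tests]
   ← review can start only after both are complete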
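
The blocking chain in the diagram above can be sketched as a small task board in which a task may complete only once everything it is blocked by is done. This is a minimal illustration, not the team's actual tooling: the `Task` and `Pipeline` classes and their method names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    blocked_by: list[str] = field(default_factory=list)
    done: bool = False


class Pipeline:
    """Hypothetical task board: a task may start only when its blockers are done."""

    def __init__(self, tasks: list[Task]) -> None:
        self.tasks = {task.name: task for task in tasks}

    def can_start(self, name: str) -> bool:
        return all(self.tasks[dep].done for dep in self.tasks[name].blocked_by)

    def complete(self, name: str) -> None:
        # Refuse to mark a task done while any of its blockers is still open.
        if not self.can_start(name):
            pending = [d for d in self.tasks[name].blocked_by if not self.tasks[d].done]
            raise RuntimeError(f"{name} is blocked by {pending}")
        self.tasks[name].done = True


pipeline = Pipeline([
    Task("feature implementation"),
    Task("unit tests", blocked_by=["feature implementation"]),
    Task("code review", blocked_by=["feature implementation", "unit tests"]),
])

print(pipeline.can_start("code review"))  # False
pipeline.complete("feature implementation")
pipeline.complete("unit tests")
print(pipeline.can_start("code review"))  # True
```

Raising an error rather than silently reordering work makes a skipped step visible, which is the point of a blocking constraint.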

Five core constraints, each sourced from a real incident, proved to be roughly 100× more effective than vague expectations.

Multi‑Model Adversarial Code Review (CR)

Three independent models – Claude, Codex, and Gemini – review each AI‑generated change. Their findings are merged by cr_manager into cr_report.md. Overlaps increase confidence (e.g., issue a2 discovered by both Model A and Model B), while unique findings trigger further verification.
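
The merge step can be illustrated with a short sketch. The source does not describe how cr_manager works internally, so everything here is an assumption: the function names, the reviewer labels, and above all the idea that findings from different models already share an issue id (in practice, matching equivalent findings would itself need work).

```python
from collections import defaultdict


def merge_findings(reviews: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each issue id to the sorted list of reviewers that reported it."""
    merged: dict[str, set[str]] = defaultdict(set)
    for reviewer, issues in reviews.items():
        for issue in issues:
            merged[issue].add(reviewer)
    return {issue: sorted(names) for issue, names in sorted(merged.items())}


def triage(merged: dict[str, list[str]]) -> dict[str, str]:
    """Issues seen by two or more reviewers are corroborated; the rest need verification."""
    return {
        issue: "corroborated" if len(names) >= 2 else "needs verification"
        for issue, names in merged.items()
    }


# Hypothetical findings from three independent reviewers.
reviews = {
    "model_a": {"a1", "a2"},
    "model_b": {"a2", "b1"},
    "model_c": {"c1"},
}
merged = merge_findings(reviews)
for issue, label in triage(merged).items():
    print(issue, merged[issue], label)
# a2 is reported by model_a and model_b, so it is labelled "corroborated".
```

Keeping the list of reporting models next to each issue lets a human reviewer see at a glance which findings rest on a single model's judgment.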

Feedback Loops

Channel 1: Automatic hook collection from runtime.

Channel 2: Pitfall Journal that records real incidents (e.g., PIT‑001: mmap nullptr → SIGSEGV leads to rule R2: check system‑call return values).

Channel 3: Inline CLAUDE.md feedback that updates the knowledge base.

This closed loop turns each bug into a rule, each rule into a reusable Skill, and each Skill into a barrier that prevents the same mistake.
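
A pitfall journal of this kind could be modelled as a list of incident records, each carrying the rule it produced, so that relevant rules can be looked up and fed back into the AI's context. The sketch below is a guess at the shape of such a journal: `Pitfall`, `PitfallJournal`, and their fields are hypothetical, and only the PIT‑001 / R2 entry comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pitfall:
    pitfall_id: str
    symptom: str
    rule_id: str
    rule_text: str


class PitfallJournal:
    """Hypothetical journal that turns recorded incidents into reusable rules."""

    def __init__(self) -> None:
        self._entries: list[Pitfall] = []

    def record(self, pitfall: Pitfall) -> None:
        self._entries.append(pitfall)

    def rules(self) -> dict[str, str]:
        """All rules derived so far, keyed by rule id."""
        return {p.rule_id: p.rule_text for p in self._entries}

    def rules_for(self, keyword: str) -> list[str]:
        """Rules whose originating symptom mentions the keyword (case-insensitive)."""
        matches = [
            f"{p.rule_id}: {p.rule_text}"
            for p in self._entries
            if keyword.lower() in p.symptom.lower()
        ]
        return list(dict.fromkeys(matches))  # drop duplicates, keep order


journal = PitfallJournal()
journal.record(Pitfall(
    pitfall_id="PIT-001",
    symptom="mmap nullptr -> SIGSEGV",
    rule_id="R2",
    rule_text="check system-call return values",
))
print(journal.rules_for("mmap"))  # ['R2: check system-call return values']
```

A lookup such as `rules_for("mmap")` could run before a task that touches memory mapping, so the rule reaches the model as explicit context instead of relying on it to remember.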

Practice Cases

CPUInfos Read/Write Race Fix

AI proposed three candidate solutions: a ReadWriteLock, an atomic, and a double‑buffer with an atomic index. After A/B testing, the zero‑overhead option was chosen, and development time shrank from five days to one. The fix worked, but a missing thread‑initialisation step surfaced later, which underlines the need for comprehensive validation.

Efficiency Gains

Overall, Harness Engineering delivered an efficiency improvement of about 20% across the LEGO project, while accumulating 86k lines of code, 31 Skills, 34 pitfall rules, and 4 parallel competitor‑research tracks. These knowledge assets grew into a sustainable engineering flywheel.

Role Evolution in the AI‑Coding Era

Junior developers become AI operators mastering Skills and Prompts; senior developers become Harness Engineers designing constraints; architects shift to human‑AI collaboration design; QA engineers become AI Quality Engineers; security engineers become AI Security Experts. The unchanging core capability is abstract thinking: deciding what to delegate to AI, how to verify it, and how to encode the verification.

Team‑Building Roadmap

Months 1‑2 – "Learn": onboard all engineers with the full workflow, adversarial CR, and 14 safety rules.

Months 2‑4 – "Build": core members author team‑specific Skills, validate them via A/B experiments, and share them.

Months 4‑12 – "Evolve": automate Harness processes, enable cross‑team knowledge sharing, and continuously track AI impact.

Attitude Toward AI Coding

Caution: every AI‑generated line must be reviewed.

Push: aggressively apply AI to high‑frequency scenarios.

Embrace: embed AI practices into team culture while retaining deep understanding of underlying principles.

Conclusion

AI coding is not about replacing engineers but about redefining the engineering paradigm. Tencent’s LEGO Harness Engineering demonstrates that a systematic, constraint‑driven, feedback‑rich approach can turn AI from a risky experiment into a reliable co‑author, turning each pitfall into a rule, each rule into a Skill, and each Skill into a lasting competitive advantage.

