
Agent‑Driven R&D Efficiency: Exploration and Practice at QECon Shenzhen 2026

At QECon Shenzhen 2026, Xiaohongshu's tech team will present five technical talks that showcase how AI agents are applied to code knowledge‑base generation, change automation, large‑model load‑testing data construction, end‑to‑end testing, and client‑side performance, illustrating concrete engineering solutions and measurable productivity gains.

Xiaohongshu Tech REDtech

May 19, 2026


Large‑model technology is moving from conversational interaction to task‑oriented autonomous execution, and Agentic AI is becoming a key force reshaping software engineering. Xiaohongshu’s quality‑efficiency R&D team adopts an "AI Native" approach to redesign workflows such as architecture risk analysis, change automation, intelligent testing, code knowledge‑base creation, and performance evaluation, turning fragmented manual work into sustainable, intelligent engineering paradigms.

1. Billion‑Token Knowledge Base Generation: Enterprise Wiki Engineering

Speaker: Hao Xubin, AI Engineering Architect (long‑horizon tasks / multi‑agent).

AI Coding is shifting focus from whether a model can write code to whether it can continuously perform complex engineering tasks like a human engineer. The core challenge for large repositories is that they are too big to read entirely, and problems span modules and services. CodeWiki addresses this by enabling a group of cooperating agents to read relevant parts of the repository in stages, turning "read the whole repo at once" into "each agent reads its assigned slice and merges the results".

Why "reading the repository" is the foundational ability for long‑horizon agents, and where traditional RAG, single‑agent, and multi‑agent solutions hit bottlenecks.

CodeWiki’s two‑stage core ideas: path compression, code skeleton compression, and Glob‑Pattern slicing to solve token‑hole and task‑boundary problems.

Four‑dimensional shared memory, intelligent fallback, and exhaustive reading to move multi‑agent systems from demo to production.

Engineering pitfalls and gains: AST reference injection, semantic similarity fallback, and large‑scale validation results on real codebases.
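The slicing and skeleton-compression ideas listed above can be illustrated with a minimal sketch. It is not CodeWiki's implementation: the function names `slice_by_glob` and `skeleton`, the slice names, and the file paths are all hypothetical, and the skeleton step only handles Python source via the standard `ast` module.

```python
import ast
from fnmatch import fnmatchcase

def slice_by_glob(paths, slice_patterns):
    """Assign each path to the first slice whose glob pattern matches it.

    Unmatched paths are returned separately so that no file is silently
    skipped (an assumed stand-in for a fallback / exhaustive-reading pass).
    """
    slices = {name: [] for name in slice_patterns}
    unassigned = []
    for path in paths:
        for name, pattern in slice_patterns.items():
            if fnmatchcase(path, pattern):
                slices[name].append(path)
                break
        else:
            unassigned.append(path)
    return slices, unassigned

def skeleton(source):
    """Keep class and function signatures, drop bodies.

    Only definitions nested directly in modules, classes, or functions
    are collected; this is a simplification for the sketch.
    """
    out = []

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                out.append("  " * depth + f"class {child.name}:")
                visit(child, depth + 1)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = ", ".join(a.arg for a in child.args.args)
                out.append("  " * depth + f"def {child.name}({args}): ...")
                visit(child, depth + 1)

    visit(ast.parse(source), 0)
    return "\n".join(out)
```

Each agent would then receive the skeletons of the files in its own slice rather than full file contents, which is one plausible way to keep per-agent token usage bounded.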

2. Multi‑Agent‑Based B/C Data‑Link Change Automation Assurance

Speaker: Li Wei, AI Engineering Architect (data lineage & agent automation).

Field changes in the advertising data pipeline (delivery → ADBus → online index) are high‑frequency, high‑risk actions involving cross‑team collaboration, multi‑platform operations, impact assessment, and approval flows. Traditional manual processes lack systematic traceability and impact analysis.

The ArkAI Prism system uses a Supervisor + Specialist Agent hierarchy to automate, trace, and audit complex cross‑team link changes, aiming to eliminate omissions and ordering errors.

Current pain points of B/C link field changes: multi‑team, multi‑platform serial execution and the limits of manual collaboration.

ArkAI Prism architecture: Supervisor holds global context for dynamic scheduling; Specialist Agents focus on individual endpoints; modular SKILL hot‑plugging.

Key technical breakthroughs: automatic field lineage analysis, automatic impact assessment and pre‑check interception, end‑to‑end traceability and audit.

Deployment outcomes and lessons: design trade‑offs when applying agents to strong‑rule domains and AI‑driven collaboration via group chats.
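Field lineage analysis and pre-check interception, as listed above, can be pictured as a graph walk followed by a gate. The sketch below is an assumption-laden illustration, not ArkAI Prism's code: the lineage map, the field names, and the `approved` set are hypothetical.

```python
from collections import deque

def downstream_impact(lineage, changed_field):
    """Breadth-first walk over lineage edges (producer -> consumers).

    Returns every downstream field affected by the change, in discovery order.
    """
    seen = {changed_field}
    order = []
    queue = deque([changed_field])
    while queue:
        field = queue.popleft()
        for consumer in lineage.get(field, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order

def precheck(lineage, changed_field, approved):
    """Intercept the change unless every affected field has been approved."""
    missing = [f for f in downstream_impact(lineage, changed_field)
               if f not in approved]
    return len(missing) == 0, missing

# Hypothetical lineage for the delivery -> ADBus -> online index pipeline.
LINEAGE = {
    "delivery.bid_price": ["adbus.bid_price"],
    "adbus.bid_price": ["index.bid_price", "index.bid_bucket"],
}
```

Calling `precheck(LINEAGE, "delivery.bid_price", {"adbus.bid_price", "index.bid_price"})` blocks the change and reports `index.bid_bucket` as the unapproved downstream field, which is the kind of omission a manual process can miss.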

3. Large‑Model Load‑Testing Data Construction: From Feasibility to Real‑World Utility

Speaker: Chen Sheng, AI Engineering Architect (load testing & stability).

As large models are deployed in Q&A, search, agents, and content generation, the difficulty of load testing shifts from "how to generate pressure" to "how to construct effective data". Template‑like, homogeneous data only covers ideal scenarios and fails to expose long‑tail loads and complex inputs, leading to misleadingly stable results.

Why traditional load‑testing data methods fail for large‑model scenarios and the essential gap between "usable" and "good" data.

The three challenges of large‑model load‑testing data: reproducing business realism, ensuring coverage, and handling complexity hierarchically.

Solution direction: moving from "request samples" to "load characteristics" via a three‑layer data framework and careful source selection.

Practical pitfalls and reproducible experience: avoiding an exclusive focus on average distributions and ensuring long‑tail coverage, so that load data genuinely supports decisions.
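One way to keep long-tail requests from being averaged away is to sample per stratum of a load characteristic instead of uniformly over request samples. The sketch below assumes prompt length in tokens is the characteristic; the bucket edges, the `prompt_tokens` field, and the function names are hypothetical, and this is not the three-layer framework from the talk.

```python
import random

def bucket_of(length, edges):
    """Return the index of the first bucket whose upper edge covers `length`."""
    for i, edge in enumerate(edges):
        if length <= edge:
            return i
    return len(edges)  # overflow bucket for the long tail

def stratified_sample(requests, edges, per_bucket, seed=0):
    """Draw up to `per_bucket` requests from every length bucket.

    Uniform sampling would mostly return short prompts; sampling per
    bucket guarantees that rare, very long prompts are represented.
    """
    rng = random.Random(seed)
    buckets = {}
    for req in requests:
        idx = bucket_of(req["prompt_tokens"], edges)
        buckets.setdefault(idx, []).append(req)
    sample = []
    for idx in sorted(buckets):
        pool = buckets[idx]
        sample.extend(rng.sample(pool, min(per_bucket, len(pool))))
    return sample
```

With 95 short, 4 medium, and 1 very long request, `stratified_sample(reqs, [512, 8192], per_bucket=2)` returns five requests and always includes the single long one, whereas a uniform draw of five would usually miss it.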

4. LLM‑Driven End‑to‑End Testing System

Speaker: Xiao Jun, AI Engineering Architect (quality measurability & intelligent testing).

End‑to‑end testing spans multiple domains (product, transaction, payment) and suffers from combinatorial explosion; the time needed to write test cases manually can grow from minutes to days. Static orchestration cannot keep up with dynamic business composition.

Xiaohongshu lets agents directly perceive business APIs, autonomously plan execution chains, and employ reverse chain inference plus Debug‑first adaptive execution to bypass static orchestration limits.

Fundamental conflict: static orchestration vs. dynamic business composition space.

Agent breakthrough: reverse chain inference to generate real‑time execution plans; a hybrid Plan‑and‑Execute × ReAct reasoning model.

Key engineering implementations: progressive loading of sub‑domain knowledge bases, Debug‑first script generation paradigm, dual‑layer experience accumulation (tool‑level + link‑level).

Two core insights: prompts describe business problems, not language puzzles; dual memory (knowledge base + experience base) is crucial for agent generalization and stability.
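Reverse chain inference, as described above, can be read as backward chaining: start from the API under test and work back through whatever must run first to supply its inputs. The sketch below is one possible interpretation, not Xiaohongshu's implementation; the API catalogue, the `requires`/`produces` schema, and `plan_chain` are hypothetical.

```python
def plan_chain(apis, target, available=frozenset()):
    """Derive an execution order for `target` by resolving its inputs backwards.

    `apis` maps an API name to {"requires": [...], "produces": [...]}.
    Data keys already in `available` need no producer call.
    """
    producers = {}
    for name, spec in apis.items():
        for key in spec["produces"]:
            producers.setdefault(key, name)

    plan, visiting = [], set()

    def resolve(api):
        if api in plan:
            return
        if api in visiting:
            raise ValueError(f"cyclic dependency at {api}")
        visiting.add(api)
        for key in apis[api]["requires"]:
            if key in available:
                continue
            if key not in producers:
                raise KeyError(f"no API produces {key}")
            resolve(producers[key])
        visiting.discard(api)
        plan.append(api)  # prerequisites are already in the plan

    resolve(target)
    return plan

# Hypothetical catalogue spanning product, transaction, and payment domains.
APIS = {
    "login": {"requires": [], "produces": ["user_id"]},
    "create_product": {"requires": [], "produces": ["product_id"]},
    "create_order": {"requires": ["product_id", "user_id"], "produces": ["order_id"]},
    "pay_order": {"requires": ["order_id"], "produces": ["payment_id"]},
}
```

Because the plan is derived from the target at request time, adding a new API to the catalogue changes the generated chains without editing any static orchestration script.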

5. AI‑Driven Client‑Side Performance Experience Full‑Link Assurance

Speaker: Zhang Yu, AI Engineering Architect (performance experience).

Client performance is a core competitive factor for social/content apps, yet testing faces four major challenges: delayed issue discovery, mismatch between measurement and user perception, gaps between code analysis and test execution, and lack of unified evaluation standards.

The presented AI‑driven full‑link performance testing system covers the entire loop, from code change risk prediction through automated test case generation, execution, and perception measurement to experience scoring, and is continuously refined by crowd‑testing feedback.

Four performance testing challenges, in particular the metric‑perception gap and the discontinuity between code analysis and test execution.

Six‑layer full‑link architecture: white‑box code scanning, intelligent case generation, automated execution, specialized analysis, AI‑based metrics, and experience scoring.

Three‑layer AI visual metric fusion: traditional image processing → CV model → MLLM fine‑filtering, balancing accuracy and inference cost.

Design trade‑offs for the experience scoring system: consolidating scattered indicators into a unified score, selecting metrics, weighting scenarios, and degradation judgment mechanisms.
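The three-layer visual metric fusion listed above trades accuracy against inference cost, which suggests a cascade: cheap checks settle the easy frames and only ambiguous ones reach the expensive model. The sketch below illustrates that pattern under stated assumptions; the judges are stubs reading precomputed fields, and the thresholds and relative costs are invented for illustration.

```python
def image_diff_judge(frame):
    """Layer 1 stand-in: traditional image processing on a pixel-diff ratio."""
    ratio = frame["diff_ratio"]
    if ratio < 0.01:
        return "ok"
    if ratio > 0.5:
        return "glitch"
    return None  # ambiguous, escalate

def cv_model_judge(frame):
    """Layer 2 stand-in: a CV model score, trusted only when it is decisive."""
    score = frame["cv_score"]
    if score >= 0.9:
        return "glitch"
    if score <= 0.1:
        return "ok"
    return None  # ambiguous, escalate

def mllm_judge(frame):
    """Layer 3 stand-in: an MLLM verdict, assumed to always decide."""
    return frame["mllm_label"]

# (stage name, assumed relative cost, judge function), cheapest first.
STAGES = [
    ("image_processing", 1, image_diff_judge),
    ("cv_model", 10, cv_model_judge),
    ("mllm", 100, mllm_judge),
]

def cascade(frame, stages=STAGES):
    """Run stages in order and stop at the first confident verdict."""
    cost = 0
    for name, stage_cost, judge in stages:
        cost += stage_cost
        verdict = judge(frame)
        if verdict is not None:
            return verdict, name, cost
    return "uncertain", None, cost
```

In this arrangement the total cost per frame depends on how early a stage is confident, so tuning the escalation thresholds is where the accuracy and cost balance would be set.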

The conference invites participants to explore these AI‑native engineering practices and discuss the future of AI‑driven software productivity.

