Interview with Baidu QATC Chair Yang Fei on AI Testing Challenges and the Future of QA
In this interview, Baidu QATC chair Yang Fei discusses his career, the evolving scope of quality assurance from code to AI model testing, key challenges such as service quality and model interpretability, practical approaches for defect discovery, continuous evaluation pipelines, and advice for QA professionals' personal growth.
Yang Fei, head of Baidu's Quality Assurance Technical Committee (QATC) and leader of the TG&EBG QA sub-TC, opens by introducing himself: he graduated from Nanjing University and has spent many years at Baidu working across distributed computing, storage, cloud, autonomous driving, and various QA platforms.
He explains how quality assurance at Baidu has evolved: from focusing solely on program quality (measuring performance, constructing exception scenarios, and covering unit, module, and end-to-end tests) to placing growing emphasis on online service quality, crowdsourced testing, and data annotation as AI becomes central to the business.
He identifies several trends driving this shift:

- The scope of quality is expanding from code quality to service quality and ultimately to service effectiveness.
- Because AI models are weakly interpretable, testing must shift from finding bugs to uncovering defects.
- Faster product iterations and code-branch churn demand more efficient quality assurance.
- Competitive product evaluation and monitoring face higher demands.
- Testing tools will integrate more tightly with domain knowledge, evolving from standalone tools (SET) into tool-infrastructure ecosystems (SETI).
- Hardware testing for acoustics, vision, and sensor scenarios introduces new quality-control challenges.
Addressing AI model quality, he distinguishes two main assurance approaches: effect evaluation and defect discovery. Effect evaluation requires constructing representative evaluation datasets by abstracting real‑world scenarios into feature vectors and building a scenario library for data selection.
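To make the scenario-library idea concrete, here is a minimal Python sketch of evaluation-set construction along the lines he describes. All names, feature choices, and the per-scenario quota are illustrative assumptions, not Baidu's implementation: each sample is abstracted into a feature vector, samples are grouped by scenario, and the evaluation set is drawn to cover every scenario rather than mirror skewed production traffic.

```python
import random

def build_scenario_library(samples):
    """Group raw samples by their abstracted feature vector.

    `samples` is an iterable of (sample_id, feature_vector) pairs, where
    the feature vector captures the scenario, e.g. (weather, lighting)
    for an autonomous-driving perception model.
    """
    library = {}
    for sample_id, features in samples:
        library.setdefault(features, []).append(sample_id)
    return library

def select_eval_set(library, per_scenario=5, seed=0):
    """Draw up to `per_scenario` samples from every scenario so the
    evaluation set covers all feature combinations instead of
    mirroring the (usually skewed) production traffic distribution."""
    rng = random.Random(seed)
    selected = []
    for features in sorted(library):
        ids = library[features]
        selected.extend(rng.sample(ids, min(per_scenario, len(ids))))
    return selected

# Usage with toy data: three samples covering two scenarios.
samples = [("img_001", ("rain", "night")),
           ("img_002", ("rain", "night")),
           ("img_003", ("clear", "day"))]
library = build_scenario_library(samples)
print(select_eval_set(library, per_scenario=1))  # one sample per scenario
```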
Defect discovery lacks a universal method and must be tailored to specific AI contexts. He outlines three possible paths: building adversarial systems for AI prototypes to expose failure cases; visualizing the AI training process to monitor neuron and network activations for potential defect pathways; and employing conflict detection to actively search boundary‑adjacent scenarios for defects.
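Of the three paths, conflict detection is the easiest to sketch. The toy Python below probes for boundary-adjacent defects by sampling small perturbations around an input and flagging any that flip the model's decision; the linear stand-in model, the epsilon, and the trial count are all assumptions made for illustration, not a description of Baidu's tooling.

```python
import numpy as np

def predict(weights, x):
    """Stand-in model: a linear classifier. Any model exposing a
    decision function could be swapped in here."""
    return 1 if x @ weights > 0 else 0

def probe_boundary(weights, x, eps=0.05, trials=50, seed=0):
    """Conflict-detection sketch: sample small perturbations around x
    and report any that flip the model's decision. A flip this close
    to a nominal input marks a boundary-adjacent candidate defect."""
    rng = np.random.default_rng(seed)
    base = predict(weights, x)
    for _ in range(trials):
        candidate = x + rng.uniform(-eps, eps, size=x.shape)
        if predict(weights, candidate) != base:
            return candidate  # counterexample found
    return None

# Usage: scan inputs; the first sits near the boundary, the second does not.
w = np.array([1.0, -2.0])
for x in [np.array([0.5, 0.24]), np.array([3.0, 0.0])]:
    counterexample = probe_boundary(w, x)
    if counterexample is not None:
        print("potential defect near", x, "->", counterexample)
```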
From an efficiency perspective, he advocates for a "continuous evaluation" pipeline analogous to CI/CD in traditional software testing, linking data annotation, model development, training, and evaluation to enable traceability, quantification, and incremental efficiency gains across the AI lifecycle.
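A continuous-evaluation loop of this kind can be sketched in a few lines of Python. Everything below is a hypothetical illustration of the traceability idea, not Baidu's pipeline: each cycle re-trains on the current annotated data, evaluates against a fixed set, fingerprints every artifact so a regression traces back to an exact data/model version, and gates on metric drops the way CI gates on failing tests.

```python
import hashlib, json, time

def fingerprint(obj):
    """Short content hash so every artifact maps to an exact version."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()[:12]

def train(annotated_data):
    """Toy 'training': memorize the majority label. A real pipeline
    would launch the actual training job at this step."""
    labels = [label for _, label in annotated_data]
    return {"majority": max(set(labels), key=labels.count)}

def evaluate(model, eval_set):
    """Toy evaluation against a fixed, versioned evaluation set."""
    correct = sum(1 for _, label in eval_set if label == model["majority"])
    return {"accuracy": correct / len(eval_set)}

def run_once(annotated_data, eval_set, history):
    """One continuous-evaluation cycle: a change to the annotated data
    re-runs train -> evaluate, and the run is logged with content
    hashes so any metric regression is traceable to its source."""
    model = train(annotated_data)
    record = {
        "time": time.time(),
        "data_version": fingerprint(annotated_data),
        "model_version": fingerprint(model),
        "metrics": evaluate(model, eval_set),
    }
    # Gate the run like CI gates a build: flag any drop in a tracked metric.
    if history and record["metrics"]["accuracy"] < history[-1]["metrics"]["accuracy"]:
        print("REGRESSION vs run", history[-1]["model_version"])
    history.append(record)
    return record

# Usage: two cycles; the second, triggered by new annotations, regresses.
history = []
run_once([("a", 1), ("b", 1), ("c", 0)], [("x", 1), ("y", 1)], history)
run_once([("a", 1), ("b", 0), ("c", 0)], [("x", 1), ("y", 1)], history)
print(json.dumps(history, indent=2))
```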
When asked about personal growth for QA engineers, he advises: think more and reflect after tasks; build confidence through self‑challenge and incremental improvement; and cherish constructive criticism from peers.
He concludes by inviting readers to engage with him, leave questions in the comments, and participate in the ongoing discussion.