How AI Transforms Bug Prediction, Understanding, and Automatic Repair

This article outlines a three‑track roadmap for tackling software defects. The first track follows the bug itself, from machine‑learning‑based prediction through defect understanding and localization to automated repair. The second follows the program, from program understanding and analysis to automated repair and program synthesis. The third draws on general tools such as software knowledge graphs. Along the way it highlights key research directions and representative survey literature.

Taobao Frontend Technology

From the roadmap perspective, there are roughly three pathways.

The first starts from the bug itself: predict bugs, then understand them, and finally attempt to repair them automatically.

The second starts from the program. Bugs are fundamentally program problems, so understanding the program is essential. This pathway covers program understanding, program analysis, automated program repair, and program synthesis.

The third is a more general direction that draws on other tools, such as building a software knowledge graph with knowledge‑graph technology.

After outlining the overall map, we will examine the details of each branch.

Defect Prediction

The first step is to predict whether bugs exist, how many there are likely to be, how severe they are, and how much time and effort it will take to resolve them.

At this stage we do not yet have a deep understanding of the bugs themselves; that is the job of the defect understanding stage. Prediction relies mainly on machine‑learning and statistical methods. Although defect prediction is simpler than the later stages, it is still challenging: a real‑time (just‑in‑time) defect‑prediction system has to monitor code changes, extract metrics from them, run statistical analysis, and train models.
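As a rough illustration of the "extract metrics, then train a model" loop, here is a minimal sketch of a change‑level defect predictor. Everything in it is an assumption made for the example: the three metrics (lines added, files touched, author's prior commits), the toy history, and the choice of a hand‑written logistic regression. Real systems use far richer features and models.

```python
import math

# Hypothetical per-commit metrics: (lines_added, files_touched, author_prior_commits).
# Label 1 means the commit was later linked to a bug fix. All values are invented.
HISTORY = [
    ((12, 1, 200), 0),
    ((450, 9, 3), 1),
    ((30, 2, 150), 0),
    ((600, 14, 10), 1),
    ((8, 1, 80), 0),
    ((300, 7, 5), 1),
    ((25, 3, 120), 0),
    ((520, 11, 2), 1),
]


def scale(features):
    """Compress heavy-tailed count metrics with log1p."""
    return [math.log1p(value) for value in features]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=2000, learning_rate=0.05):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            x = scale(features)
            predicted = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = predicted - label
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict_risk(model, features):
    """Return the estimated probability that a change introduces a defect."""
    weights, bias = model
    x = scale(features)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


model = train(HISTORY)
print(predict_risk(model, (500, 10, 4)))   # large change by a new contributor
print(predict_risk(model, (10, 1, 180)))   # small change by an experienced one
```

In a real pipeline the `HISTORY` table would be rebuilt continuously from the version‑control log and the issue tracker, which is where most of the engineering effort goes.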

Defect Understanding

Prediction results are often hard to interpret. To understand the root causes of defects, we therefore analyze test cases, perform static analysis, and examine the relationships among them.

Defect Localization and Repair

After prediction and understanding, we face the problem directly: locating and fixing defects.

Traditional localization techniques include logging, assertions, breakpoints, debugging, and profiling.

More advanced techniques include spectrum‑based analysis built on test coverage, program‑analysis‑based methods, and machine‑learning‑driven data‑mining approaches.
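To make the spectrum‑based idea concrete, the sketch below ranks source lines by the Ochiai suspiciousness score, one commonly used spectrum formula. The source does not say which formula is meant, so Ochiai is an assumption here, and the coverage data is invented. A line scores high when failing tests execute it and passing tests mostly do not.

```python
import math


def ochiai_ranking(coverage, passed):
    """Rank lines by Ochiai suspiciousness.

    coverage: {test_name: set of line numbers the test executed}
    passed:   {test_name: True if the test passed, False if it failed}
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    all_lines = set().union(*coverage.values())
    scores = {}
    for line in all_lines:
        failed_cover = sum(1 for t, ls in coverage.items() if line in ls and not passed[t])
        passed_cover = sum(1 for t, ls in coverage.items() if line in ls and passed[t])
        # Ochiai: ef / sqrt((ef + nf) * (ef + ep)), where ef + nf = total_failed.
        denominator = math.sqrt(total_failed * (failed_cover + passed_cover))
        scores[line] = failed_cover / denominator if denominator else 0.0
    # Most suspicious first; ties broken by line number.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical spectrum: three tests over a five-line function.
COVERAGE = {"t1": {1, 2, 3}, "t2": {1, 2, 4}, "t3": {1, 4, 5}}
PASSED = {"t1": True, "t2": False, "t3": False}

for line, score in ochiai_ranking(COVERAGE, PASSED):
    print(line, round(score, 3))
```

Here line 4 comes out on top because both failing tests execute it and the passing test does not.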

Newer directions use information‑retrieval‑based localization, which compares bug reports with code either through text‑similarity methods or through deep neural networks and machine‑translation techniques that measure semantic similarity.
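The text‑similarity variant can be sketched as follows: treat the bug report as a query, treat each source file as a document, and rank files by TF‑IDF cosine similarity. The file names, file contents, and bug report are hypothetical, and the smoothed IDF weighting is just one reasonable choice.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text and camelCase or snake_case identifiers into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [token.lower() for token in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files):
    """Return (file_name, similarity) pairs, most similar first."""
    counts = {name: Counter(tokenize(text)) for name, text in files.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n_docs = len(files)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {name: {t: n * idf[t] for t, n in c.items()} for name, c in counts.items()}
    # Query terms that never occur in the code base carry no signal and are dropped.
    query = {t: n * idf[t] for t, n in Counter(tokenize(bug_report)).items() if t in idf}
    scores = {name: cosine(query, vector) for name, vector in vectors.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical code base and bug report.
FILES = {
    "cart.js": "function updateCartTotal(items) { return items.reduce(sumPrice, 0); }",
    "login.js": "function validatePassword(user, password) { return checkHash(password); }",
    "search.js": "function searchProducts(query) { return index.lookup(query); }",
}
REPORT = "Cart total is wrong after removing items"

for name, score in rank_files(REPORT, FILES):
    print(name, round(score, 3))
```

Purely lexical matching breaks down when the report and the code use different vocabulary, which is the gap the neural and machine‑translation approaches try to close.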

Once the defect is located, automated software repair techniques can search for or generate patches.

Automatic repair is roughly divided into four categories: heuristic search, human‑crafted templates, semantic constraints, and statistical analysis. Since defects are primarily code‑related, we must understand both defects and code.

Program Understanding

Program understanding, similar to defect localization, employs static analysis, dynamic analysis, and machine learning.

Program Analysis

Program analysis is a core technique for working with code.

Its main directions will be discussed in detail later; some of them require foundational knowledge and academic training.

Automated Program Repair

Automated program repair relies on program‑analysis knowledge and rule‑based analysis.

If a complete specification exists, meaning the problem is clearly understood and defined, we can perform repair based on classification.

Without a complete specification, we can use contracts as the specification, or write program contracts by hand.

If those are insufficient, we fall back on test‑suite‑based program repair.
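Test‑suite‑based repair is often organized as generate‑and‑validate: produce candidate patches, then keep one that makes every test pass. The sketch below is a deliberately tiny version that uses a single hand‑written template (swap one comparison operator). The buggy function, the test suite, and the template are all invented for illustration.

```python
import re

# Hypothetical buggy program: it is meant to return the larger argument.
BUGGY_SOURCE = '''
def max_of(a, b):
    if a < b:
        return a
    return b
'''

# The test suite acts as the (incomplete) specification: (arguments, expected).
TESTS = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]

OPERATORS = ["<", "<=", ">", ">=", "==", "!="]


def passes_all(source, func_name, tests):
    """Run a candidate program against the test suite."""
    namespace = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        return False


def candidate_patches(source):
    """Template: replace one comparison operator at a time."""
    for match in re.finditer(r"<=|>=|==|!=|<|>", source):
        for operator in OPERATORS:
            if operator != match.group():
                yield source[:match.start()] + operator + source[match.end():]


def repair(source, func_name, tests):
    """Return the first candidate that passes every test, or None."""
    for candidate in candidate_patches(source):
        if passes_all(candidate, func_name, tests):
            return candidate
    return None


print(repair(BUGGY_SOURCE, "max_of", TESTS))
```

A patch found this way is only as good as the tests. It can pass the whole suite and still be wrong on inputs the suite does not cover, which is why the stronger specifications mentioned above are preferred when they are available.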

Intelligent Program Synthesis

Bug repair ends here, but we can go one step further and generate programs automatically with program synthesis techniques.

We can learn from examples, synthesize based on code frameworks or rules, and even leverage natural‑language‑processing techniques for synthesis.
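Learning from examples can be illustrated with a very small enumerative synthesizer: given input/output pairs, it searches pipelines of primitive operations, shortest first, until one agrees with every example. The five‑operation DSL and the examples are assumptions made for this sketch. Practical synthesizers use much larger languages and prune the search aggressively.

```python
from itertools import product

# A tiny hypothetical DSL of unary integer operations.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}


def run(program, value):
    """Apply the named operations from left to right."""
    for name in program:
        value = PRIMITIVES[name](value)
    return value


def synthesize(examples, max_length=3):
    """Return the first shortest pipeline consistent with all examples, or None."""
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return list(program)
    return None


# Examples of the hidden target f(x) = 2 * (x + 1).
EXAMPLES = [(1, 4), (2, 6), (5, 12)]
program = synthesize(EXAMPLES)
print(program)           # ['inc', 'double']
print(run(program, 10))  # 22
```

The search space grows exponentially with pipeline length, which is one reason to guide synthesis with code frameworks, rules, or natural‑language hints.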

References

This field has become a hot research area in recent years, with many Chinese survey articles and numerous English papers.

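宫丽娜,姜淑娟,姜丽. 软件缺陷预测技术研究进展 [Research progress on software defect prediction techniques]. 软件学报 (Journal of Software), 2019, 30(10):3090-3114. http://www.jos.org.cn/1000-9825/5790.htm

蔡亮,范元瑞,鄢萌,夏鑫. 即时软件缺陷预测研究进展 [Research progress on just‑in‑time software defect prediction]. 软件学报 (Journal of Software), 2019, 30(5):1288-1307. http://www.jos.org.cn/1000-9825/5713.htm

李斌,贺也平,马恒太. 程序自动修复:关键问题及技术 [Automatic program repair: key problems and techniques]. 软件学报 (Journal of Software), 2019, 30(2):244-265. http://www.jos.org.cn/1000-9825/5657.htm

金芝,刘芳,李戈. 程序理解:现状与未来 [Program understanding: present and future]. 软件学报 (Journal of Software), 2019, 30(1):110-126. http://www.jos.org.cn/1000-9825/5643.htm

张健,张超,玄跻峰,熊英飞,王千祥,梁彬,李炼,窦文生,陈振邦,陈立前,蔡彦. 程序分析研究进展 [Research progress on program analysis]. 软件学报 (Journal of Software), 2019, 30(1):80-109. http://www.jos.org.cn/1000-9825/5651.htm

李晓卓,贺也平,马恒太. 缺陷理解研究:现状、问题与发展 [Defect understanding research: status, problems, and development]. 软件学报 (Journal of Software), 2020, 31(1):20-46. http://www.jos.org.cn/1000-9825/5887.htm

顾斌,于波,董晓刚,李晓锋,钟睿明,杨孟飞. 程序智能合成技术研究进展 [Research progress on intelligent program synthesis techniques]. 软件学报 (Journal of Software). http://www.jos.org.cn/1000-9825/6200.htm

李政亮,陈翔,蒋智威,顾庆. 基于信息检索的软件缺陷定位方法综述 [A survey of information‑retrieval‑based software defect localization methods]. 软件学报 (Journal of Software), 2021, 32(2):247-276. http://www.jos.org.cn/1000-9825/6130.htm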

Wong WE, Gao RZ, Li YH, Abreu R, Wotawa F. A survey on software fault localization. IEEE Transactions on Software Engineering, 2016, 42(8): 707-740. doi:10.1109/TSE.2016.2521368
