How AI Transforms Bug Prediction, Understanding, and Automatic Repair
This article lays out a three-track roadmap for tackling software defects. It starts with bug prediction using machine-learning models, moves on to defect understanding and localization through static and dynamic analysis, and ends with automated repair and program synthesis. Along the way it highlights key research directions and representative literature.
Seen as a roadmap, the field has roughly three pathways.
The first starts from the bug itself: predict bugs, then understand them, and finally attempt to repair them automatically. The second starts from the program, because bugs are ultimately problems in the program and fixing them requires understanding it; this pathway covers program understanding, program analysis, automated program repair, and program synthesis. The third is a more general direction that draws on other tools, for example building a software knowledge graph with knowledge-graph technology.
After outlining the overall map, we will examine the details of each branch.
Defect Prediction
The first step is to predict whether bugs exist, how many there might be, how severe they are, and how much time and manpower will be needed to resolve them.
At this stage we do not yet have a deep understanding of the bugs themselves; that is the job of the defect understanding stage. Prediction relies mainly on machine-learning and statistical methods. Although defect prediction is simpler than the later stages, it still poses many challenges. Building a real-time defect-prediction system, for example, means monitoring code changes, extracting a range of metrics from them, and then running statistical analysis and model training on those metrics.
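The pipeline described above (extract metrics from code changes, then train a model) can be sketched in a few lines. This is a minimal illustration, not a production predictor: the three metrics, their scaling, the hand-made commit history, and the function names `train` and `predict_risk` are all assumptions made for the example, and a real system would use far more data and an established library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with plain batch gradient descent."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def predict_risk(weights, bias, x):
    """Return the estimated probability that a change introduces a defect."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical, hand-made metrics per commit, scaled to roughly [0, 1]:
# (lines changed / 500, files touched / 20, prior bug fixes in touched files / 10)
HISTORY = [
    (0.02, 0.05, 0.0), (0.10, 0.10, 0.1), (0.05, 0.05, 0.0), (0.08, 0.15, 0.1),
    (0.60, 0.50, 0.6), (0.90, 0.70, 0.8), (0.75, 0.40, 0.5), (0.55, 0.65, 0.9),
]
WAS_BUGGY = [0, 0, 0, 0, 1, 1, 1, 1]  # label: did the commit later need a bug fix?

weights, bias = train(HISTORY, WAS_BUGGY)
small_change = predict_risk(weights, bias, (0.04, 0.05, 0.0))
large_change = predict_risk(weights, bias, (0.80, 0.60, 0.7))
print(f"small change risk: {small_change:.2f}, large change risk: {large_change:.2f}")
```

On this toy data the large, bug-prone change scores above 0.5 and the small one below it. The score alone does not say why a change is risky, which is the gap the next stage tries to close.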
Defect Understanding
Prediction results often lack interpretability: a model can flag code as risky without explaining why. To get at the root causes of defects, we therefore analyze test cases, perform static analysis, and examine the relationships among the results.
Defect Localization and Repair
After prediction and understanding, we face the problem directly: locating and fixing defects.
Traditional localization techniques include logging, assertions, breakpoints, debugging, and profiling.
More advanced techniques include spectrum-based analysis of test coverage, methods built on program analysis, and machine-learning-driven data-mining approaches.
Newer work explores information-retrieval-based localization, which matches bug reports against source code using either text-similarity methods or deep neural networks and machine-translation techniques that compare semantic similarity.
Once the defect has been located, automated program repair techniques can search for or generate patches.
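Spectrum-based localization can be illustrated with the Ochiai metric, one widely used suspiciousness formula: a line is more suspicious the more failing tests execute it and the fewer passing tests do. The sketch below assumes per-test line coverage is already available; the coverage data, test names, and the function name `ochiai_scores` are hypothetical.

```python
import math

def ochiai_scores(coverage, outcomes):
    """Rank lines by Ochiai suspiciousness, most suspicious first.

    coverage: test name -> set of executed line numbers
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    lines = set().union(*coverage.values())
    scores = {}
    for line in lines:
        failed_cover = sum(
            1 for test, cov in coverage.items() if line in cov and not outcomes[test]
        )
        passed_cover = sum(
            1 for test, cov in coverage.items() if line in cov and outcomes[test]
        )
        # Ochiai: ef / sqrt((ef + nf) * (ef + ep)), where ef + nf = total_failed.
        denom = math.sqrt(total_failed * (failed_cover + passed_cover))
        scores[line] = failed_cover / denom if denom else 0.0
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical coverage of a six-line function under four tests.
COVERAGE = {
    "test_pass_a": {1, 2, 3, 6},
    "test_pass_b": {1, 2, 3, 6},
    "test_fail_a": {1, 2, 4, 5, 6},
    "test_fail_b": {1, 2, 4, 6},
}
OUTCOMES = {
    "test_pass_a": True,
    "test_pass_b": True,
    "test_fail_a": False,
    "test_fail_b": False,
}

ranking = ochiai_scores(COVERAGE, OUTCOMES)
for line, score in ranking:
    print(f"line {line}: {score:.3f}")
```

Line 4 is executed by both failing tests and by no passing test, so it gets the top score of 1.0. Line 3 is executed only by passing tests and scores 0.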
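The text-similarity variant of information-retrieval-based localization can be sketched as follows: treat the bug report as a query, treat each source file as a document, and rank files by TF-IDF cosine similarity. This is a minimal sketch under stated assumptions. The file names, file contents, and bug report are invented, and the smoothed IDF weighting is one common choice rather than a method taken from the surveys cited here.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text and identifiers (parseDate, DATE_FORMAT) into lowercase words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)  # parseDate -> parse Date
    return re.findall(r"[a-z]+", spaced.lower())

def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_files(bug_report, files):
    """Return (file name, similarity) pairs, most similar first."""
    counts = {name: Counter(tokenize(text)) for name, text in files.items()}
    n_docs = len(counts)
    doc_freq = Counter(term for c in counts.values() for term in c)
    # Smoothed inverse document frequency: rare terms weigh more.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()
    }
    query_counts = Counter(tokenize(bug_report))
    query = {t: tf * idf[t] for t, tf in query_counts.items() if t in idf}
    scored = [(name, cosine(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda item: -item[1])

# Hypothetical source files, reduced to one line each.
FILES = {
    "cart.py": "def addItem(cart, item): cart.items.append(item)  # add item to shopping cart",
    "date_utils.py": "def parseDate(text): return datetime.strptime(text, DATE_FORMAT)  # parse date string",
    "login.py": "def checkPassword(user, password): return hash(password) == user.passwordHash",
}
BUG_REPORT = "Crash when parsing a date string with an invalid format"

file_ranking = rank_files(BUG_REPORT, FILES)
for name, score in file_ranking:
    print(f"{name}: {score:.3f}")
```

Only `date_utils.py` shares vocabulary with the report ("date", "string", "format"), so it ranks first. Note that "parsing" and "parse" do not match here, which is exactly the kind of lexical gap that motivates the semantic approaches.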
Automatic repair is roughly divided into four categories: heuristic search, human‑crafted templates, semantic constraints, and statistical analysis. Since defects are primarily code‑related, we must understand both defects and code.
Program Understanding
Like defect localization, program understanding employs static analysis, dynamic analysis, and machine learning.
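As a small taste of the static-analysis side, the sketch below uses Python's standard `ast` module to extract simple facts about a function without running it: its parameters, a rough count of decision points, and the functions it calls by name. The sample function `discount` and the helper `summarize_functions` are hypothetical, and the decision-point count is a crude proxy, not a full cyclomatic-complexity computation.

```python
import ast

SOURCE = '''
def discount(price, is_member):
    if price < 0:
        raise ValueError("negative price")
    if is_member and price > 100:
        return round(price * 0.9, 2)
    return price
'''

# Node types treated as decision points in this rough approximation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def summarize_functions(source):
    """Return per-function facts gathered purely from the syntax tree."""
    tree = ast.parse(source)
    summary = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            inner = list(ast.walk(node))
            calls = {
                n.func.id
                for n in inner
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            summary[node.name] = {
                "params": [arg.arg for arg in node.args.args],
                "decision_points": sum(isinstance(n, BRANCH_NODES) for n in inner),
                "calls": sorted(calls),
            }
    return summary

summary = summarize_functions(SOURCE)
print(summary)
```

Facts like these can feed later stages, for instance as input metrics for a prediction model or as hints about which code paths a test suite should cover.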
Program Analysis
Program analysis is the core technique for reasoning about code.
It spans many directions; a detailed discussion is left for later, and some of the directions require foundational knowledge and academic training.
Automated Program Repair
Automated program repair relies on program-analysis knowledge and rule-based analysis.
If a complete specification exists, meaning the problem is clearly understood and defined, we can perform repair based on classification.
Without a complete specification, we can try using contracts as specifications or manually write program contracts.
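A contract can be as simple as a precondition and a postcondition attached to a function. The sketch below is one hypothetical way to write such contracts in Python with a decorator. The names `contract`, `ContractViolation`, `buggy_max`, and `fixed_max` are invented for illustration and do not come from any particular contract library.

```python
import functools

class ContractViolation(AssertionError):
    """Raised when a precondition or postcondition does not hold."""

def contract(pre=None, post=None):
    """Attach a precondition on the arguments and a postcondition on the result."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} failed")
            result = func(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise ContractViolation(f"postcondition of {func.__name__} failed")
            return result
        return wrapper
    return decorate

def non_empty(values):
    return len(values) > 0

def is_maximum(result, values):
    return result in values and all(result >= v for v in values)

@contract(pre=non_empty, post=is_maximum)
def buggy_max(values):
    best = 0  # defect: wrong whenever every value is negative
    for v in values:
        if v > best:
            best = v
    return best

@contract(pre=non_empty, post=is_maximum)
def fixed_max(values):
    best = values[0]  # candidate repair: start from a real element
    for v in values:
        if v > best:
            best = v
    return best

try:
    buggy_max([-3, -1])
except ContractViolation as exc:
    print("defect exposed:", exc)
print("repaired result:", fixed_max([-3, -1]))
```

The same contract plays two roles here: it exposes the defect on an input no one wrote a test for, and it serves as the acceptance check for the candidate repair.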
If those are insufficient, we resort to test‑suite‑based program repair.
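Test-suite-based repair is often organized as generate-and-validate: produce candidate patches, run the tests against each one, and keep a candidate that passes. The sketch below applies a single hand-written template (swap a comparison operator) to a toy function. The function `is_adult`, the test cases, and the helper names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import copy

BUGGY_SOURCE = '''
def is_adult(age):
    return age > 18
'''

# Test suite as (argument, expected result); the last case exposes the defect.
TESTS = [(30, True), (10, False), (18, True)]

# Repair template: replace one comparison operator with each of these in turn.
CANDIDATE_OPS = [ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq]

def passes_all(tree, func_name, tests):
    """Compile a candidate tree and check it against every test case."""
    namespace = {}
    try:
        exec(compile(tree, "<candidate>", "exec"), namespace)
        return all(namespace[func_name](arg) == expected for arg, expected in tests)
    except Exception:
        return False  # a crashing candidate is simply rejected

def repair(source, func_name, tests):
    """Return the source of the first candidate that passes all tests, else None."""
    original = ast.parse(source)
    compare_count = sum(isinstance(n, ast.Compare) for n in ast.walk(original))
    for index in range(compare_count):
        for op in CANDIDATE_OPS:
            candidate = copy.deepcopy(original)
            compares = [n for n in ast.walk(candidate) if isinstance(n, ast.Compare)]
            compares[index].ops[0] = op()
            if passes_all(candidate, func_name, tests):
                return ast.unparse(candidate)
    return None

patch = repair(BUGGY_SOURCE, "is_adult", TESTS)
print(patch)
```

Here the search settles on `age >= 18`. A patch that passes the tests is only as trustworthy as the test suite itself: with a weaker suite, an incorrect candidate could pass just as well.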
Intelligent Program Synthesis
The bug-repair track ends here, but the program track can go one step further and use program synthesis techniques to generate code intelligently.
We can learn from examples, synthesize based on code frameworks or rules, and even leverage natural‑language‑processing techniques for synthesis.
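Learning from examples can be illustrated with enumerative synthesis: search a small space of programs, shortest first, for one that is consistent with every input-output example. The tiny language of four integer operations and the function names below are assumptions made for this sketch. Practical synthesizers use much richer languages and prune the search aggressively.

```python
from itertools import product

# A hypothetical mini-language: a program is a sequence of these operations.
PRIMITIVES = {
    "double": lambda x: x * 2,
    "inc": lambda x: x + 1,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}

def run(program, value):
    """Apply each operation of the program to the value, left to right."""
    for name in program:
        value = PRIMITIVES[name](value)
    return value

def synthesize(examples, max_length=3):
    """Return the shortest program consistent with all examples, else None."""
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None

# Input-output examples for the intended behavior f(x) = (x + 1) ** 2.
EXAMPLES = [(1, 4), (2, 9), (3, 16)]

program = synthesize(EXAMPLES)
print(program)          # ('inc', 'square')
print(run(program, 4))  # 25
```

With only the first example, `('double', 'double')` would also fit, which is why several examples are needed to pin down the intended program.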
References
This field has become a hot research area in recent years, with many Chinese survey articles and numerous English papers.
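宫丽娜, 姜淑娟, 姜丽. 软件缺陷预测技术研究进展 [Research progress of software defect prediction techniques]. 软件学报 (Journal of Software), 2019, 30(10): 3090-3114. http://www.jos.org.cn/1000-9825/5790.htm
蔡亮, 范元瑞, 鄢萌, 夏鑫. 即时软件缺陷预测研究进展 [Research progress of just-in-time software defect prediction]. 软件学报 (Journal of Software), 2019, 30(5): 1288-1307. http://www.jos.org.cn/1000-9825/5713.htm
李斌, 贺也平, 马恒太. 程序自动修复:关键问题及技术 [Automated program repair: key problems and techniques]. 软件学报 (Journal of Software), 2019, 30(2): 244-265. http://www.jos.org.cn/1000-9825/5657.htm
金芝, 刘芳, 李戈. 程序理解:现状与未来 [Program understanding: present and future]. 软件学报 (Journal of Software), 2019, 30(1): 110-126. http://www.jos.org.cn/1000-9825/5643.htm
张健, 张超, 玄跻峰, 熊英飞, 王千祥, 梁彬, 李炼, 窦文生, 陈振邦, 陈立前, 蔡彦. 程序分析研究进展 [Research progress of program analysis]. 软件学报 (Journal of Software), 2019, 30(1): 80-109. http://www.jos.org.cn/1000-9825/5651.htm
李晓卓, 贺也平, 马恒太. 缺陷理解研究:现状、问题与发展 [Defect understanding research: status, problems, and development]. 软件学报 (Journal of Software), 2020, 31(1): 20-46. http://www.jos.org.cn/1000-9825/5887.htm
顾斌, 于波, 董晓刚, 李晓锋, 钟睿明, 杨孟飞. 程序智能合成技术研究进展 [Research progress of intelligent program synthesis techniques]. 软件学报 (Journal of Software). http://www.jos.org.cn/1000-9825/6200.htm
李政亮, 陈翔, 蒋智威, 顾庆. 基于信息检索的软件缺陷定位方法综述 [A survey of information-retrieval-based software defect localization methods]. 软件学报 (Journal of Software), 2021, 32(2): 247-276. http://www.jos.org.cn/1000-9825/6130.htm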
Wong WE, Gao RZ, Li YH, Abreu R, Wotawa F. A survey on software fault localization. IEEE Transactions on Software Engineering, 2016, 42(8): 707-740. doi:10.1109/TSE.2016.2521368
Taobao Frontend Technology
