Static Analysis with Cppcheck: Custom Rule Development and Practical Experience
This article introduces static analysis, explains why cppcheck was chosen for Baidu's testing workflow, outlines its architecture, and walks through two custom rules (a printf misuse detector and a dynamic switch misuse rule), along with the practical challenges and results.
Static analysis (SA) is a white‑box testing technique that examines source code without executing it, offering advantages over unit testing and code review such as broader problem coverage, independence from developer skill, and higher efficiency.
The Baidu product testing pipeline runs its stages serially and does not trim any of them, so a defect found late forces a costly re-run; this makes early detection crucial. SA complements the existing tests by flagging anomalies at the code level before the pipeline starts.
Among available tools, cppcheck was selected over alternatives such as the commercial Coverity and the open-source Clang Static Analyzer because it is open-source, easily extensible, and does not require full compilation, making it suitable for large codebases.
Cppcheck processes source files through a simplecpp preprocessor, tokenizes the code, builds a SymbolDatabase, and provides utilities such as Token::Match and a data‑flow engine for value analysis.
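To make the pattern-matching idea concrete, here is a toy matcher in the spirit of Token::Match. It is a simplified sketch, not cppcheck's implementation: the Tokens alias and the matchAt, isName, and isStringLiteral helpers are hypothetical names, the token stream is a plain vector of strings, and only the %name%, %str%, and %any% placeholders are handled.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a token stream: one string per token (hypothetical type).
using Tokens = std::vector<std::string>;

// True if the token looks like an identifier.
static bool isName(const std::string& t) {
    if (t.empty()) return false;
    const unsigned char first = static_cast<unsigned char>(t[0]);
    if (!(std::isalpha(first) || t[0] == '_')) return false;
    for (unsigned char c : t)
        if (!(std::isalnum(c) || c == '_')) return false;
    return true;
}

// True if the token is a double-quoted string literal.
static bool isStringLiteral(const std::string& t) {
    return t.size() >= 2 && t.front() == '"' && t.back() == '"';
}

// Returns true if the tokens starting at `pos` match the space-separated
// pattern. Placeholders: %name% (identifier), %str% (string literal),
// %any% (any token). Everything else must match literally.
bool matchAt(const Tokens& toks, std::size_t pos, const std::string& pattern) {
    std::istringstream in(pattern);
    std::string want;
    while (in >> want) {
        if (pos >= toks.size()) return false;  // pattern is longer than the stream
        const std::string& tok = toks[pos];
        bool ok;
        if (want == "%name%")      ok = isName(tok);
        else if (want == "%str%")  ok = isStringLiteral(tok);
        else if (want == "%any%")  ok = true;
        else                       ok = (tok == want);
        if (!ok) return false;
        ++pos;
    }
    return true;
}
```

With the tokens of snprintf(p, 10, "%s aa"); in a Tokens vector, matchAt(toks, 0, "snprintf ( %name% ,") succeeds, while matchAt(toks, 0, "printf (") does not. A rule typically slides such a pattern along the token list to find the call sites it cares about.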
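Custom rule development follows a pattern: for file-level checks, override runChecks or runSimplifiedChecks; for whole-program analysis, override getFileInfo and analyseWholeProgram. The latter pair works much like Map-Reduce: getFileInfo summarizes each file on its own, and analyseWholeProgram then combines all the summaries.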
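The Map-Reduce shape of whole-program analysis can be sketched without any cppcheck dependency. The code below is an illustration under stated assumptions: FileSummary, summariseFile, and findUndefinedCalls are hypothetical names that play the roles of getFileInfo and analyseWholeProgram, and they do not mirror cppcheck's real signatures. The example check (calls to functions that no file defines) is also chosen only for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical per-file summary; cppcheck's real file-info types differ.
struct FileSummary {
    std::string file;
    std::set<std::string> defined;  // functions defined in this file
    std::set<std::string> called;   // functions called from this file
};

// "Map" step (role of getFileInfo): summarise one file in isolation.
// A real rule would walk the token list; here the facts are passed in.
FileSummary summariseFile(const std::string& file,
                          const std::vector<std::string>& definedNames,
                          const std::vector<std::string>& calledNames) {
    FileSummary s;
    s.file = file;
    s.defined.insert(definedNames.begin(), definedNames.end());
    s.called.insert(calledNames.begin(), calledNames.end());
    return s;
}

// "Reduce" step (role of analyseWholeProgram): combine all summaries and
// report calls whose target is not defined in any analysed file.
std::vector<std::string> findUndefinedCalls(const std::vector<FileSummary>& summaries) {
    std::set<std::string> allDefined;
    for (const auto& s : summaries)
        allDefined.insert(s.defined.begin(), s.defined.end());

    std::vector<std::string> findings;
    for (const auto& s : summaries)
        for (const auto& name : s.called)
            if (allDefined.count(name) == 0)
                findings.push_back(s.file + ": call to unknown function " + name);
    return findings;
}
```

The point of the split is that the map step never needs to see more than one file at a time, so files can be processed independently, and only the compact summaries have to be held together for the final pass.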
Example 1 implements a rule that detects misuse of bsl::string::appendf (e.g., calling appendf(title) without a format specifier). The rule scans each function scope, matches the target function call, extracts the format string, looks for format specifiers such as %s, and reports a bug when the arguments those specifiers require are missing. The sample below shows the same class of defect with snprintf, where %s has no matching argument:
#include <stdio.h>

int main() {
    char p[100];
    p[0] = 0;
    snprintf(p, 10, "%s aa");  /* bug: "%s" has no matching argument */
    return 0;
}
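The core of such a rule is comparing the number of arguments a format string expects with the number actually supplied. The following is a minimal sketch of that comparison, not the rule deployed at Baidu: expectedArgs and hasMissingArgs are hypothetical names, and the parser is deliberately simplified (it handles %%, flags, width and precision including '*', and length modifiers, but it does not validate malformed specifiers).

```cpp
#include <cstddef>
#include <string>

// Counts the arguments a printf-style format string expects.
// Simplified: after '%', everything up to the next conversion character is
// treated as flags/width/precision/length; each '*' consumes one extra int.
std::size_t expectedArgs(const std::string& fmt) {
    const std::string conversions = "diouxXeEfFgGaAcspn";
    std::size_t count = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '%') continue;
        ++i;  // step past '%'
        if (i < fmt.size() && fmt[i] == '%') continue;  // "%%" is a literal percent
        while (i < fmt.size() && conversions.find(fmt[i]) == std::string::npos) {
            if (fmt[i] == '*') ++count;  // width/precision taken from an argument
            ++i;
        }
        if (i < fmt.size()) ++count;  // reached a conversion character
    }
    return count;
}

// True if the call passes fewer arguments than the format string requires.
bool hasMissingArgs(const std::string& fmt, std::size_t supplied) {
    return supplied < expectedArgs(fmt);
}
```

For the sample above, expectedArgs("%s aa") is 1 while the call supplies 0 arguments after the format string, so hasMissingArgs("%s aa", 0) is true and the rule would report the call. In a real check, the supplied count would come from counting the argument tokens that follow the format string at the matched call site.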
During implementation, several cppcheck quirks were discovered: namespace handling can overwrite earlier declarations, child namespaces may not inherit parent symbols, and data‑flow analysis may skip loop bodies.
Example 2 targets incorrect usage of the DYNC_ dynamic switch, which can cause uninitialized variables and crashes when a switch changes state after initialization. Because cppcheck does not compile the code, a full function‑call graph is built manually by traversing class hierarchies and token streams, flagging any DYNC_ token encountered.
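The traversal can be sketched as a reachability search over a hand-built call graph. This is an illustration only: FunctionInfo, Program, and findDyncUses are hypothetical names, the graph is assumed to have been extracted already, and starting the search from a single entry function (such as an initialization routine) is an assumption made for the example rather than a detail taken from the deployed rule.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one function, extracted from its token stream.
struct FunctionInfo {
    std::vector<std::string> callees;  // names of functions it calls
    std::vector<std::string> tokens;   // identifier tokens in its body
};

// Hypothetical whole-program view: function name -> summary.
using Program = std::map<std::string, FunctionInfo>;

// Walks every function reachable from `entry` and records each token that
// starts with "DYNC_", formatted as "<function>: <token>".
std::set<std::string> findDyncUses(const Program& prog, const std::string& entry) {
    std::set<std::string> flagged;
    std::set<std::string> visited;
    std::vector<std::string> stack{entry};

    while (!stack.empty()) {
        const std::string fn = stack.back();
        stack.pop_back();
        if (!visited.insert(fn).second) continue;  // already handled (also breaks cycles)

        const auto it = prog.find(fn);
        if (it == prog.end()) continue;  // body not available, e.g. a library function

        for (const auto& tok : it->second.tokens)
            if (tok.rfind("DYNC_", 0) == 0)  // token begins with the prefix
                flagged.insert(fn + ": " + tok);
        for (const auto& callee : it->second.callees)
            stack.push_back(callee);
    }
    return flagged;
}
```

The visited set matters in practice: real call graphs contain recursion and mutual calls, and without it the walk would not terminate. Functions that use a DYNC_ token but are not reachable from the entry point are intentionally left unflagged in this sketch.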
The deployed rules have identified over 20 core‑risk bugs in printf‑related functions and more than 20 issues with dynamic switches, demonstrating that static analysis can catch bugs that manual code review often misses.
Future work includes adding rules for sort misuse and array‑bounds checking, with detailed rule specifications and implementations available at the provided internal URL.
Baidu Intelligent Testing