How CDFuzz’s Targeted Dictionary Boosts Grey‑Box Fuzzing Coverage by 16%

The award‑winning CDFuzz technique introduces a lightweight, targeted dictionary that needs no extra instrumentation, raises coverage by 16.1% on average, uncovers 37 previously unknown bugs, and shows that simple optimizations can outperform complex grey‑box fuzzing strategies across diverse file formats.

AntTech

Paper Award and Background

The joint research platform of Southern University of Science and Technology and Ant Group received the ACM SIGSOFT Outstanding Paper Award at ICSE 2025 for the paper "Tumbling Down the Rabbit Hole: How do Assisting Exploration Strategies Facilitate Grey‑box Fuzzing?". The work systematically reveals the effectiveness limits of auxiliary strategies in grey‑box fuzzing and introduces a customized targeted‑dictionary technique called CDFuzz that simultaneously improves coverage and vulnerability discovery.

Award announcement image

CDFuzz Overview

CDFuzz offers three core advantages:

No extra instrumentation: It automatically extracts key constants from the program’s control‑flow graph (CFG), removing the need for additional instrumentation, symbolic execution, or gradient solving.

Custom targeted dictionary: For each seed, it dynamically generates a dictionary aimed at the constant‑comparison constraints on that seed’s path, yielding an average coverage gain of 16.1% over the best existing strategy.

Zero compile/runtime overhead: Dictionary generation and application are seamlessly integrated into the fuzzing workflow without extra compilation or execution costs.
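
To make the idea of applying a dictionary during fuzzing concrete, here is a minimal sketch of a dictionary mutation step. It is not CDFuzz's implementation: the function name `dictionary_mutate`, the 50/50 choice between overwriting and inserting, and the sample seed are all assumptions made for illustration.

```python
import random
from typing import List


def dictionary_mutate(seed: bytes, tokens: List[bytes], rng: random.Random) -> bytes:
    """Apply one dictionary mutation to a seed (hypothetical helper).

    A token is picked at random and either overwrites the bytes at a random
    offset or is inserted there. The even split is an illustrative choice.
    """
    if not tokens:
        return seed  # nothing to apply
    token = rng.choice(tokens)
    pos = rng.randint(0, len(seed))  # inclusive upper bound: appending is allowed
    if rng.random() < 0.5:
        # Overwrite: replace len(token) bytes starting at pos (may extend the seed).
        return seed[:pos] + token + seed[pos + len(token):]
    # Insert: shift the tail of the seed to the right.
    return seed[:pos] + token + seed[pos:]


rng = random.Random(0)
mutated = dictionary_mutate(b"AAAAAAAA", [b"8BIM"], rng)
print(mutated)
```

The smaller and more relevant the token list, the larger the share of mutations that place a useful constant, which is the motivation for building the dictionary per seed rather than per program.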

Key Findings

Large‑scale experiments on nine fuzzing tools and multiple auxiliary strategies across 21 real‑world projects uncovered three major insights:

Over 90% of the constraints that the fuzzers managed to break through are constant comparisons, indicating that deep state exploration is often blocked by checks of the form input == CONSTANT.

Dictionary‑based strategies outperform expectations; the traditional AFLDict sometimes exceeds symbolic execution tools like QSYM in coverage.

Complex strategies hit depth limits: when constraint depth exceeds 20, symbolic execution success drops to 15%, while dictionary approaches remain unaffected.
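
The first finding can be illustrated with a toy experiment. The sketch below is an assumption-laden illustration, not a reproduction of the paper's setup: `target` is a made-up program guarded by a single 4‑byte constant check, and the two mutation strategies are deliberately simplistic.

```python
import random
from typing import List

MAGIC = b"8BIM"  # example constant borrowed from the article


def target(data: bytes) -> bool:
    """Toy program: deeper code is reachable only if the input starts with MAGIC."""
    return data[:4] == MAGIC


def random_hits(trials: int, rng: random.Random) -> int:
    """Count successes when the first four bytes are drawn uniformly at random."""
    return sum(
        target(bytes(rng.randrange(256) for _ in range(4))) for _ in range(trials)
    )


def dictionary_hits(trials: int, rng: random.Random, tokens: List[bytes]) -> int:
    """Count successes when a dictionary token overwrites a random offset of an 8-byte seed."""
    hits = 0
    for _ in range(trials):
        seed = bytearray(8)  # all-zero seed
        token = rng.choice(tokens)
        pos = rng.randrange(len(seed) - len(token) + 1)
        seed[pos:pos + len(token)] = token
        hits += target(bytes(seed))
    return hits


rng = random.Random(1234)
blind = random_hits(10_000, rng)
guided = dictionary_hits(10_000, rng, [MAGIC, b"\xde\xad\xbe\xef"])
print(f"chance per blind attempt: 1 in {256 ** 4}")
print(f"blind hits: {blind}, dictionary hits: {guided}")
```

A blind guess matches a 4‑byte constant with probability 1 in 4,294,967,296, whereas in this toy setup the dictionary mutation only has to pick the right token and the right offset.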

Technical Mechanism

CDFuzz implements a two‑stage process to create targeted dictionaries:

Static constant extraction: Using LLVM IR, it parses the program’s CFG to collect all constant values appearing in branch conditions (e.g., 0xdeadbeef, "8BIM").

Dynamic path feedback: Based on the execution path of a given input seed, it selects the subset of constants relevant to the current path constraints and builds a focused dictionary.
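
The two stages can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the table `BRANCH_CONSTANTS` is written by hand as a stand-in for what static analysis of the IR would produce, the block labels and the path are hypothetical, and the selection rule (keep constants from branches the seed visits) is one plausible reading of "relevant to the current path". The output uses the quoted, hex-escaped token style that AFL-family dictionaries accept.

```python
from typing import Dict, List

# Stage 1 (static), hand-written stand-in: constants that appear in branch
# conditions, keyed by hypothetical basic-block labels. A real tool would
# derive this from the program's IR. The byte order chosen for 0xdeadbeef
# is an assumption.
BRANCH_CONSTANTS: Dict[str, List[bytes]] = {
    "check_magic": [b"8BIM"],
    "check_version": [b"\x00\x01"],
    "check_trailer": [b"\xde\xad\xbe\xef"],
    "parse_sql": [b"SELECT"],
}


def targeted_dictionary(path: List[str]) -> List[bytes]:
    """Stage 2 (dynamic): keep only constants from branches on this seed's path."""
    seen = set()
    tokens: List[bytes] = []
    for block in path:
        for const in BRANCH_CONSTANTS.get(block, []):
            if const not in seen:  # de-duplicate while preserving path order
                seen.add(const)
                tokens.append(const)
    return tokens


def to_afl_entry(token: bytes) -> str:
    """Render a token as a quoted dictionary line, hex-escaping unsafe bytes."""
    out = []
    for b in token:
        ch = chr(b)
        if 32 <= b < 127 and ch not in '"\\':
            out.append(ch)
        else:
            out.append(f"\\x{b:02x}")
    return '"' + "".join(out) + '"'


path = ["entry", "check_magic", "check_trailer"]  # hypothetical execution path
tokens = targeted_dictionary(path)
lines = [to_afl_entry(t) for t in tokens]
print("\n".join(lines))
```

For this path the dictionary holds two tokens instead of the program-wide four, so each dictionary mutation is more likely to use a constant that matters for the seed being fuzzed.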

CDFuzz workflow diagram

Experimental Results

In 24‑hour testing sessions, CDFuzz demonstrated significant advantages:

Average coverage increase: 16.1% over the best existing strategy (AFL++Dict) across all projects, with a peak improvement of 26.2% on the strip project.

37 previously unknown bugs: These include heap overflows and uninitialized memory issues; 9 have been officially confirmed and 7 already patched.

Cross‑format applicability: Stable performance across 10 file formats such as ELF, JPEG, and SQL.

Conclusion and Future Work

CDFuzz replaces heavyweight constraint solvers with a lightweight targeted dictionary, proving that minimal‑overhead auxiliary strategies can dramatically boost fuzzing efficiency. The authors plan to continue exploring lightweight optimizations, further enhance industrial‑grade testing tools, and integrate the approach into Ant Group’s real‑world security infrastructure.
