Fuzzing Primer

This page explains coverage-guided fuzzing from first principles. If you already know what fuzzing is and want to learn how Vitiate implements it, skip to How Vitiate Works.

Fuzzing is a testing technique where a program is fed automatically generated inputs to find bugs. Instead of writing specific test cases by hand, you describe how to feed input to the code and let a fuzzer generate millions of variations.

The core loop is simple:

  1. Generate an input
  2. Run the target with that input
  3. Observe what happens (crash, timeout, assertion failure)
  4. Repeat

A crash means the input triggered a bug. Save the input, fix the bug, and move on.
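The four steps above can be sketched in a few lines. This is an illustrative toy, not how Vitiate is implemented: `target` is a hypothetical function with a planted bug, inputs are plain random bytes, and timeouts are not modeled.

```python
import random


def target(data: bytes) -> None:
    """Hypothetical target with a planted bug: it rejects a leading '!'."""
    if data[:1] == b"!":
        raise ValueError("unexpected leading '!'")


def random_fuzz(target, iterations: int, seed: int = 0) -> list:
    """Run the generate/run/observe loop and return every crashing input."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    crashes = []
    for _ in range(iterations):
        # 1. Generate an input: 1 to 8 random bytes.
        length = rng.randrange(1, 9)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            # 2. Run the target with that input.
            target(data)
        except Exception:
            # 3. Observe: an unhandled exception counts as a crash.
            crashes.append(data)
        # 4. Repeat.
    return crashes
```

Calling `random_fuzz(target, 5000)` returns the inputs that raised, each of which begins with `!`.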

Random Testing vs. Coverage-Guided Fuzzing

Naive random testing generates inputs from scratch each time. It can find shallow bugs but struggles to reach deep code paths because it has no feedback about what the program is doing internally.

Coverage-guided fuzzing adds a critical feedback loop: the fuzzer instruments the target to track which code paths each input exercises. When an input reaches new code that no previous input covered, it is saved to a corpus. Future inputs are generated by mutating corpus entries - flipping bits, inserting bytes, splicing inputs together.

This creates an evolutionary process. The corpus grows to cover more and more of the program, and mutations explore the neighborhood of each covered path. Over time, the fuzzer reaches code that random generation would take astronomical time to hit.
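A minimal sketch of that feedback loop, under loud assumptions: the target below is hypothetical and instrumented by hand (each branch records an ID into a set), and the only mutation is a single-byte overwrite. Real fuzzers instrument automatically and use many more mutation strategies.

```python
import random


def target(data: bytes, coverage: set) -> None:
    """Hypothetical hand-instrumented target: each branch records an ID."""
    if len(data) > 0 and data[0] == ord("F"):
        coverage.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            coverage.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                coverage.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("bug reached")


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (data must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, max_iterations: int, seed: int = 0):
    """Return (crashing input or None, final corpus)."""
    rng = random.Random(seed)
    corpus = [seed_input]
    seen = set()  # every branch ID covered so far
    for _ in range(max_iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = set()
        try:
            target(candidate, coverage)
        except RuntimeError:
            return candidate, corpus
        if not coverage <= seen:
            # The candidate reached a branch no earlier input reached:
            # keep it so later mutations can build on it.
            seen |= coverage
            corpus.append(candidate)
    return None, corpus
```

Starting from four zero bytes, `fuzz(b"\x00" * 4, 200_000)` matches the magic value one byte at a time, because each partial match is saved to the corpus and mutated further. Purely random four-byte inputs would hit the same branch only once in 256^4 (about 4.3 billion) attempts on average.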

Coverage-guided fuzzers typically track edge coverage: which branches in the control flow graph were taken. An edge is a transition from one basic block to another. Two inputs that follow different branches through an if statement cover different edges even if they execute the same lines of code.
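The difference between block coverage and edge coverage can be seen in a small hand-instrumented function. The block names and the `trace` list are illustrative assumptions; a real fuzzer derives this information from compiler or runtime instrumentation.

```python
def classify(x: int, trace: list) -> int:
    """Hypothetical target that logs each basic block it enters."""
    trace.append("entry")
    result = 0
    if x > 0:
        trace.append("then")
        result = 1
    trace.append("exit")
    return result


def edges(trace: list) -> set:
    """Edges are the transitions between consecutive basic blocks."""
    return set(zip(trace, trace[1:]))


positive_trace, negative_trace = [], []
classify(5, positive_trace)    # blocks: entry, then, exit
classify(-5, negative_trace)   # blocks: entry, exit
```

The negative input visits no block that the positive input missed, so block coverage alone would discard it. It does take the edge from `entry` straight to `exit`, which the positive input never takes, so an edge-tracking fuzzer keeps it.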

The corpus is the set of inputs the fuzzer has found interesting (i.e., they covered new edges). It starts with optional seed inputs you provide and grows as the fuzzer discovers new coverage. A well-maintained corpus is valuable - it represents the fuzzer’s accumulated knowledge about the target.

Mutations are small transformations applied to existing corpus entries to generate new inputs:

  • Bit flips and byte substitutions: change individual bytes
  • Block insertion and deletion: add or remove chunks
  • Splicing: combine parts of two corpus entries
  • Dictionary token insertion: inject known-significant byte patterns (like :// for URL parsers)
  • Comparison-guided replacement: learn values from if (x == "expected") comparisons and try them
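The first four kinds of mutation in the list above can be sketched as small functions over `bytes`. The function names and the `max_len` parameter are hypothetical, chosen for illustration. Comparison-guided replacement is left out because it needs instrumentation of the target's comparisons.

```python
import random


def flip_bit(data: bytes, rng: random.Random) -> bytes:
    """Flip a single random bit."""
    if not data:
        return data
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    return bytes(buf)


def insert_block(data: bytes, rng: random.Random, max_len: int = 4) -> bytes:
    """Insert 1..max_len random bytes at a random position."""
    pos = rng.randrange(len(data) + 1)
    block = bytes(rng.randrange(256) for _ in range(rng.randrange(1, max_len + 1)))
    return data[:pos] + block + data[pos:]


def delete_block(data: bytes, rng: random.Random, max_len: int = 4) -> bytes:
    """Remove up to max_len bytes starting at a random position."""
    if len(data) < 2:
        return data
    start = rng.randrange(len(data))
    length = rng.randrange(1, max_len + 1)
    return data[:start] + data[start + length:]


def splice(a: bytes, b: bytes, rng: random.Random) -> bytes:
    """Join a prefix of one corpus entry to a suffix of another."""
    return a[:rng.randrange(len(a) + 1)] + b[rng.randrange(len(b) + 1):]


def insert_token(data: bytes, rng: random.Random, dictionary: list) -> bytes:
    """Insert a known-significant token from a dictionary."""
    pos = rng.randrange(len(data) + 1)
    return data[:pos] + rng.choice(dictionary) + data[pos:]
```

For example, `insert_token(b"http example", random.Random(1), [b"://"])` places `://` somewhere in the input, which gives a URL parser a much better chance of getting past its scheme check than random bytes would.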

When an input causes an unhandled exception (or timeout, or assertion failure), the fuzzer saves it as a crash artifact. These files contain the exact bytes that triggered the bug and serve as reproducible test cases.
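One possible way to persist and replay such an artifact is sketched below. The `crash-<hash>` file name and the directory layout are assumptions made for illustration; they are not taken from Vitiate.

```python
import hashlib
from pathlib import Path


def save_crash(data: bytes, directory: Path) -> Path:
    """Write the exact crashing bytes to a file named after their hash."""
    directory.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    path = directory / f"crash-{digest}"
    # Content-addressed name: saving the same input twice yields one file.
    path.write_bytes(data)
    return path


def reproduce(target, path: Path) -> None:
    """Replay a saved artifact; the original exception should recur."""
    target(path.read_bytes())
```

Naming the file after a hash of its contents means duplicate crashes collapse into a single file, and `reproduce` can later serve as a regression test for the fix.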

Fuzzing is most effective on code that:

  • Parses untrusted input: JSON, XML, HTML, URLs, binary protocols, configuration files
  • Performs complex validation: input sanitizers, schema validators, type coercers
  • Handles serialization/deserialization: encode/decode cycles, format converters
  • Has many edge cases: state machines, regular expressions, date/time parsing
  • Has security implications: anything processing data from users, APIs, or the network
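As an illustration of a target in the parsing and serialization categories, here is a sketch of a round-trip check over Python's standard `json` module. The function name is hypothetical, and treating malformed input as an expected rejection is a design assumption of this example.

```python
import json


def fuzz_json_roundtrip(data: bytes) -> None:
    """Hypothetical fuzz target: decode, parse, then check a round trip."""
    try:
        value = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        # Rejecting malformed input is correct behavior, not a bug.
        return
    # Invariant: encoding then decoding a parsed value gives it back.
    assert json.loads(json.dumps(value)) == value
```

Well-formed inputs such as `b'{"a": [1, 2]}'` pass, and garbage such as `b"\xff"` is rejected quietly. The input `b"NaN"` fails the assertion: Python's `json` accepts `NaN` by default, and NaN never compares equal to itself. Whether that counts as a bug depends on the invariant you intended, but it is the kind of case a fuzzer surfaces without being told to look for it.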

Fuzzing complements - but does not replace - unit tests and property-based tests. Unit tests verify specific known behaviors. Property-based tests check invariants across random inputs. Fuzzing explores the unknown: it finds inputs you would never think to test.

Property-based testing and coverage-guided fuzzing both generate random inputs, but they differ in important ways:

|           | Property-Based Testing          | Coverage-Guided Fuzzing              |
|-----------|---------------------------------|--------------------------------------|
| Feedback  | None (random generation)        | Edge coverage guides mutation        |
| Corpus    | No persistent state             | Grows over time, reused across runs  |
| Depth     | Shallow (relies on generators)  | Deep (evolves toward new coverage)   |
| Speed     | Hundreds of inputs per second   | Thousands of inputs per second       |
| Duration  | Seconds per test run            | Minutes to hours (longer = better)   |
| Strengths | Testing invariants, shrinking   | Finding crashes, exploring edge cases |

The two techniques are complementary. Use property-based tests for checking invariants across structured inputs. Use fuzzing for finding crashes and exploring the uncharted corners of your code.

  • The Fuzzing Book - comprehensive textbook on fuzzing techniques, from random generation through coverage-guided and grammar-based approaches
  • AFL Technical Details - Michal Zalewski’s original description of the instrumentation, mutation, and scheduling strategies behind AFL
  • Structure-Aware Fuzzing - Google’s guide to fuzzing targets that need structured inputs rather than raw bytes
  • LibAFL Book - documentation for the LibAFL framework that powers Vitiate’s mutation engine