Fuzzing Primer

This page explains coverage-guided fuzzing from first principles. If you already know what fuzzing is and want to learn how Vitiate implements it, skip to How Vitiate Works.

What Is Fuzzing?

Fuzzing is a testing technique where a program is fed automatically generated inputs to find bugs. Instead of writing specific test cases by hand, you describe how to feed input to the code and let a fuzzer generate millions of variations.

The core loop is simple:

1. Generate an input
2. Run the target with that input
3. Observe what happens (crash, timeout, assertion failure)
4. Repeat

A crash means the input triggered a bug. Save the input, fix the bug, and move on.
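The loop above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not how any particular fuzzer is implemented: `fuzzRandom`, `randomInput`, `makeRng`, and the buggy `parseHeader` target are hypothetical names, inputs are capped at 8 bytes, a thrown exception stands in for a crash, and timeouts are not handled.

```typescript
type Target = (data: Uint8Array) => void;

// Small seeded PRNG (mulberry32) so that runs are repeatable.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Generate an input from scratch: 0 to maxLen random bytes.
function randomInput(rng: () => number, maxLen: number): Uint8Array {
  const input = new Uint8Array(Math.floor(rng() * (maxLen + 1)));
  for (let i = 0; i < input.length; i++) {
    input[i] = Math.floor(rng() * 256);
  }
  return input;
}

// Returns the first crashing input, or null if none was found.
function fuzzRandom(target: Target, iterations: number, seed: number): Uint8Array | null {
  const rng = makeRng(seed);
  for (let i = 0; i < iterations; i++) { // 4. repeat
    const input = randomInput(rng, 8);   // 1. generate an input
    try {
      target(input);                     // 2. run the target
    } catch {
      return input;                      // 3. observe the crash, keep the input
    }
  }
  return null;
}

// Hypothetical buggy target: throws when the first byte is zero.
function parseHeader(data: Uint8Array): void {
  if (data.length > 0 && data[0] === 0) {
    throw new Error("bug: zero header byte");
  }
}
```

Calling `fuzzRandom(parseHeader, 100000, 1)` is expected to return a crashing input quickly, because roughly one random input in 300 starts with a zero byte. A bug guarded by several specific bytes would be far less likely to surface this way.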

Random Testing vs. Coverage-Guided Fuzzing

Naive random testing generates inputs from scratch each time. It can find shallow bugs but struggles to reach deep code paths because it has no feedback about what the program is doing internally.

Coverage-guided fuzzing adds a critical feedback loop: the fuzzer instruments the target to track which code paths each input exercises. When an input reaches new code that no previous input covered, it is saved to a corpus. Future inputs are generated by mutating corpus entries: flipping bits, inserting bytes, or splicing inputs together.

This creates an evolutionary process. The corpus grows to cover more and more of the program, and mutations explore the neighborhood of each covered path. Over time, the fuzzer reaches code that random generation would take astronomical time to hit.
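The evolutionary process can be sketched as follows. This is an illustrative toy, not Vitiate's implementation: the target is instrumented by hand (a real fuzzer inserts the coverage tracking automatically), the only mutation is a single-byte overwrite, and all names (`target`, `mutate`, `fuzzGuided`) are hypothetical.

```typescript
// Small seeded PRNG (mulberry32) so that runs are repeatable.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical hand-instrumented target: each branch records an edge ID.
function target(data: Uint8Array, coverage: Set<number>): void {
  coverage.add(0);
  if (data[0] !== 0x46) return; // 'F'
  coverage.add(1);
  if (data[1] !== 0x55) return; // 'U'
  coverage.add(2);
  if (data[2] !== 0x5a) return; // 'Z'
  coverage.add(3);
  if (data[3] === 0x5a) throw new Error("bug: input starts with FUZZ");
}

// Mutation: copy the parent and overwrite one random byte.
function mutate(parent: Uint8Array, rng: () => number): Uint8Array {
  const child = parent.slice();
  const position = Math.floor(rng() * child.length);
  child[position] = Math.floor(rng() * 256);
  return child;
}

function fuzzGuided(
  seedInput: Uint8Array,
  iterations: number,
  rngSeed: number,
): { crash: Uint8Array | null; corpus: Uint8Array[] } {
  const rng = makeRng(rngSeed);
  const corpus: Uint8Array[] = [seedInput];
  const seenEdges = new Set<number>();
  target(seedInput, seenEdges); // baseline coverage; assumes the seed does not crash

  for (let i = 0; i < iterations; i++) {
    const parent = corpus[Math.floor(rng() * corpus.length)] ?? seedInput;
    const child = mutate(parent, rng);
    const coverage = new Set<number>();
    try {
      target(child, coverage);
    } catch {
      return { crash: child, corpus };
    }
    // Feedback step: keep the child only if it reached an edge never seen before.
    const before = seenEdges.size;
    coverage.forEach((edge) => seenEdges.add(edge));
    if (seenEdges.size > before) corpus.push(child);
  }
  return { crash: null, corpus };
}
```

Starting from `fuzzGuided(new Uint8Array(4), 200000, 1)`, the corpus grows one matched byte at a time (`F`, then `FU`, then `FUZ`), so under these assumptions the crash takes on the order of ten thousand executions in expectation. Guessing all four bytes blindly succeeds about once in 2^32 (roughly 4.3 billion) attempts.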

Key Concepts

Edge Coverage

Coverage-guided fuzzers typically track edge coverage: which transitions in the control flow graph were taken. An edge is a transition from one basic block to another. This is finer-grained than line coverage: if every earlier input entered the body of an if statement, the first input that skips the body takes a new edge even though every line it executes has been seen before.
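A small sketch makes the difference between block coverage and edge coverage concrete. The block labels and the helper names (`classify`, `edgesOf`, `newItems`) are invented for illustration; real instrumentation assigns numeric IDs rather than strings.

```typescript
// Hypothetical instrumented function: returns the basic blocks it executed.
function classify(n: number): string[] {
  const trace = ["entry"];
  if (n > 0) {
    trace.push("positive");
  }
  trace.push("exit");
  return trace;
}

// An edge is a pair of consecutive blocks in the trace.
function edgesOf(trace: string[]): string[] {
  const edges: string[] = [];
  for (let i = 1; i < trace.length; i++) {
    edges.push(trace[i - 1] + "->" + trace[i]);
  }
  return edges;
}

// Items of `current` that are not yet in `seen`.
function newItems(seen: Set<string>, current: string[]): string[] {
  return current.filter((item) => !seen.has(item));
}

const firstTrace = classify(5);   // entry, positive, exit
const secondTrace = classify(-1); // entry, exit

// Block coverage sees nothing new in the second input...
const newBlocks = newItems(new Set(firstTrace), secondTrace);
// ...but edge coverage notices the transition that skips the if body.
const newEdges = newItems(new Set(edgesOf(firstTrace)), edgesOf(secondTrace));

console.log(newBlocks.length); // 0
console.log(newEdges);         // one new edge: entry->exit
```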

Corpus

The corpus is the set of inputs the fuzzer has found interesting (i.e., they covered new edges). It starts with optional seed inputs you provide and grows as the fuzzer discovers new coverage. A well-maintained corpus is valuable because it represents the fuzzer's accumulated knowledge about the target.

Mutations

Mutations are small transformations applied to existing corpus entries to generate new inputs:

- Bit flips and byte substitutions: change individual bits or bytes
- Block insertion and deletion: add or remove chunks
- Splicing: combine parts of two corpus entries
- Dictionary token insertion: inject known-significant byte patterns (like :// for URL parsers)
- Comparison-guided replacement: learn values from if (x == "expected") comparisons and try them
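Most of the mutations listed above are simple byte-array operations. The sketch below shows one possible shape for them, with positions passed in explicitly so the results are easy to follow; a real fuzzer would pick positions and operators at random. All function names are hypothetical, the helpers assume ASCII text, and comparison-guided replacement is omitted because it depends on instrumentation.

```typescript
// ASCII-only helpers for building and printing byte inputs.
function bytes(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) out[i] = text.charCodeAt(i) & 0xff;
  return out;
}
function toText(data: Uint8Array): string {
  return data.reduce((text, byte) => text + String.fromCharCode(byte), "");
}
function concat(...parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

// Bit flip: invert a single bit, leaving the original untouched.
function flipBit(input: Uint8Array, bitIndex: number): Uint8Array {
  const mask = 1 << (bitIndex & 7);
  return input.map((byte, i) => (i === bitIndex >> 3 ? byte ^ mask : byte));
}

// Block insertion; with a dictionary token as `block`, this is token insertion.
function insertBlock(input: Uint8Array, offset: number, block: Uint8Array): Uint8Array {
  return concat(input.subarray(0, offset), block, input.subarray(offset));
}

// Block deletion: drop `length` bytes starting at `offset`.
function deleteBlock(input: Uint8Array, offset: number, length: number): Uint8Array {
  return concat(input.subarray(0, offset), input.subarray(offset + length));
}

// Splicing: the head of one corpus entry joined to the tail of another.
function splice(a: Uint8Array, b: Uint8Array, cutA: number, cutB: number): Uint8Array {
  return concat(a.subarray(0, cutA), b.subarray(cutB));
}

console.log(toText(flipBit(bytes("A"), 0)));                             // @ (0x41 becomes 0x40)
console.log(toText(insertBlock(bytes("httpexample"), 4, bytes("://")))); // http://example
console.log(toText(deleteBlock(bytes("abcdef"), 1, 3)));                 // aef
console.log(toText(splice(bytes("GET /a"), bytes("POST /b"), 4, 5)));    // GET /b
```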

Crash Artifacts

When an input causes an unhandled exception (or timeout, or assertion failure), the fuzzer saves it as a crash artifact. These files contain the exact bytes that triggered the bug and serve as reproducible test cases.
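As a rough sketch of the idea, the code below captures a failing input together with the error it produced and then replays it. It keeps everything in memory for brevity, whereas a real fuzzer writes the bytes to a file. The `CrashArtifact` shape and the `crash-<hash>` naming scheme are assumptions made for this example, not a documented format, and timeouts are not covered.

```typescript
interface CrashArtifact {
  name: string;     // hypothetical naming scheme: "crash-" + content hash
  data: Uint8Array; // the exact bytes that triggered the failure
  message: string;  // what the target threw
}

// FNV-1a, a simple non-cryptographic 32-bit hash, used here only for naming.
function fnv1a(data: Uint8Array): string {
  const hash = data.reduce(
    (h, byte) => Math.imul(h ^ byte, 0x01000193) >>> 0,
    0x811c9dc5,
  );
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Runs the target once; returns an artifact if it throws, otherwise null.
function captureCrash(
  target: (data: Uint8Array) => void,
  input: Uint8Array,
): CrashArtifact | null {
  try {
    target(input);
    return null;
  } catch (err) {
    return {
      name: "crash-" + fnv1a(input),
      data: input.slice(), // copy, so later mutations cannot alter the artifact
      message: err instanceof Error ? err.message : String(err),
    };
  }
}

// Replaying an artifact re-runs the target on the saved bytes.
function reproduces(
  target: (data: Uint8Array) => void,
  artifact: CrashArtifact,
): boolean {
  return captureCrash(target, artifact.data) !== null;
}
```

Naming artifacts after a hash of their content means the same crashing input always maps to the same name, so duplicates collapse naturally. Once the bug is fixed, `reproduces` returns false for the saved bytes, which is what makes an artifact usable as a regression test.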

When Is Fuzzing Most Useful?

Fuzzing is most effective on code that:

- Parses untrusted input: JSON, XML, HTML, URLs, binary protocols, configuration files
- Performs complex validation: input sanitizers, schema validators, type coercers
- Handles serialization/deserialization: encode/decode cycles, format converters
- Has many edge cases: state machines, regular expressions, date/time parsing
- Has security implications: anything processing data from users, APIs, or the network
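To make the parsing and serialization cases concrete, here is a sketch of two fuzz targets for a toy hex codec. The codec and the target names are invented for illustration, and the bare function signature is an assumption rather than Vitiate's API. The first target feeds arbitrary text to the decoder and treats a documented rejection (`RangeError`) as normal; the second checks an encode/decode round trip.

```typescript
// Hypothetical codec under test: lowercase hex encoding of bytes.
function encode(data: Uint8Array): string {
  return data.reduce(
    (text, byte) => text + (byte < 16 ? "0" : "") + byte.toString(16),
    "",
  );
}

function decode(text: string): Uint8Array {
  if (!/^([0-9a-f]{2})*$/.test(text)) {
    throw new RangeError("not a lowercase hex string");
  }
  const out = new Uint8Array(text.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(text.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

// Target 1: untrusted input. Rejecting bad input is expected behavior,
// so only errors other than RangeError count as findings.
function decodeTarget(data: Uint8Array): void {
  const text = data.reduce((s, byte) => s + String.fromCharCode(byte), "");
  try {
    decode(text);
  } catch (err) {
    if (!(err instanceof RangeError)) throw err;
  }
}

// Target 2: round trip. Any input must survive encode followed by decode.
function roundTripTarget(data: Uint8Array): void {
  const decoded = decode(encode(data));
  const same =
    decoded.length === data.length &&
    decoded.every((byte, i) => byte === data[i]);
  if (!same) throw new Error("round-trip mismatch");
}
```

Separating expected rejections from unexpected failures matters: without the `RangeError` filter, almost every random input would be reported as a crash and the real bugs would be buried.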

Fuzzing complements, but does not replace, unit tests and property-based tests. Unit tests verify specific known behaviors. Property-based tests check invariants across random inputs. Fuzzing explores the unknown: it finds inputs you would never think to test.

Fuzzing vs. Property-Based Testing

Both generate random inputs, but they differ in important ways:

| Aspect | Property-Based Testing | Coverage-Guided Fuzzing |
| --- | --- | --- |
| Feedback | None (random generation) | Edge coverage guides mutation |
| Corpus | No persistent state | Grows over time, reused across runs |
| Depth | Shallow (relies on generators) | Deep (evolves toward new coverage) |
| Speed | Hundreds per second | Thousands per second |
| Duration | Seconds per test run | Minutes to hours (longer = better) |
| Strengths | Testing invariants, shrinking | Finding crashes, exploring edge cases |

The two techniques are complementary. Use property-based tests for checking invariants across structured inputs. Use fuzzing for finding crashes and exploring the uncharted corners of your code.