Use this skill to fuzz open source Python software projects using Atheris.
This skill provides the agent with the knowledge and tools to write, build, and validate fuzz targets for Python projects integrated into OSS-Fuzz. Python fuzzing uses Atheris, which wraps libFuzzer and instruments Python bytecode for coverage-guided fuzzing.
Python projects must use the Python base builder image:
FROM gcr.io/oss-fuzz-base/base-builder-python
Set language: python in project.yaml.
A Python fuzz target is a .py file (named fuzz_<target>.py by convention)
that follows this pattern:
#!/usr/bin/python3
import sys
import atheris

# Import the module under test inside an atheris.instrument_imports() block,
# or import it first and call atheris.instrument_all() afterwards, so that
# its bytecode is instrumented.
import my_module  # placeholder for the package under test


def TestOneInput(data):
    fdp = atheris.FuzzedDataProvider(data)
    # Extract typed values from the raw fuzzer bytes.
    value = fdp.ConsumeString(128)
    try:
        my_module.parse(value)
    except (ValueError, TypeError, KeyError):
        # Expected exceptions from invalid input are not bugs.
        pass


def main():
    atheris.instrument_all()  # instrument all loaded Python modules
    atheris.Setup(sys.argv, TestOneInput, enable_python_coverage=True)
    atheris.Fuzz()


if __name__ == "__main__":
    main()
When instrument_all() is too broad (e.g. causes conflicts with C extensions),
instrument specific modules using the instrument_imports() context manager:
import atheris

with atheris.instrument_imports():
    import my_module
    import my_module.subpackage
atheris.FuzzedDataProvider splits the raw byte stream into typed values:
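| Method | Description |
|---|---|
| ConsumeBytes(count) | bytes, up to count bytes long |
| ConsumeIntListInRange(count, 0, 255) | list[int] of count byte values |
| ConsumeString(count) | str; on Python 3 an alias for ConsumeUnicode (may contain surrogates) |
| ConsumeUnicode(count) | str of up to count characters; may contain surrogates |
| ConsumeUnicodeNoSurrogates(count) | str of up to count characters; never contains surrogates |
| ConsumeInt(nbytes) | signed int built from nbytes bytes |
| ConsumeIntInRange(min, max) | int in the closed range [min, max] |
| ConsumeFloat() | float; may be NaN or infinite |
| ConsumeBool() | bool |
| ConsumeIntList(count, nbytes) | list of count signed ints, each built from nbytes bytes |
| PickValueInList(lst) | one element of lst, selected by the input bytes |
| ConsumeBytes(fdp.remaining_bytes()) | all remaining bytes |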
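
To illustrate how these methods combine, the following minimal sketch derives a structured argument set from a provider. The helper name build_parse_args and its fields (strict, encoding, max_depth, text) are hypothetical and chosen only for illustration. The fdp argument is assumed to be an atheris.FuzzedDataProvider; Atheris is deliberately not imported here so the helper can also be exercised with a stand-in object.

```python
def build_parse_args(fdp):
    """Derive structured arguments from a FuzzedDataProvider-like object.

    fdp is assumed to be an atheris.FuzzedDataProvider. The field names
    are hypothetical and stand in for a real target's parameters.
    """
    # Consume the small control values first.
    strict = fdp.ConsumeBool()
    encoding = fdp.PickValueInList(["utf-8", "latin-1", "ascii"])
    max_depth = fdp.ConsumeIntInRange(1, 32)
    # Spend whatever input is left on the text payload. The NoSurrogates
    # variant yields a str that can always be encoded as UTF-8.
    text = fdp.ConsumeUnicodeNoSurrogates(fdp.remaining_bytes())
    return {
        "strict": strict,
        "encoding": encoding,
        "max_depth": max_depth,
        "text": text,
    }
```

Inside a fuzz target this would typically be called as build_parse_args(atheris.FuzzedDataProvider(data)) at the top of TestOneInput, with the resulting values passed on to the API under test.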
build.sh installs the target package and uses the compile_python_fuzzer
helper to turn each fuzz_*.py file into a standalone fuzzer binary in $OUT:
# build.sh
# Install the package under test.
pip3 install .
# Compile all fuzz targets found in $SRC.
for fuzzer in $(find $SRC -name 'fuzz_*.py'); do
  compile_python_fuzzer "$fuzzer"
done
compile_python_fuzzer handles linking against Atheris and libFuzzer and
produces an executable in $OUT named after the .py file.
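
As a rough post-build sanity check, the sketch below lists fuzz targets whose executable is missing from the output directory. It assumes, following the description above, that each executable carries the name of its .py file without the extension. The function name missing_fuzzers and its two directory arguments are hypothetical.

```python
from pathlib import Path


def missing_fuzzers(src_dir, out_dir):
    """Return names of fuzz_*.py targets with no matching file in out_dir.

    Assumes each built fuzzer is named after its source file's stem
    (fuzz_parse.py -> fuzz_parse). Illustrative helper only.
    """
    src, out = Path(src_dir), Path(out_dir)
    # Mirror the build.sh loop: look for fuzz_*.py anywhere below src_dir.
    expected = sorted({p.stem for p in src.rglob("fuzz_*.py")})
    return [name for name in expected if not (out / name).is_file()]
```

Inside the build container this could be called with the values of the SRC and OUT environment variables; an empty list means every discovered target has a corresponding file in the output directory.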
- Place seed files in $OUT/<fuzzer_name>_seed_corpus/ or zip them as
  $OUT/<fuzzer_name>_seed_corpus.zip.
- Provide a dictionary at $OUT/<fuzzer_name>.dict; it is especially valuable
  for text-format parsers (JSON, XML, YAML, CSV, etc.).
- Always instrument: use instrument_all() for simplicity or
  instrument_imports() for targeted instrumentation. Without instrumentation,
  coverage guidance is blind.
- Wrap calls into the target in try/except for the documented exception types
  it raises on bad input. Only unexpected exceptions and hard crashes are
  findings.
- Use FuzzedDataProvider for structured input rather than feeding raw bytes
  directly to APIs that expect text; most Python APIs work on strings, not
  bytes.
- Do not import inside TestOneInput: imports should happen at module level
  (inside the instrument_imports() block if used) so they are instrumented and
  not re-executed per iteration.
- Keep TestOneInput deterministic: no random, no datetime.now(), no
  os.urandom() inside the fuzz function.
- Pass enable_python_coverage=True to atheris.Setup for bytecode-level
  coverage tracking.

Python is memory-safe, so the focus is on:
- Uncaught exceptions: ValueError, RecursionError, MemoryError,
  UnicodeDecodeError, and any exception the library should have caught and
  converted to a clean error.
- AssertionError: internal invariant violations triggered by crafted input.

Build, check, and run the fuzzers with the OSS-Fuzz helper script:
python3 infra/helper.py build_fuzzers <project>
python3 infra/helper.py check_build <project>
python3 infra/helper.py run_fuzzer <project> <fuzzer_name> -- -max_total_time=30
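
The distinction drawn above between expected exceptions and findings can be sketched as follows. The standard library's json module stands in for the package under test, and the Atheris wiring (atheris.Setup and atheris.Fuzz) is omitted so that TestOneInput can be called directly. Which exceptions count as documented is an assumption here and has to be checked against the real target's documentation.

```python
import json

# Exceptions the stand-in target documents for malformed input.
# json.JSONDecodeError and UnicodeDecodeError are subclasses of ValueError.
EXPECTED_EXCEPTIONS = (ValueError,)


def TestOneInput(data):
    try:
        json.loads(data)
    except EXPECTED_EXCEPTIONS:
        # Documented rejection of bad input: not a finding.
        return
    # Any other exception (RecursionError, AssertionError, ...) propagates
    # out of TestOneInput, where the fuzzer reports it as a crash.
```

Keeping the expected tuple as narrow as the documentation allows matters: a broad except Exception would hide exactly the failures the fuzzer is meant to surface.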
- Run the fuzz target directly (python3 fuzz_target.py) on a sample input to
  debug before building through OSS-Fuzz:
  python3 fuzz_target.py <seed_file>
  Atheris supports running in single-input mode outside libFuzzer.
- Install the package locally (pip3 install .) and iterate quickly before
  going through the Docker build.
- In the Dockerfile, switch RUN git clone to COPY to avoid network round-trips.
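
For quick local iteration, a small replay helper can feed saved inputs straight to TestOneInput without starting the fuzzing loop. This is a sketch under assumptions: the helper name replay_inputs is hypothetical, and the TestOneInput shown here is a trivial stand-in for the real target function.

```python
import traceback
from pathlib import Path


def replay_inputs(test_one_input, paths):
    """Call test_one_input on each file and return the paths that raised."""
    failing = []
    for path in map(Path, paths):
        try:
            test_one_input(path.read_bytes())
        except Exception:
            # Print the traceback and keep going so that one bad input
            # does not hide the rest.
            traceback.print_exc()
            failing.append(path)
    return failing


def TestOneInput(data):
    # Stand-in target: rejects any input that starts with a NUL byte.
    if data.startswith(b"\x00"):
        raise RuntimeError("unexpected NUL prefix")
```

Passing the real TestOneInput and a list of seed or crash files returns the subset that still raises, which makes it easy to confirm a fix locally before rebuilding through OSS-Fuzz.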