---
layout: default
title: Integrating a Python project
parent: Setting up a new project
grand_parent: Getting started
nav_order: 3
permalink: /getting-started/new-project-guide/python-lang/
---

# Integrating a Python project
{: .no_toc}

- TOC
{:toc}
---


The process of integrating a project written in Python with OSS-Fuzz is very
similar to the general
[Setting up a new project]({{ site.baseurl }}/getting-started/new-project-guide/)
process. The key specifics of integrating a Python project are outlined below.

## Atheris

Python fuzzing in OSS-Fuzz depends on
[Atheris](https://github.com/google/atheris). Fuzzers depend on the
`atheris` package, and the dependencies are pre-installed on the OSS-Fuzz base
Docker images.

## Project files

### Example project

We recommend viewing [ujson](https://github.com/google/oss-fuzz/tree/master/projects/ujson) as an
example of a simple Python fuzzing project, with both plain-Atheris and
Atheris + Hypothesis harnesses.

### project.yaml

The `language` attribute must be specified.

```yaml
language: python
```

The only supported fuzzing engine is libFuzzer (`libfuzzer`). The supported
sanitizers are AddressSanitizer (`address`) and
UndefinedBehaviorSanitizer (`undefined`). These must be explicitly specified.

```yaml
fuzzing_engines:
  - libfuzzer
sanitizers:
  - address
  - undefined
```

### Dockerfile

Because most dependencies are already pre-installed on the images, no
significant changes are needed in the Dockerfile for Python fuzzing projects.
You should simply clone the project, set a `WORKDIR`, and copy any necessary
files, or install any project-specific dependencies here as you normally would.

### build.sh

For Python projects, `build.sh` needs more significant modifications
than for other projects.
The following is an annotated example build script,
explaining why each step is necessary and when it can be omitted.

```sh
# Build and install project (using current CFLAGS, CXXFLAGS). This is required
# for projects with C extensions so that they're built with the proper flags.
pip3 install .

# Build fuzzers into $OUT. These could be detected in other ways.
for fuzzer in $(find $SRC -name '*_fuzzer.py'); do
  fuzzer_basename=$(basename -s .py $fuzzer)
  fuzzer_package=${fuzzer_basename}.pkg

  # To avoid issues with Python version conflicts, or changes in environment
  # over time on the OSS-Fuzz bots, we use pyinstaller to create a standalone
  # package. Though not necessarily required for reproducing issues, this is
  # required to keep fuzzers working properly in OSS-Fuzz.
  pyinstaller --distpath $OUT --onefile --name $fuzzer_package $fuzzer

  # Create execution wrapper. Atheris requires that certain libraries are
  # preloaded, so this is also done here to ensure compatibility and simplify
  # test case reproduction. Since this helper script is what OSS-Fuzz will
  # actually execute, it is also always required.
  # NOTE: If you are fuzzing Python-only code and do not have native C/C++
  # extensions, then remove the LD_PRELOAD line below, as preloading the
  # sanitizer library is not required and can lead to unexpected startup crashes.
  echo "#!/bin/sh
# LLVMFuzzerTestOneInput for fuzzer detection.
this_dir=\$(dirname \"\$0\")
LD_PRELOAD=\$this_dir/sanitizer_with_fuzzer.so \
ASAN_OPTIONS=\$ASAN_OPTIONS:symbolize=1:external_symbolizer_path=\$this_dir/llvm-symbolizer:detect_leaks=0 \
\$this_dir/$fuzzer_package \$@" > $OUT/$fuzzer_basename
  chmod u+x $OUT/$fuzzer_basename
done
```

## Hypothesis

Using [Hypothesis](https://hypothesis.readthedocs.io/), the Python library for
[property-based testing](https://hypothesis.works/articles/what-is-property-based-testing/),
makes it easy to generate complex inputs, whether in traditional test suites
or [by using test functions as fuzz harnesses](https://hypothesis.readthedocs.io/en/latest/details.html#use-with-external-fuzzers).

> Property based testing is the construction of tests such that, when these tests are fuzzed,
> failures in the test reveal problems with the system under test that could not have been
> revealed by direct fuzzing of that system.

We recommend using the [`hypothesis write`](https://hypothesis.readthedocs.io/en/latest/ghostwriter.html)
command to generate a starter fuzz harness. This "ghostwritten" code may be usable as-is,
or provide a useful template for writing more specific tests.

See the [core "strategies"](https://hypothesis.readthedocs.io/en/latest/data.html)
for arbitrary data, the [NumPy + Pandas support](https://hypothesis.readthedocs.io/en/latest/numpy.html),
or the [variety of third-party extensions](https://hypothesis.readthedocs.io/en/latest/strategies.html)
supporting everything from protobufs and jsonschemas to networkx graphs, geojson,
and valid Python source code.
Hypothesis' integrated test-case reduction also makes it trivial to report a canonical minimal
example for each distinct failure discovered while fuzzing: just run the test function!
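
As a rough illustration of the property-based style described above, the following sketch checks a round-trip property against the standard-library `json` module, which stands in here for the project under test. The names `check_json_roundtrip`, `json_atoms`, `json_values`, and `test_json_roundtrip` are made up for this example, and the `try`/`except ImportError` guard is an assumption added only so the property can still be called by hand where Hypothesis is not installed.

```python
import json


def check_json_roundtrip(obj):
    """Property: decoding an encoded value yields the original value."""
    assert json.loads(json.dumps(obj)) == obj


try:
    from hypothesis import given, strategies as st
except ImportError:
    # Hypothesis is missing: only the plain property function is available.
    test_json_roundtrip = None
else:
    # Scalar JSON values. NaN is excluded because it never compares equal to
    # itself; infinities are excluded because they are not standard JSON.
    json_atoms = st.one_of(
        st.none(),
        st.booleans(),
        st.integers(min_value=-(2**63), max_value=2**63 - 1),
        st.floats(allow_nan=False, allow_infinity=False),
        st.text(),
    )

    # Nest the scalars inside lists and string-keyed dictionaries.
    json_values = st.recursive(
        json_atoms,
        lambda inner: st.lists(inner) | st.dictionaries(st.text(), inner),
        max_leaves=20,
    )

    @given(json_values)
    def test_json_roundtrip(obj):
        check_json_roundtrip(obj)
```

Running `pytest` on a file containing this code exercises `test_json_roundtrip` against generated inputs. The same function also exposes `test_json_roundtrip.hypothesis.fuzz_one_input`, which is the hook the external-fuzzer integration linked above hands to a fuzzing engine.
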

To use Hypothesis in OSS-Fuzz, install it in your Dockerfile with

```shell
RUN pip3 install hypothesis
```

See [the `ujson` structured fuzzer](https://github.com/google/oss-fuzz/blob/master/projects/ujson/hypothesis_structured_fuzzer.py)
for an example "polyglot" which can either be run with `pytest` as a standard test function,
or run with OSS-Fuzz as a fuzz harness.
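
For comparison with the Hypothesis-based harness, a plain-Atheris fuzzer of the kind the `*_fuzzer.py` pattern in `build.sh` would pick up might look like the sketch below. It targets the standard-library `json` module purely as a stand-in for the project's own code. The file name `json_fuzzer.py` is hypothetical, and importing `atheris` inside `main()` is an assumption made for this sketch so that `TestOneInput` can be called without Atheris installed.

```python
# json_fuzzer.py (hypothetical file name matching the '*_fuzzer.py' pattern)
import json
import sys


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: called once per input generated by the engine."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # Not valid UTF-8, so there is nothing to parse.
    try:
        json.loads(text)
    except ValueError:
        # Rejecting malformed input is expected behaviour, not a bug.
        # (json.JSONDecodeError is a subclass of ValueError.)
        return


def main() -> None:
    # Deferred import: an assumption of this sketch, so that the harness
    # function can be imported and called where Atheris is unavailable.
    import atheris

    atheris.instrument_all()  # Instrument already-loaded Python code for coverage.
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()


# A real harness file would end with:
#     if __name__ == "__main__":
#         main()
```

A Hypothesis-based harness has the same shape, except that the callable handed to `atheris.Setup` is the test function's `hypothesis.fuzz_one_input` attribute rather than a hand-written `TestOneInput`.
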