simdjsonpy parses untrusted input. We tried.
tests/test_fuzz_properties.py uses Hypothesis to generate JSON values, drive them through the parser, and assert that the result round-trips against the standard library:
uv sync --group fuzz
uv run python -m unittest tests.test_fuzz_properties -v
Three properties are checked:
- DOM round-trip matches json.loads
- On-Demand round-trip matches json.loads
- Streaming APIs (parse_many/iterate_many) match the per-document output of json.loads
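The round-trip idea can be sketched without any extra dependencies. The snippet below is only an illustration under stated assumptions: it uses the standard library's random module in place of Hypothesis, `random_json_value` and `check_round_trip` are hypothetical helper names, and `parse` is a placeholder for whichever simdjsonpy entry point is under test (json.loads stands in for it in the usage line).

```python
import json
import random


def random_json_value(rng, depth=0):
    """Build a random JSON-compatible value (stand-in for a Hypothesis strategy)."""
    scalars = [
        lambda: None,
        lambda: rng.choice([True, False]),
        lambda: rng.randint(-2**53, 2**53),
        lambda: rng.uniform(-1e6, 1e6),
        lambda: "".join(rng.choice('abc \u00e9"\\\n') for _ in range(rng.randint(0, 8))),
    ]
    # Stop nesting at depth 3, or about half the time before that.
    if depth >= 3 or rng.random() < 0.5:
        return rng.choice(scalars)()
    if rng.random() < 0.5:
        return [random_json_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {f"k{i}": random_json_value(rng, depth + 1) for i in range(rng.randint(0, 4))}


def check_round_trip(parse, rounds=200, seed=0):
    """Assert that parse(payload) agrees with json.loads(payload) on random documents."""
    rng = random.Random(seed)
    for _ in range(rounds):
        payload = json.dumps(random_json_value(rng)).encode("utf-8")
        assert parse(payload) == json.loads(payload), payload
    return rounds


# Placeholder usage: swap json.loads for the parser entry point being tested.
check_round_trip(json.loads, rounds=50)
```

Hypothesis adds shrinking and a persistent example database on top of this, which is why the real test suite uses it rather than a hand-rolled generator.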
There is also a "raw bytes never leak C++ exceptions" property: arbitrary bytes are fed to every public entry point, and the test asserts that nothing escapes as a non-Python exception.
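A minimal sketch of that last property follows. It is not the project's test: `check_no_foreign_exceptions` is a hypothetical helper, the `ALLOWED` tuple is an assumed set of acceptable exception types, and json.loads stands in for the real entry points. With a C extension, an untranslated C++ exception typically terminates the interpreter rather than surfacing as something catchable, so surviving the loop is itself part of the check.

```python
import json
import random

# Assumed set of exception types an entry point may raise on bad input.
ALLOWED = (ValueError, TypeError, RecursionError)


def check_no_foreign_exceptions(entry_points, rounds=500, seed=0):
    """Feed random byte strings to each callable; only ALLOWED errors may surface."""
    rng = random.Random(seed)
    for _ in range(rounds):
        payload = bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 64)))
        for fn in entry_points:
            try:
                fn(payload)
            except ALLOWED:
                pass  # a clean Python-level rejection is the expected outcome
            # Any other exception type propagates and fails the check.
    return rounds


# Placeholder usage: list every public entry point here instead of json.loads.
check_no_foreign_exceptions([json.loads], rounds=200)
```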
fuzz/run_api_stress.py seeds from the simdjson example corpus, mutates each seed, and calls every public API with the mutated payload across multiple worker threads.
uv run python fuzz/run_api_stress.py --rounds 5000 --workers 4
Use this together with a sanitizer build to catch memory and threading bugs.
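The seed, mutate, and call-from-many-threads loop can be sketched roughly as below. This is an assumption-laden illustration, not the contents of fuzz/run_api_stress.py: `SEEDS` is a placeholder corpus, `mutate`, `worker`, and `run_stress` are hypothetical names, and json.loads stands in for the public APIs.

```python
import json
import random
from concurrent.futures import ThreadPoolExecutor

# Placeholder corpus; the real runner seeds from the simdjson example corpus.
SEEDS = [b'{"a": [1, 2, 3]}', b"[true, false, null]", b'"text"']


def mutate(seed, rng):
    """Flip one bit, insert one byte, or delete one byte at a random position."""
    data = bytearray(seed)
    op = rng.choice(("flip", "insert", "delete"))
    pos = rng.randrange(len(data)) if data else 0
    if op == "flip" and data:
        data[pos] ^= 1 << rng.randrange(8)
    elif op == "insert":
        data.insert(pos, rng.randrange(256))
    elif op == "delete" and data:
        del data[pos]
    return bytes(data)


def worker(worker_id, rounds, apis):
    """Run `rounds` mutated payloads through every API; count clean rejections."""
    rng = random.Random(worker_id)  # per-worker RNG keeps runs reproducible
    rejected = 0
    for _ in range(rounds):
        payload = mutate(rng.choice(SEEDS), rng)
        for api in apis:
            try:
                api(payload)
            except ValueError:
                rejected += 1
    return rejected


def run_stress(apis, rounds=1000, workers=4):
    """Drive the worker loop from several threads and collect per-worker counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker, i, rounds, apis) for i in range(workers)]
        return [f.result() for f in futures]


# Placeholder usage with a stand-in API list.
print(run_stress([json.loads], rounds=100, workers=2))
```

The loop itself asserts very little; its value comes from running it under a sanitizer, where any memory or threading error inside the extension is reported directly.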
The CMake project exposes three sanitizer toggles. Enable one at a time:
# AddressSanitizer
uv run python -m pip install -e . -Ccmake.define.SIMDJSONPY_ENABLE_ASAN=ON
# UndefinedBehaviorSanitizer
uv run python -m pip install -e . -Ccmake.define.SIMDJSONPY_ENABLE_UBSAN=ON
# ThreadSanitizer
uv run python -m pip install -e . -Ccmake.define.SIMDJSONPY_ENABLE_TSAN=ON
Sanitizer builds disable LTO. Run the full test suite plus the stress runner under each:
uv run python -m unittest discover -s tests
uv run python fuzz/run_api_stress.py --rounds 5000 --workers 4
ThreadSanitizer exercises the per-instance locks under concurrent load.
For environments where AddressSanitizer is unavailable:
valgrind --tool=memcheck --error-exitcode=1 \
python fuzz/run_api_stress.py --rounds 200 --workers 1
- Use-after-free across stream advances. Mitigated by the generation/revision counter on each stream item; advancing the iterator marks earlier items invalid and a subsequent read raises ReferenceError.
- Buffer races on bytearray/memoryview inputs. Mitigated by going through PyObject_GetBuffer(... PyBUF_CONTIG_RO), which acquires a buffer lock that prevents concurrent resize.
- Document size DoS. Bounded by the parser's max_capacity (default SIMDJSON_MAXSIZE_BYTES, 4 GiB). Override per-parser if your environment needs a tighter cap.
- Nesting-depth DoS. Bounded by max_depth (default DEFAULT_MAX_DEPTH). Configurable per-parser.
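Two of the mitigations above can be illustrated in pure Python. The first part is a toy model of a generation counter; `Stream` and `StreamItem` are hypothetical classes written for this sketch, not simdjsonpy types. The second part shows the standard CPython behaviour the buffer mitigation relies on: a bytearray with an outstanding buffer export refuses to be resized.

```python
class Stream:
    """Toy document stream that invalidates earlier items on every advance."""

    def __init__(self, docs):
        self._docs = list(docs)
        self._index = -1
        self.generation = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index + 1 >= len(self._docs):
            raise StopIteration
        self._index += 1
        self.generation += 1  # every item handed out earlier is now stale
        return StreamItem(self, self.generation, self._docs[self._index])


class StreamItem:
    """Item that remembers the generation it was created in."""

    def __init__(self, stream, generation, value):
        self._stream = stream
        self._generation = generation
        self._value = value

    def get(self):
        if self._generation != self._stream.generation:
            raise ReferenceError("stream advanced; item is no longer valid")
        return self._value


stream = Stream(["doc0", "doc1"])
first = next(stream)
assert first.get() == "doc0"
second = next(stream)  # advancing invalidates `first`
stale = False
try:
    first.get()
except ReferenceError:
    stale = True

# Buffer export: while a view is held, the bytearray cannot change size.
buf = bytearray(b'{"a": 1}')
view = memoryview(buf)
resize_blocked = False
try:
    buf.extend(b" ")
except BufferError:
    resize_blocked = True
view.release()
buf.extend(b" ")  # allowed again once the export is released
```

In the toy model the stale read fails with ReferenceError instead of touching memory that the stream has already reused, which is the behaviour the first bullet describes.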