Skip to content

Fix SPI __main__ crash, parameterise marriage allowance, seed age RNG#349

Merged
MaxGhenis merged 2 commits intomainfrom
fix/spi-main-and-params
Apr 17, 2026
Merged

Fix SPI __main__ crash, parameterise marriage allowance, seed age RNG#349
MaxGhenis merged 2 commits intomainfrom
fix/spi-main-and-params

Conversation

@MaxGhenis
Copy link
Copy Markdown
Contributor

Summary

policyengine_uk_data/datasets/spi.py had four small but load-bearing bugs (bug-hunt finding U2):

  • The __main__ block called create_spi(path, year) with two positional args, but the signature required three — running the module directly crashed with TypeError: create_spi() missing 1 required positional argument.
  • Marriage allowance was hardcoded to 1_250 (2020-21), so builds for later fiscal years produced the wrong cap for every row where MAIND == 1.
  • np.random.rand was used unseeded, so age imputation drifted non-deterministically across builds.
  • Unknown GORCODE values silently fell through to SOUTH_EAST, quietly distorting regional income totals.

This PR:

  1. Makes output_file_path optional so two-arg calls and the existing __main__ entry point both work.
  2. Reads the Marriage Allowance cap from policyengine-uk parameters (personal_allowance × marriage_allowance.max, rounded up by rounding_increment — the same formula used in policyengine-uk's marriage_allowance variable) keyed on the fiscal year.
  3. Seeds age imputation via np.random.default_rng(seed) with seed=0 by default.
  4. Defaults unknown region fallback to "UNKNOWN"; legacy "SOUTH_EAST" behaviour is still accessible via unknown_region=....

Adds policyengine_uk_data/tests/test_spi_build.py covering each of the four fixes with a lightweight fake SPI .tab file.

Test plan

  • uv run pytest policyengine_uk_data/tests/test_spi_build.py passes.
  • Signature test catches any regression that removes the __main__-friendly default on output_file_path.
  • Deterministic-RNG test catches any regression that switches back to the global np.random state.
  • MA cap test asserts 2020-21 stays at £1,250 and later years diverge (HMRC-published £1,260 for 2025-26).

Finding U2 from the bug hunt.

@MaxGhenis
Copy link
Copy Markdown
Contributor Author

Self-review: APPROVE. See /tmp/bug-hunt/reviews/uk-data-reviews.md for detail. (GitHub blocks self-approve, so leaving as comment.)

@MaxGhenis MaxGhenis merged commit 03326a1 into main Apr 17, 2026
3 checks passed
@MaxGhenis MaxGhenis deleted the fix/spi-main-and-params branch April 17, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant