Anchor UC/PC/CB takeup flags to FRS-reported receipt #359
Merged
This was referenced Apr 17, 2026
FRS respondents who report positive receipt of a benefit are by construction take-up=True. The prior code assigned `would_claim_uc`, `would_claim_pc`, and `would_claim_child_benefit` by pure random draw against the aggregate takeup rate, ignoring that information. A respondent reporting UC receipt could therefore be randomly assigned `would_claim_uc = False`, producing calibration noise.

Ports `assign_takeup_with_reported_anchors` from `policyengine-us-data/utils/takeup.py`, pared down to the single-group case (UK doesn't need the US's state-keyed grouping). Reporters are forced to True; non-reporters are filled probabilistically to hit the aggregate target rate across the full population, so the overall takeup share still matches the target.

Applied to the three benefit-unit-level flags where FRS has a matching reported column (`universal_credit_reported`, `pension_credit_reported`, `child_benefit_reported`). Other takeup flags (TFC, childcare schemes, SCP) have no FRS-reported counterpart and keep pure-random behaviour.

5 unit tests cover the new helper: pure-random fallback, reporters always True, overall rate close to target, handling when reporters already exceed target, and mask-length validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
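The anchoring logic described in the commit message can be sketched roughly as follows. This is an illustrative, unweighted version written for this note, not the ported helper itself: the function name `anchored_takeup_sketch`, its signature, and the example sizes and rates are all assumptions.

```python
import numpy as np


def anchored_takeup_sketch(n, target_rate, rng, reported=None):
    """Hypothetical single-group sketch: reporters are True, the rest are
    filled at random so the overall share lands near target_rate."""
    draws = rng.random(n)
    if reported is None:
        # Pure-random fallback for flags with no reported column.
        return draws < target_rate
    reported = np.asarray(reported, dtype=bool)
    if reported.shape != (n,):
        raise ValueError("reported mask must have length n")
    n_reporters = int(reported.sum())
    n_non_reporters = n - n_reporters
    # Claimants still needed among non-reporters to reach the target share
    # over the full population (zero if reporters already exceed it).
    needed = max(target_rate * n - n_reporters, 0.0)
    fill_rate = needed / n_non_reporters if n_non_reporters > 0 else 0.0
    return reported | (draws < fill_rate)


# Illustrative inputs (assumed, not taken from the FRS).
rng = np.random.default_rng(0)
reported = np.zeros(10_000, dtype=bool)
reported[:3_000] = True  # 30% report receipt
flags = anchored_takeup_sketch(10_000, 0.5, rng, reported)
```

With 30% reporters and a 50% target, non-reporters are filled at roughly 2/7, so every reporter is True and the overall share stays near the target. When reporters alone exceed the target, the fill rate drops to zero and the result equals the reported mask.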
de14a45 to a8c99ae
MaxGhenis added a commit that referenced this pull request on Apr 19, 2026
The weighted-UK-population drift that motivated #310 has already dropped from ~6.5% to ~1.6% on current main as a side-effect of the data-pipeline improvements landed yesterday (stage-2 QRF #362, TFC target refresh #363, reported-anchor takeup #359).

Tightens `test_population` tolerance from 7% to 3% to lock in that gain. Any future calibration change that regresses back toward the pre-April-2026 overshoot now trips CI instead of silently drifting.

Adds a new `test_population_fidelity.py` with four regression tests extracted from the #310 draft:

- weighted-total ONS match (3% tolerance)
- household-count sanity range (25-33 M)
- non-inflation guard (< 72 M)
- country-populations-sum-to-UK consistency

Does not include #310's loss-function change or Scotland target removal; those are independent proposals and should be evaluated on their own merits once the practical overshoot is resolved.

Co-authored-by: Vahid Ahmadi <va.vahidahmadi@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
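For readers who want a feel for what the four fidelity checks guard against, here is a rough sketch. It is not the contents of `test_population_fidelity.py`: the function name, argument names, the exact-sum tolerance, and all example figures are assumptions chosen for illustration.

```python
import math


def population_fidelity_problems(
    uk_population, ons_target, household_count, country_populations, tolerance=0.03
):
    """Hypothetical helper: return a list of failed fidelity checks."""
    problems = []
    # Weighted total should sit within the tolerance of the ONS target.
    if abs(uk_population / ons_target - 1) > tolerance:
        problems.append("weighted total outside ONS tolerance")
    # Household count should fall in the 25-33 M sanity range.
    if not 25e6 <= household_count <= 33e6:
        problems.append("household count outside 25-33M")
    # Non-inflation guard: population must stay below 72 M.
    if not uk_population < 72e6:
        problems.append("population at or above 72M")
    # Country populations should add up to the UK total.
    if not math.isclose(sum(country_populations.values()), uk_population, rel_tol=1e-6):
        problems.append("country populations do not sum to UK total")
    return problems


# Illustrative figures only (assumed, not ONS or build outputs).
countries = {"ENGLAND": 59e6, "SCOTLAND": 5.5e6, "WALES": 3.5e6, "NORTHERN_IRELAND": 2e6}
healthy = population_fidelity_problems(70e6, 69.5e6, 28e6, countries)
inflated = population_fidelity_problems(73e6, 69.5e6, 28e6, countries)
```

In this sketch `healthy` comes back empty, while the inflated total fails the ONS match, the 72 M guard, and the country-sum check at once.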
This was referenced Apr 19, 2026
MaxGhenis added a commit that referenced this pull request on Apr 19, 2026
* Tighten population tolerance and add fidelity tests

* Loosen population tolerance 3% -> 4% for stochastic calibration variance

  First CI run on this branch produced 71.8M (3.31% over target) where yesterday's main build produced 70.97M (1.58%). Stochastic dropout in the calibration optimiser (`dropout_weights(weights, 0.05)`) gives ~1-2 percentage point build-to-build variance on the population total. 4% keeps the regression gate well below the pre-April-2026 overshoot (~6.5%) while not flaking on normal stochastic variance.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vahid Ahmadi <va.vahidahmadi@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
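The effect of moving the gate from 3% to 4% can be illustrated with the overshoot percentages quoted in the commit message. The sketch below normalises the target to 100 so that each build's total is just its quoted overshoot; `TARGET`, `passes_gate`, and the labels are hypothetical names, and the real ONS target is deliberately not reproduced.

```python
def relative_overshoot(actual, target):
    """Signed relative deviation of a build's total from the target."""
    return (actual - target) / target


def passes_gate(actual, target, tolerance):
    """True when the build's total is within the relative tolerance."""
    return abs(relative_overshoot(actual, target)) <= tolerance


TARGET = 100.0  # normalised stand-in, not the ONS population figure

# Totals expressed as target plus the overshoot quoted in the commit message.
builds = {
    "main build (1.58% over)": 101.58,
    "first CI run (3.31% over)": 103.31,
    "pre-April-2026 (~6.5% over)": 106.5,
}

# For each build: (passes the 3% gate, passes the 4% gate).
results = {
    label: (passes_gate(total, TARGET, 0.03), passes_gate(total, TARGET, 0.04))
    for label, total in builds.items()
}
```

Under these assumptions the 3.31% run is the only one whose outcome changes: it fails at 3% and passes at 4%, while the ~6.5% overshoot still fails either gate.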
Summary
Previously, the enhanced-FRS pipeline assigned `would_claim_uc`, `would_claim_pc`, and `would_claim_child_benefit` by pure random draw against the aggregate takeup rate, ignoring that some respondents actually reported receiving these benefits in the FRS benefits table. A respondent reporting UC receipt could therefore be randomly assigned `would_claim_uc = False`, producing calibration noise.

Ports `assign_takeup_with_reported_anchors` from `policyengine-us-data/utils/takeup.py` (the SSI/SNAP pattern) and applies it to the three benefit-unit-level flags that have a matching FRS-reported column.

Pattern

- Reporters (`universal_credit_reported > 0`) → `would_claim_uc = True` with certainty.

Scope

Applied to:

- `would_claim_uc` ← `universal_credit_reported`
- `would_claim_pc` ← `pension_credit_reported`
- `would_claim_child_benefit` ← `child_benefit_reported`

Left as pure-random (no FRS-reported counterpart):

- `would_claim_marriage_allowance`, `would_claim_tfc`, childcare schemes, `would_claim_scp`, `child_benefit_opts_out`, TV ownership/evasion, first-time-buyer.

Test plan

- `uvx ruff format --check` clean
- `Test` job passes

Part of PolicyEngine/policyengine-uk#1621 item 2.