
Anchor UC/PC/CB takeup flags to FRS-reported receipt#359

Merged
MaxGhenis merged 2 commits into main from add-reported-takeup-anchors
Apr 18, 2026

Conversation

@MaxGhenis
Contributor

Summary

Previously, the enhanced-FRS pipeline assigned would_claim_uc, would_claim_pc, and would_claim_child_benefit by pure random draw against the aggregate takeup rate, ignoring that some respondents actually reported receiving these benefits in the FRS benefits table. A respondent reporting UC receipt could therefore be randomly assigned would_claim_uc = False, producing calibration noise.

Ports assign_takeup_with_reported_anchors from policyengine-us-data/utils/takeup.py (the SSI/SNAP pattern) and applies it to the three benefit-unit-level flags that have a matching FRS-reported column.

Pattern

  • Reporters (any adult in the benefit unit has universal_credit_reported > 0) → would_claim_uc = True with certainty.
  • Non-reporters → filled probabilistically to hit the aggregate target rate across the full population.
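The pattern above can be sketched as a small standalone function. This is not the ported helper itself; the name mirrors the description, but the body is a minimal sketch assuming a boolean-coercible reported-amount array and a NumPy random generator.

```python
import numpy as np


def assign_takeup_with_reported_anchors(
    reported: np.ndarray,
    target_rate: float,
    rng: np.random.Generator,
) -> np.ndarray:
    """Return boolean takeup flags anchored to reported receipt.

    Units with reported > 0 are forced to True; the remainder are
    filled at whatever probability makes the expected overall takeup
    share equal target_rate.
    """
    reporters = np.asarray(reported) > 0
    would_claim = reporters.copy()
    # How many extra claimants we need beyond the reporters.
    shortfall = target_rate * reporters.size - reporters.sum()
    non_reporters = ~reporters
    n_non = int(non_reporters.sum())
    if shortfall > 0 and n_non > 0:
        # If reporters alone already meet or exceed the target,
        # shortfall <= 0 and no non-reporter is flipped to True.
        p = min(shortfall / n_non, 1.0)
        would_claim[non_reporters] = rng.random(n_non) < p
    return would_claim
```

For example, with a 60% target and 20% of units reporting receipt, every reporter gets `True` and each non-reporter is drawn at p = 0.5, so the expected overall share still lands on the target.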

Scope

Applied to:

  • would_claim_uc ← universal_credit_reported
  • would_claim_pc ← pension_credit_reported
  • would_claim_child_benefit ← child_benefit_reported

Left as pure-random (no FRS-reported counterpart): would_claim_marriage_allowance, would_claim_tfc, childcare schemes, would_claim_scp, child_benefit_opts_out, TV ownership/evasion, first-time-buyer.

Test plan

  • 5 unit tests on the ported helper
  • uvx ruff format --check clean
  • CI Test job passes

Part of PolicyEngine/policyengine-uk#1621 item 2.

MaxGhenis and others added 2 commits April 18, 2026 07:41
FRS respondents who report positive receipt of a benefit are by
construction take-up=True. The prior code assigned `would_claim_uc`,
`would_claim_pc`, and `would_claim_child_benefit` by pure random draw
against the aggregate takeup rate, ignoring that information — which
meant a respondent reporting UC receipt could be randomly assigned
`would_claim_uc = False`, producing calibration noise.

Ports `assign_takeup_with_reported_anchors` from
`policyengine-us-data/utils/takeup.py`, pared down to the single-group
case (UK doesn't need the US's state-keyed grouping). Reporters are
forced to True; non-reporters are filled probabilistically to hit the
aggregate target rate across the full population, so the overall
takeup share still matches the target.

Applied to the three benefit-unit-level flags where FRS has a matching
reported column (`universal_credit_reported`, `pension_credit_reported`,
`child_benefit_reported`). Other takeup flags (TFC, childcare schemes,
SCP) have no FRS-reported counterpart and keep pure-random behaviour.

5 unit tests cover the new helper: pure-random fallback, reporters
always True, overall rate close to target, handling when reporters
already exceed target, and mask-length validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis force-pushed the add-reported-takeup-anchors branch from de14a45 to a8c99ae Compare April 18, 2026 11:42
@MaxGhenis MaxGhenis marked this pull request as ready for review April 18, 2026 12:08
@MaxGhenis MaxGhenis merged commit 5056ce7 into main Apr 18, 2026
3 checks passed
@MaxGhenis MaxGhenis deleted the add-reported-takeup-anchors branch April 18, 2026 12:08
MaxGhenis added a commit that referenced this pull request Apr 19, 2026
The weighted-UK-population drift that motivated #310 has already
dropped from ~6.5% to ~1.6% on current main as a side-effect of the
data-pipeline improvements landed yesterday (stage-2 QRF #362, TFC
target refresh #363, reported-anchor takeup #359).

Tightens `test_population` tolerance from 7% to 3% to lock in that
gain — any future calibration change that regresses back toward the
pre-April-2026 overshoot now trips CI instead of silently drifting.
Adds a new `test_population_fidelity.py` with four regression tests
extracted from the #310 draft:

- weighted-total ONS match (3 % tolerance)
- household-count sanity range (25-33 M)
- non-inflation guard (< 72 M)
- country-populations-sum-to-UK consistency

Does not include #310's loss-function change or Scotland target
removal; those are independent proposals and should be evaluated on
their own merits once the practical overshoot is resolved.

Co-authored-by: Vahid Ahmadi <va.vahidahmadi@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
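The tolerance gate described in the commit above reduces to a one-line relative-error check. A sketch, using illustrative population figures rather than the actual ONS targets:

```python
def within_tolerance(simulated: float, target: float, tol: float = 0.04) -> bool:
    """True if simulated is within tol (relative error) of target."""
    return abs(simulated - target) / target < tol


# Illustrative values: a ~3.3% overshoot passes the 4% gate,
# while the pre-April-2026 ~6.5% overshoot would fail it.
print(within_tolerance(71.8e6, 69.5e6))  # 3.31% over -> True
print(within_tolerance(74.0e6, 69.5e6))  # ~6.5% over -> False
```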
MaxGhenis added a commit that referenced this pull request Apr 19, 2026
* Tighten population tolerance and add fidelity tests


* Loosen population tolerance 3% -> 4% for stochastic calibration variance

First CI run on this branch produced 71.8M (3.31% over target) where
yesterday's main build produced 70.97M (1.58%). Stochastic dropout
in the calibration optimiser (`dropout_weights(weights, 0.05)`) gives
~1-2 percentage point build-to-build variance on the population total.

4% keeps the regression gate well below the pre-April-2026 overshoot
(~6.5%) while not flaking on normal stochastic variance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vahid Ahmadi <va.vahidahmadi@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>