Let regional land shares accept per-region land-to-property ratios (#357) #358
vahid-ahmadi wants to merge 1 commit into main
Regional `household_land_value` targets are currently set proportional to each region's property wealth (`dwellings × avg_house_price`). That bakes in a uniform national land-to-property ratio, which flattens the London land premium (~80% land share in reality) and overstates rural and commuter-belt areas (~45-55% land share). For LVT analysis by region, the resulting calibration cannot pull the reweighted FRS toward a higher London land intensity than the national average.

This PR is the data-side plumbing:

- `_compute_regional_shares`, `_compute_regional_targets` and `get_targets` gain an optional `land_to_property_ratio: dict[str, float]` argument. When supplied, each region's contribution to the national household-land total is weighted by its ratio before normalising.
- Defaults preserve current behaviour exactly (ratios default to uniform 1.0, which cancels out).
- Missing regions in the ratio map raise `KeyError`. Silent defaults would reintroduce the #357 bug in a new disguise.
- Extra regions in the ratio map (e.g. Wales not in this CSV) are tolerated so the same ratio dict can drive multiple callers.
- Factored out a pure `_regional_shares_from_frame` helper so the arithmetic is unit-testable without filesystem access.

No ratios are shipped with this PR. Sourcing real per-region values (VOA dwelling value minus ONS reconstruction cost, or Savills residential land-value estimates) needs modelling-team sign-off and is deliberately a separate follow-up PR.

A companion change is needed in `policyengine-uk` itself: today `property_wealth_intensity` in `household_land_value.py` is a national scalar. Regional calibration alone cannot move household-level land values unless the formula consumes a per-region parameter. That work is tracked as a follow-up issue in the sibling repo.
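The ratio weighting described above can be sketched in a few lines. This is a minimal illustration, not the PR's code: `regional_shares` is a hypothetical stand-in that works on plain dicts rather than a data frame, the region keys and wealth figures are made up, and raising `ValueError` when every weighted contribution is zero is an assumption (the PR only states that this case raises).

```python
from typing import Mapping, Optional


def regional_shares(
    property_wealth: Mapping[str, float],
    land_to_property_ratio: Optional[Mapping[str, float]] = None,
) -> dict[str, float]:
    """Return each region's share of the national household-land total."""
    if land_to_property_ratio is None:
        # Uniform 1.0 cancels out in the normalisation below.
        land_to_property_ratio = {region: 1.0 for region in property_wealth}

    # Fail loudly if a region has no ratio; a silent default would hide the gap.
    missing = sorted(set(property_wealth) - set(land_to_property_ratio))
    if missing:
        raise KeyError(f"No land-to-property ratio for regions: {missing}")

    # Extra keys in the ratio map are never looked up, so they are tolerated.
    weighted = {
        region: wealth * land_to_property_ratio[region]
        for region, wealth in property_wealth.items()
    }
    total = sum(weighted.values())
    if total <= 0:
        # Assumed exception type for the all-zero case.
        raise ValueError("Weighted land total is zero; cannot normalise")
    return {region: value / total for region, value in weighted.items()}


# Made-up property wealth (dwellings x average price), arbitrary units.
wealth = {"LONDON": 300.0, "NORTH_EAST": 100.0}

default = regional_shares(wealth)
weighted = regional_shares(
    wealth, {"LONDON": 0.8, "NORTH_EAST": 0.5, "WALES": 0.5}
)
print(default["LONDON"], weighted["LONDON"])
```

With these toy numbers London's share rises from 0.75 to 240/290 (about 0.83), because its weighted contribution is 300 × 0.8 = 240 against 100 × 0.5 = 50 for the other region.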
11 new unit tests cover: default reproduces pre-#357 shares, uniform ratio is equivalent to default, shares always sum to 1, hand-computed 20/80 example, London-heavier ratio measurably raises London's share, missing region in ratio map raises, extra regions tolerated, all-zero ratios raise rather than divide by zero, zero for one region zeros only that region, smoke test on the shipped CSV.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
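Several of the properties listed above can be checked against a toy two-region table. The sketch below is not the PR's test file: `shares` is a hypothetical stand-in for the pure helper, the regions `A` and `B` and their wealth figures are made up, and the 20/80 case is only one plausible reading of the hand-computed example (equal property wealth with ratios 0.2 and 0.8).

```python
import math


def shares(wealth, ratios=None):
    """Hypothetical stand-in: ratio-weighted wealth, normalised to sum to 1."""
    if ratios is None:
        ratios = {region: 1.0 for region in wealth}
    # ratios[region] raises KeyError when a region is missing from the map.
    weighted = {region: value * ratios[region] for region, value in wealth.items()}
    total = sum(weighted.values())
    if total <= 0:
        # Assumed exception type for the all-zero case.
        raise ValueError("all weighted contributions are zero")
    return {region: value / total for region, value in weighted.items()}


WEALTH = {"A": 100.0, "B": 100.0}  # made-up, equal property wealth

# A uniform ratio is equivalent to the default.
uniform = shares(WEALTH, {"A": 0.6, "B": 0.6})
assert all(math.isclose(uniform[r], shares(WEALTH)[r]) for r in WEALTH)

# Equal wealth with ratios 0.2 and 0.8 gives a 20/80 split.
split = shares(WEALTH, {"A": 0.2, "B": 0.8})
assert math.isclose(split["A"], 0.2) and math.isclose(split["B"], 0.8)

# Shares sum to 1.
assert math.isclose(sum(split.values()), 1.0)

# A zero ratio for one region zeros only that region.
zeroed = shares(WEALTH, {"A": 0.0, "B": 0.7})
assert zeroed == {"A": 0.0, "B": 1.0}
```

Keeping the arithmetic in a pure function is what lets checks like these run without touching the filesystem.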
Closing: the motivating framing in the PR description turned out to be based on a stale view of the PE-UK repo. PE-UK's

With the PE-UK side already parametrised, this PR's optional

What to do instead: keep #357 open but re-scope it to the narrow alignment question:

If that mismatch turns out to be real, the fix is probably a five-line change at the

Also worth chasing independently of the upstream question: the
Summary

Fixes the data side of #357. Regional `household_land_value` targets are currently set proportional to each region's property wealth (`dwellings × avg_house_price`), which bakes in a uniform national land-to-property ratio. That flattens the London land premium (~80% land share in reality) and overstates rural / commuter-belt areas (~45-55% land share). For LVT analysis by region this biases the targets in the wrong direction.

Change

- `_compute_regional_shares`, `_compute_regional_targets` and `get_targets` all gain an optional `land_to_property_ratio: dict[str, float]` argument. When supplied, each region's contribution to the national household-land total is weighted by its ratio before normalising.
- Missing regions in the ratio map raise `KeyError`. Silent defaults would reintroduce the #357 bug in a new disguise.
- Factored out a pure `_regional_shares_from_frame` helper so the arithmetic is unit-testable without filesystem access.

What's not in this PR

- `wealth.land.value.aggregate_household_land_value / wealth.property_wealth` in `household_land_value.py` is still a national scalar. Regional calibration targets alone cannot move household-level land values unless the formula consumes a per-region parameter. That work is tracked in a companion issue I'm filing in the sibling repo immediately after this PR.

The two changes have to land together for the end-to-end bias fix to take effect. Until the PE-UK side consumes a regional parameter, supplying non-uniform ratios here will just make the calibration fight itself.
Test plan

- `pytest policyengine_uk_data/tests/test_regional_land_shares.py`: 11/11 pass locally
- `ruff format --check` and `ruff check`: clean

Coverage (11 tests)

- Missing region in ratio map raises `KeyError`
- Smoke test on the shipped `regional_land_values.csv`: shares sum to 1, all 9 English regions present

Closes: nothing yet (follow-up to land real ratios + #357 PE-UK formula change before we mark #357 resolved).
Tracks: #357.
🤖 Generated with Claude Code