Let regional land shares accept per-region land-to-property ratios (#357) by vahid-ahmadi · Pull Request #358 · PolicyEngine/policyengine-uk-data

vahid-ahmadi · 2026-04-17T13:57:07Z

Summary

Fixes the data side of #357. Regional household_land_value targets are currently set proportional to each region's property wealth (dwellings × avg_house_price), which bakes in a uniform national land-to-property ratio. That flattens the London land premium (~80% land share in reality) and overstates rural / commuter-belt areas (~45-55% land share). For LVT analysis by region this is the wrong direction.

Change

_compute_regional_shares, _compute_regional_targets, get_targets all gain an optional land_to_property_ratio: dict[str, float] argument. When supplied, each region's contribution to the national household-land total is weighted by its ratio before normalising.
Defaults preserve current behaviour exactly — ratios default to uniform 1.0, which cancels out.
Missing regions in the ratio map raise KeyError. Silent defaults would reintroduce the Regional household land values calibrated ∝ property wealth — need region-specific land-to-property ratios #357 bug in a new disguise.
Extra regions in the ratio map (e.g. Wales not in the current CSV) are tolerated.
Factored out a pure _regional_shares_from_frame helper so the arithmetic is unit-testable without filesystem access.

What's not in this PR

Real ratios. Sourcing per-region values — VOA dwelling value minus ONS reconstruction cost, or Savills residential land-value estimates — is the load-bearing modelling call and should land in its own PR after reviewer sign-off.
The PolicyEngine-UK formula change. wealth.land.value.aggregate_household_land_value / wealth.property_wealth in household_land_value.py is still a national scalar. Regional calibration targets alone cannot move household-level land values unless the formula consumes a per-region parameter. That work is tracked in a companion issue I'm filing in the sibling repo immediately after this PR.

The two changes have to land together for the end-to-end bias fix to take effect. Until the PE-UK side consumes a regional parameter, supplying non-uniform ratios here will just make the calibration fight itself.

Test plan

pytest policyengine_uk_data/tests/test_regional_land_shares.py — 11/11 pass locally
ruff format --check and ruff check — clean

Coverage (11 tests)

Default reproduces pre-Regional household land values calibrated ∝ property wealth — need region-specific land-to-property ratios #357 property-wealth shares
Uniform ratio is equivalent to default (sanity: scaling cancels)
Shares always sum to 1 (with or without ratios)
Hand-computed 20/80 arithmetic lock
London-heavier ratio measurably raises London's share and lowers the others
Missing region in ratio map raises KeyError
Extra regions in ratio map are tolerated
All-zero ratios raise rather than divide by zero
Zero ratio for one region zeros only that region, others renormalise
Smoke test against the shipped regional_land_values.csv: shares sum to 1, all 9 English regions present

Closes: nothing yet (follow-up to land real ratios + #357 PE-UK formula change before we mark #357 resolved).
Tracks: #357.

🤖 Generated with Claude Code

) Regional `household_land_value` targets are currently set proportional to each region's property wealth (`dwellings × avg_house_price`). That bakes in a uniform national land-to-property ratio, which flattens the London land premium (~80% land share in reality) and overstates rural and commuter-belt areas (~45-55% land share). For LVT analysis by region, the resulting calibration cannot pull the reweighted FRS toward a higher London land intensity than the national average. This PR is the data-side plumbing: - `_compute_regional_shares`, `_compute_regional_targets` and `get_targets` gain an optional `land_to_property_ratio: dict[str, float]` argument. When supplied, each region's contribution to the national household-land total is weighted by its ratio before normalising. - Defaults preserve current behaviour exactly (ratios default to uniform 1.0, which cancels out). - Missing regions in the ratio map raise `KeyError` — silent defaults would reintroduce the #357 bug in a new disguise. - Extra regions in the ratio map (e.g. Wales not in this CSV) are tolerated so the same ratio dict can drive multiple callers. - Factored out a pure `_regional_shares_from_frame` helper so the arithmetic is unit-testable without filesystem access. No ratios are shipped with this PR. Sourcing real per-region values (VOA dwelling value minus ONS reconstruction cost, or Savills residential land-value estimates) needs modelling-team sign-off and is deliberately a separate follow-up PR. A companion change is needed in `policyengine-uk` itself: today `property_wealth_intensity` in `household_land_value.py` is a national scalar. Regional calibration alone cannot move household-level land values unless the formula consumes a per-region parameter. That work is tracked as a follow-up issue in the sibling repo. 11 new unit tests cover: default reproduces pre-#357 shares, uniform ratio is equivalent to default, shares always sum to 1, hand-computed 20/80 example, London-heavier ratio measurably raises London's share, missing region in ratio map raises, extra regions tolerated, all-zero ratios raise rather than divide by zero, zero for one region zeros only that region, smoke test on the shipped CSV. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vahid-ahmadi · 2026-04-17T14:06:23Z

Closing — the motivating framing in the PR description turned out to be based on a stale view of the PE-UK repo.

PE-UK's household_land_value.py on main already does a region-keyed lookup via np.select, reading per-region ratios from household.wealth.land.intensity.property_wealth_by_region (London 0.85, SOUTH_EAST 0.65, NORTH_EAST 0.42, NORTHERN_IRELAND 0.673, etc., sourced from MHCLG 2023 + UK HPI Dec 2025). That was merged in PolicyEngine/policyengine-uk#1537 (commit e2dff2ac). The companion PE-UK issue PolicyEngine/policyengine-uk#1626 was closed as completed for the same reason.

With the PE-UK side already parametrised, this PR's optional land_to_property_ratio plumbing has no live caller — every callsite uses the default (uniform 1.0), which produces identical output to main. Adding optionality without a consumer is just surface area.

What to do instead — keep #357 open but re-scope it to the narrow alignment question:

Data-side regional targets are still built ∝ property wealth, but PE-UK's formula now applies per-region ratios. Targets and formula may fight each other.

If that mismatch turns out to be real, the fix is probably a five-line change at the get_targets callsite that reads the same MHCLG values PE-UK already uses (either from the PE-UK parameter tree or the same source CSV) and passes them in directly — no generic dict-of-strings abstraction needed.

Also worth chasing independently of the upstream question: the pipeline.py:217 bug feeding total land_value (which includes corporate-via-shareholdings) into the regional household chart. That's distorting the regional pattern on its own.

This was referenced Apr 17, 2026

Make property_wealth_intensity region-aware (consumes policyengine-uk-data#358) PolicyEngine/policyengine-uk#1626

Closed

Regional household land values calibrated ∝ property wealth — need region-specific land-to-property ratios #357

Open

vahid-ahmadi self-assigned this Apr 17, 2026

vahid-ahmadi closed this Apr 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Let regional land shares accept per-region land-to-property ratios (#357)#358

Let regional land shares accept per-region land-to-property ratios (#357)#358
vahid-ahmadi wants to merge 1 commit intomainfrom
feat/regional-land-to-property-357

vahid-ahmadi commented Apr 17, 2026

Uh oh!

vahid-ahmadi commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vahid-ahmadi commented Apr 17, 2026

Summary

Change

What's not in this PR

Test plan

Coverage (11 tests)

Uh oh!

vahid-ahmadi commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant