Skip to content

Let regional land shares accept per-region land-to-property ratios (#357)#358

Closed
vahid-ahmadi wants to merge 1 commit intomainfrom
feat/regional-land-to-property-357
Closed

Let regional land shares accept per-region land-to-property ratios (#357)#358
vahid-ahmadi wants to merge 1 commit intomainfrom
feat/regional-land-to-property-357

Conversation

@vahid-ahmadi
Copy link
Copy Markdown
Collaborator

Summary

Fixes the data side of #357. Regional household_land_value targets are currently set proportional to each region's property wealth (dwellings × avg_house_price), which bakes in a uniform national land-to-property ratio. That flattens the London land premium (~80% land share in reality) and overstates rural / commuter-belt areas (~45-55% land share). For LVT analysis by region this is the wrong direction.

Change

  • _compute_regional_shares, _compute_regional_targets, get_targets all gain an optional land_to_property_ratio: dict[str, float] argument. When supplied, each region's contribution to the national household-land total is weighted by its ratio before normalising.
  • Defaults preserve current behaviour exactly — ratios default to uniform 1.0, which cancels out.
  • Missing regions in the ratio map raise KeyError. Silent defaults would reintroduce the Regional household land values calibrated ∝ property wealth — need region-specific land-to-property ratios #357 bug in a new disguise.
  • Extra regions in the ratio map (e.g. Wales not in the current CSV) are tolerated.
  • Factored out a pure _regional_shares_from_frame helper so the arithmetic is unit-testable without filesystem access.

What's not in this PR

  • Real ratios. Sourcing per-region values — VOA dwelling value minus ONS reconstruction cost, or Savills residential land-value estimates — is the load-bearing modelling call and should land in its own PR after reviewer sign-off.
  • The PolicyEngine-UK formula change. wealth.land.value.aggregate_household_land_value / wealth.property_wealth in household_land_value.py is still a national scalar. Regional calibration targets alone cannot move household-level land values unless the formula consumes a per-region parameter. That work is tracked in a companion issue I'm filing in the sibling repo immediately after this PR.

The two changes have to land together for the end-to-end bias fix to take effect. Until the PE-UK side consumes a regional parameter, supplying non-uniform ratios here will just make the calibration fight itself.

Test plan

  • pytest policyengine_uk_data/tests/test_regional_land_shares.py — 11/11 pass locally
  • ruff format --check and ruff check — clean

Coverage (11 tests)

  • Default reproduces pre-Regional household land values calibrated ∝ property wealth — need region-specific land-to-property ratios #357 property-wealth shares
  • Uniform ratio is equivalent to default (sanity: scaling cancels)
  • Shares always sum to 1 (with or without ratios)
  • Hand-computed 20/80 arithmetic lock
  • London-heavier ratio measurably raises London's share and lowers the others
  • Missing region in ratio map raises KeyError
  • Extra regions in ratio map are tolerated
  • All-zero ratios raise rather than divide by zero
  • Zero ratio for one region zeros only that region, others renormalise
  • Smoke test against the shipped regional_land_values.csv: shares sum to 1, all 9 English regions present

Closes: nothing yet (follow-up to land real ratios + #357 PE-UK formula change before we mark #357 resolved).
Tracks: #357.

🤖 Generated with Claude Code

)

Regional `household_land_value` targets are currently set proportional
to each region's property wealth (`dwellings × avg_house_price`). That
bakes in a uniform national land-to-property ratio, which flattens the
London land premium (~80% land share in reality) and overstates rural
and commuter-belt areas (~45-55% land share). For LVT analysis by
region, the resulting calibration cannot pull the reweighted FRS toward
a higher London land intensity than the national average.

This PR is the data-side plumbing:

- `_compute_regional_shares`, `_compute_regional_targets` and
  `get_targets` gain an optional `land_to_property_ratio: dict[str, float]`
  argument. When supplied, each region's contribution to the national
  household-land total is weighted by its ratio before normalising.
- Defaults preserve current behaviour exactly (ratios default to
  uniform 1.0, which cancels out).
- Missing regions in the ratio map raise `KeyError` — silent defaults
  would reintroduce the #357 bug in a new disguise.
- Extra regions in the ratio map (e.g. Wales not in this CSV) are
  tolerated so the same ratio dict can drive multiple callers.
- Factored out a pure `_regional_shares_from_frame` helper so the
  arithmetic is unit-testable without filesystem access.

No ratios are shipped with this PR. Sourcing real per-region values
(VOA dwelling value minus ONS reconstruction cost, or Savills
residential land-value estimates) needs modelling-team sign-off and
is deliberately a separate follow-up PR.

A companion change is needed in `policyengine-uk` itself: today
`property_wealth_intensity` in `household_land_value.py` is a national
scalar. Regional calibration alone cannot move household-level land
values unless the formula consumes a per-region parameter. That work
is tracked as a follow-up issue in the sibling repo.

11 new unit tests cover: default reproduces pre-#357 shares, uniform
ratio is equivalent to default, shares always sum to 1, hand-computed
20/80 example, London-heavier ratio measurably raises London's share,
missing region in ratio map raises, extra regions tolerated, all-zero
ratios raise rather than divide by zero, zero for one region zeros
only that region, smoke test on the shipped CSV.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vahid-ahmadi
Copy link
Copy Markdown
Collaborator Author

Closing — the motivating framing in the PR description turned out to be based on a stale view of the PE-UK repo.

PE-UK's household_land_value.py on main already does a region-keyed lookup via np.select, reading per-region ratios from household.wealth.land.intensity.property_wealth_by_region (London 0.85, SOUTH_EAST 0.65, NORTH_EAST 0.42, NORTHERN_IRELAND 0.673, etc., sourced from MHCLG 2023 + UK HPI Dec 2025). That was merged in PolicyEngine/policyengine-uk#1537 (commit e2dff2ac). The companion PE-UK issue PolicyEngine/policyengine-uk#1626 was closed as completed for the same reason.

With the PE-UK side already parametrised, this PR's optional land_to_property_ratio plumbing has no live caller — every callsite uses the default (uniform 1.0), which produces identical output to main. Adding optionality without a consumer is just surface area.

What to do instead — keep #357 open but re-scope it to the narrow alignment question:

Data-side regional targets are still built ∝ property wealth, but PE-UK's formula now applies per-region ratios. Targets and formula may fight each other.

If that mismatch turns out to be real, the fix is probably a five-line change at the get_targets callsite that reads the same MHCLG values PE-UK already uses (either from the PE-UK parameter tree or the same source CSV) and passes them in directly — no generic dict-of-strings abstraction needed.

Also worth chasing independently of the upstream question: the pipeline.py:217 bug feeding total land_value (which includes corporate-via-shareholdings) into the regional household chart. That's distorting the regional pattern on its own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant