hdrake · Feb 24, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -18,7 +18,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ['3.12', '3.13', '3.14']
+        python-version: ['3.11', '3.12', '3.13', '3.14']
 
     steps:
       - name: Cancel previous runs

diff --git a/.gitignore b/.gitignore
@@ -2,4 +2,4 @@ __pycache__
 .ipynb_checkpoints
 /data/*
 .pytest_cache
-.DS_Store
+.DS_Store
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,114 @@
+# Changelog
+
+## 0.7.0 — internals refactor (typed engine)
+
+This release replaces the recursive dict-walking engine with a typed expression
+tree (parse → evaluate). The convention/YAML format is **unchanged**; the
+in-memory representation, the engine, and the default output variable names are
+new. Numerical results are identical to the previous engine — verified by
+end-to-end equivalence tests on the example MOM6 grid (108 → 57 variables) and
+the ECCOv4r4 LLC90 grid (140 → 75 variables, 0 mismatches).
+
+### Quick migration
+
+Add `name_scheme="legacy"` to your `collect_budgets` call to keep the previous
+behavior exactly — historical variable names *and* the in-place filling of the
+recipe dict that `get_vars`/`aggregate` depend on:
+
+```python
+xbudget.collect_budgets(grid, xbudget_dict, name_scheme="legacy")
+```
+
+Everything downstream (old variable names, `get_vars`, `aggregate`) then works
+unchanged. Adopt the new scheme at your own pace.
+
+### Breaking changes
+
+1. **Simplified variable names (default `name_scheme="v1"`).** Derived
+   variables are now named by their term path with the `sum`/`product`/
+   `difference` operator infixes dropped, and the redundant "copy" duplicates
+   the old engine emitted are gone. One variable is produced per operation.
+
+   | Legacy name | New name |
+   |---|---|
+   | `heat_rhs` | `heat_rhs` *(unchanged)* |
+   | `heat_rhs_sum` | `heat_rhs` *(the copy/sum collapse into one)* |
+   | `heat_rhs_sum_diffusion` | `heat_rhs_diffusion` |
+   | `heat_rhs_sum_diffusion_sum_lateral_product` | `heat_rhs_diffusion_lateral` |
+   | `mass_rhs_sum_advection_sum_lateral_sum_zonal_convergence_product_zonal_divergence_difference` | `mass_rhs_advection_lateral_zonal_convergence_zonal_divergence` |
+
+   On the example MOM6 grid this reduces 108 variables to 57. The canonical
+   identity of each variable is also stored structurally in its
+   `xbudget_path` attribute (a list of term names), so you never need to parse
+   the flat name.
+
+2. **`collect_budgets` no longer mutates the recipe dict** (in `v1` mode). It
+   previously filled each node's `var` field in place; it now leaves
+   `xbudget_dict` untouched and returns the data object. Because the legacy
+   `get_vars`/`aggregate` helpers read those filled `var` fields, they only work
+   after a `name_scheme="legacy"` run (which still fills the dict). To query the
+   `v1` output, use the `records`/`alias_map` returned by `evaluate_budgets`
+   (below) and the `provenance` / `xbudget_path` attributes on each variable.
+
+3. **`collect_budgets` signature.** It gained a `name_scheme` keyword, and its
+   first parameter is now named `data` (a grid or dataset), matching its
+   long-standing behavior of accepting either.
+
+### New
+
+- `xbudget.parse_budgets(xbudget_dict)` → typed tree (`xbudget.nodes.Budget`),
+  the single schema-validating entry point; raises `xbudget.BudgetParseError`
+  with the offending path on malformed conventions.
+- `xbudget.evaluate_budgets(data, budgets)` → pure evaluator; returns
+  `(alias_map, records)` where `alias_map` maps every legacy name to its new
+  name and `records` maps each new variable to its `{path, op, ...}` metadata.
+- Each derived variable carries `xbudget_path` (structured identity),
+  `xbudget_op` (operation kind), and `provenance` (immediate inputs) attributes.
+- **ECCOv4r4 / LLC90 support in the typed engine.** The `reciprocal` and
+  `lateral_divergence` operations and a `difference` of a *computed sub-term*
+  (not just a raw variable) are all handled, so the native-grid ECCO mass/heat/
+  salt budgets evaluate under `name_scheme="v1"`. New `ECCOV4r4_native`
+  convention and example notebooks (`eccov4r4_budget_examples_mass_heat_salt`,
+  `eccov4r4_heat_budget_decomposition`).
+- **`lateral_divergence` now uses native xgcm** (`grid.diff` with
+  `other_component` + `face_connections`) instead of a hand-rolled LLC90 flux
+  stitcher; verified bit-for-bit identical on the ECCO grid. The
+  `xbudget/llc90` module is removed.
+
+### Fixed
+
+- **ECCO mass budget: the lateral eddy-bolus transport was silently dropped.**
+  The `bolus_mass_flux_convergence` term in `ECCOV4r4_native.yaml` was missing
+  its enclosing `product:` wrapper, so its `sign`/`density`/`volume_flux_divergence`
+  children sat directly on the term and were ignored — the GM bolus velocity
+  (`UVELSTAR`/`VVELSTAR`) contributed nothing to the mass budget. The wrapper is
+  now restored, so the bolus convergence is materialized and included. **This
+  changes ECCO mass-budget results** (the bolus term is no longer zero).
+- The `difference` operation's grid guard was misattached, so a `difference`
+  on a plain `Dataset` raised an opaque `NameError`, and a `difference` term
+  evaluated after another operation in the same node raised spuriously even
+  with a valid grid. It now raises a clear `ValueError` up front when no grid
+  is supplied. (Also fixes a mutable-default-argument footgun in the internal
+  search helper.)
+
+### Deprecated
+
+- `budget_fill_dict` is retained as the legacy reference engine (still used
+  internally by `name_scheme="legacy"`) but is superseded by `collect_budgets`
+  / `evaluate_budgets`.
+
+### Dependencies
+
+- The LLC `lateral_divergence` relies on native face-connected differencing in
+  `xgcm` (`grid.diff` with `other_component`). This is only available in xgcm
+  **after 0.9.0** (currently from the development `main` branch); the
+  `requires-python`/`xgcm` pins should be tightened once a release ships it.
+
+### Parser tolerance
+
+- The parser **warns and skips** unavailable-diagnostic placeholders (e.g. a
+  `difference` whose source is `null`) and terms with stray non-operation keys,
+  mirroring the legacy engine's behavior rather than failing, so real
+  conventions with such placeholders still load. (This same tolerance is what
+  let the malformed bolus term above go unnoticed before it was fixed; the
+  warning the parser emits for stray keys is what surfaced the bug.)
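
Review note on the naming table in CHANGELOG.md above: a minimal sketch of how the legacy and `v1` names relate, assuming a term's path is available as a list of `(term, operator)` steps. The `Step` type and both helper functions are hypothetical and are not part of xbudget; for real output, the `alias_map` returned by `evaluate_budgets` is the authoritative mapping.

```python
from typing import List, Optional, Tuple

# Hypothetical representation, for illustration only: each step is
# (term_name, operator), where operator is the operation applied at that
# term, or None when the term has no operator infix in the legacy name.
Step = Tuple[str, Optional[str]]


def legacy_name(steps: List[Step]) -> str:
    """Join term names and operator infixes, as in the legacy column."""
    parts: List[str] = []
    for term, op in steps:
        parts.append(term)
        if op is not None:
            parts.append(op)
    return "_".join(parts)


def v1_name(steps: List[Step]) -> str:
    """Join term names only, dropping the operator infixes."""
    return "_".join(term for term, _ in steps)


steps = [
    ("heat", None),
    ("rhs", "sum"),
    ("diffusion", "sum"),
    ("lateral", "product"),
]
print(legacy_name(steps))  # heat_rhs_sum_diffusion_sum_lateral_product
print(v1_name(steps))      # heat_rhs_diffusion_lateral
```

Building both names from a structured path, rather than string-editing the flat legacy name, avoids ambiguity when a term name itself contains underscores (for example `zonal_convergence`), which is presumably why the changelog points readers at `xbudget_path`.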
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,89 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What this is
+
+`xbudget` wrangles finite-volume budgets (mass, heat, salt) diagnosed from ocean General Circulation Models — primarily MOM6 — into closed budgets using `xarray` and `xgcm`. The library's job is to take a dataset of raw model diagnostics plus a *convention* describing how those diagnostics combine, and materialize every intermediate and aggregate term as a named variable in the dataset.
+
+> This branch refactors the engine internals. The convention/YAML format is unchanged, but the in-memory representation is now a typed expression tree and the default output variable names are simplified. See `CHANGELOG.md` for the migration guide.
+
+## Commands
+
+Tests use `pytest` (no separate build/lint step). The base conda environment may have a NumPy 1.x/2.x mismatch — run tests in the project env (e.g. `docs_env_xbudget`):
+
+```bash
+pytest                                          # full suite
+pytest xbudget/tests/test_parse.py              # one file
+pytest xbudget/tests/test_utilities.py::TestCollectBudgets::test_collect_budgets_basic   # one test
+```
+
+The end-to-end characterization and engine-equivalence tests need the ~600 MB example MOM6 dataset (gitignored, fetched from Zenodo); they **skip** when it is absent. Regenerate the characterization golden after an intended change with `XBUDGET_REGEN_CHARN=1 pytest xbudget/tests/test_characterization.py -s`.
+
+Dev environment (conda + editable install):
+
+```bash
+conda env create -f docs/environment.yml   # or ci/environment.yml for the minimal test env
+conda activate docs_env_xbudget
+pip install -e .
+```
+
+## Core architecture
+
+The central abstraction is the **`xbudget_dict`** — a nested provenance tree (loaded from a YAML *convention* file) describing how to build each budget term from raw diagnostics. It is the public input format. Internally it is parsed into a typed expression tree and evaluated.
+
+### The xbudget_dict tree (input format — unchanged)
+
+Top-level keys are budgets (`mass`, `heat`, `salt`). Each budget has `lhs` and/or `rhs` sub-trees plus metadata keys (`lambda`, `thickness`, `surface_lambda`) that the engine does not interpret. Within a side, terms nest recursively. Every node carries a `var` key (a variable name, or `null` for derived terms) plus optionally one or more **operation** keys:
+
+- `sum` — add the child terms together
+- `product` — multiply child terms (scalar numbers allowed as factors, e.g. `density: 1035.`, `sign: -1.`)
+- `difference` — finite-difference across a grid axis (**requires an `xgcm.Grid`**); the operand is a raw variable *or* a computed sub-term
+- `reciprocal` — safe `1/x` (zeros → inf) of a variable
+- `lateral_divergence` — horizontal flux divergence `div(Fx, Fy)` of two flux sub-terms, via native xgcm (`grid.diff` with `other_component` + `face_connections`); works on face-connected LLC grids
+
+A node may carry more than one operation (e.g. a bulk `product` and an equivalent finer `sum`). Leaf string values (`"areacello"`, `"umo"`) are raw diagnostic names. Conventions live in `xbudget/conventions/*.yaml` — `MOM6.yaml` (canonical; also `MOM6_3Donly`, `MOM6_drift`, `MOM6_surface`) and `ECCOV4r4_native.yaml` (LLC90 native-grid budgets).
+
+### The typed engine (parse → evaluate)
+
+```
+xbudget_dict ──parse_budgets──▶ typed tree (nodes.py) ──evaluate_budgets──▶ derived variables + alias map
+```
+
+- **`nodes.py`** — immutable dataclasses: `Budget`, `Term`, and the operations `Sum`/`Product`/`Difference`/`Reciprocal`/`LateralDivergence` plus `Constant`/`VarRef`. A `Term` carries its structured `path` (its canonical identity) and may hold multiple operations. The native `lateral_divergence` helper lives in `collect.py` and is shared by both engines.
+- **`parse.py`** — `parse_budgets(dict) -> {name: Budget}`. The single source of schema truth; validates and raises `BudgetParseError` naming the offending path on malformed conventions.
+- **`evaluate.py`** — `evaluate_budgets(data, budgets)` walks the tree and materializes **one variable per operation**, named by its term path with operator infixes dropped (e.g. `heat_rhs_diffusion_lateral`). It is pure with respect to the recipe (never mutates it); it only writes derived variables into the dataset. Each variable gets `xbudget_path` (structured identity), `xbudget_op` (the operation kind), and `provenance` (immediate inputs) attrs. Returns `(alias_map, records)` — `alias_map` maps every legacy name to its new name; `records` maps each new variable to its metadata. Dispatch is on node type (`Difference` requires an `xgcm.Grid` in its signature, so a grid-less difference fails fast with a clear error).
+- **`collect.py`** — the public surface:
+  - `collect_budgets(data, xbudget_dict, allow_rechunk=True, name_scheme="v1")` → parses then evaluates. **`v1` (default)** uses the simplified names and does **not** mutate the recipe dict. **`legacy`** reuses `budget_fill_dict` to reproduce the historical operator-suffixed names *and* fill the recipe dict in place.
+  - `budget_fill_dict(...)` → the legacy dict-walking engine, retained as a reference implementation (pinned by the equivalence test) and used by `name_scheme="legacy"`. It mutates both the dataset and the recipe dict.
+  - `aggregate` / `disaggregate` / `get_vars` → dict-based query helpers. **They read the `var` fields that the legacy engine fills**, so they only work after a `name_scheme="legacy"` run. For `v1`, query via the `records`/`alias_map` from `evaluate_budgets` and the `provenance`/`xbudget_path` attrs.
+
+### Key behaviors to know
+
+- **Naming changed (breaking cleanup in 0.7.0).** `v1` emits one variable per operation with operator infixes dropped; the legacy engine also emitted duplicate "copy" variables (108 → 57 variables on the MOM6 example). Use `name_scheme="legacy"` or the `alias_map` to bridge. `CHANGELOG.md` has the old→new table.
+- **Missing diagnostics are skipped with a `UserWarning`, not an error** — a `sum`/`product` containing missing inputs collapses accordingly, so one convention can serve datasets with different available diagnostics.
+- **`difference` rechunking:** `allow_rechunk=True` (default) temporarily rechunks the difference dimension into a single chunk (required by `grid.diff`) then restores chunking.
+- **Lenient parser.** `parse.py` mirrors the legacy engine: it **warns and skips** unavailable-diagnostic placeholders (e.g. a `null`-source `difference`) and stray non-operation keys instead of failing, so real conventions with such terms still load. (This tolerance previously masked the malformed `bolus_mass_flux_convergence` term in `ECCOV4r4_native.yaml` — missing its `product:` wrapper, so the eddy bolus transport was silently dropped from the mass budget; that has since been fixed in the convention.)
+- **xgcm version:** `lateral_divergence` needs native face-connected differencing, available only in xgcm **after 0.9.0** (currently the dev `main`). Run/test in an env with that xgcm.
+
+### Tests
+
+- `test_parse.py` — parser units + validation; asserts all shipped conventions parse; covers the tolerated-malformation path.
+- `test_evaluate_equivalence.py` — proves the typed engine is numerically identical to the legacy `budget_fill_dict`: a synthetic grid (always), the MOM6 grid, and the **ECCO LLC90 grid** (both gated on their data files; the ECCO case exercises reciprocal, difference-of-sub-term, and native `lateral_divergence`).
+- `test_characterization.py` (+ `characterization_MOM6.json`) — golden snapshot of the typed engine's absolute MOM6 output.
+- `test_utilities.py` — the legacy engine, `aggregate`/`get_vars`/`disaggregate`, and `collect_budgets` behavior.
+
+## Data & examples
+
+- `examples/load_example_model_grid.py` — `load_MOM6_coarsened_diagnostics()` builds a MOM6 `xgcm.Grid` (X/Y center/outer, `areacello` metric). `examples/load_example_ecco_grid.py` — `load_ECCOV4r4_coarsened_diagnostics()` builds the ECCO **LLC90** grid with 13-tile `face_connections`. Both download from Zenodo, cached in `data/` (gitignored; only `data/README.md` tracked).
+- Notebooks: `MOM6_budget_examples_mass_heat_salt.ipynb`; `eccov4r4_budget_examples_mass_heat_salt.ipynb` (ECCO closure); `eccov4r4_heat_budget_decomposition.ipynb` (ECCO heat decomposition). The ECCO notebooks and the MOM6 one call `collect_budgets(..., name_scheme="legacy")` because they use `get_vars`/`aggregate`.
+
+## Pull request workflow
+
+When you push a new commit to a branch that already has an open pull request, update the PR description (the top comment / body) so it stays consistent with the latest commit — don't leave it describing only the original state:
+
+- Refresh the summary so it reflects what the branch does now.
+- If the description contains a task list / checklist, check off (`- [x]`) the items the new commit completed and add entries for any follow-up work it introduced.
+- Reflect scope, naming, or API changes so a reviewer reading only the PR body sees the current truth.
+
+Update it with the GitHub CLI as part of the same push, e.g. `gh pr edit <number> --body-file <path>` (or `--body "..."`), so the description never lags behind the commits.
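
Review note on the `sum`/`product` operations and the skip-on-missing behavior described in CLAUDE.md above: a toy evaluator over plain floats, offered as a sketch rather than as xbudget's implementation. It omits the `var` bookkeeping, `difference`, `reciprocal`, and `lateral_divergence`, and it makes two assumptions the source does not spell out: a `sum` keeps whichever children are available, and a `product` with any missing factor is itself treated as missing. The `evaluate` function and the diagnostic names other than `areacello` are hypothetical.

```python
import warnings
from typing import Dict, Optional, Union

Node = Union[str, float, int, dict]


def evaluate(node: Node, data: Dict[str, float]) -> Optional[float]:
    """Evaluate a toy term; return None when it cannot be built."""
    if isinstance(node, (int, float)):
        # Scalar factor, such as a density or a sign.
        return float(node)
    if isinstance(node, str):
        # Leaf string: the name of a raw diagnostic.
        if node not in data:
            warnings.warn(f"diagnostic {node!r} missing; skipping", UserWarning)
            return None
        return data[node]
    if "sum" in node:
        # Assumption: a sum keeps whichever children are available.
        values = [evaluate(child, data) for child in node["sum"].values()]
        values = [v for v in values if v is not None]
        return sum(values) if values else None
    if "product" in node:
        # Assumption: a product with any missing factor is itself missing.
        result = 1.0
        for child in node["product"].values():
            value = evaluate(child, data)
            if value is None:
                return None
            result *= value
        return result
    return None


recipe = {
    "sum": {
        "surface": {
            "product": {"sign": -1.0, "flux": "surface_flux", "area": "areacello"}
        },
        "bottom": "bottom_flux",  # hypothetical diagnostic, absent from data
    }
}
data = {"surface_flux": 2.0, "areacello": 10.0}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    total = evaluate(recipe, data)

print(total)        # -20.0
print(len(caught))  # 1
```

The same recipe therefore evaluates on datasets with different available diagnostics, at the cost of a warning per missing input.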
diff --git a/ci/environment.yml b/ci/environment.yml
@@ -3,7 +3,7 @@ channels:
   - conda-forge
   - nodefaults
 dependencies:
-  - python>=3.12
+  - python>=3.11
   - cftime
   - netcdf4
   - pydap

diff --git a/conda/meta.yaml b/conda/meta.yaml
@@ -1,5 +1,5 @@
 {% set name = "xbudget" %}
-{% set version = "0.6.2" %}
+{% set version = "0.7.0" %}
 {% set python_min = "3.11" %}
 
 package:
@@ -8,7 +8,8 @@ package:
 
 source:
   url: https://pypi.org/packages/source/x/xbudget/xbudget-{{ version }}.tar.gz
-  sha256: 0ab9571aae2196523c0dbc394468567446d61e475624921055a3b1e074c05112
+  # TODO(release): regenerate against the published 0.7.0 sdist.
+  sha256: 0000000000000000000000000000000000000000000000000000000000000000
 
 build:
   noarch: python
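
Review note on the placeholder `sha256` in `conda/meta.yaml` above: one way to regenerate the value once the 0.7.0 sdist is published, sketched with the standard library. The helper name `sha256_of_file` and the local file name in the usage comment are assumptions; the file name is inferred from the `url` template in the recipe.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, after downloading the published sdist next to this script
# (assumed file name):
#   print(sha256_of_file(Path("xbudget-0.7.0.tar.gz")))
```

The printed 64-character hex string is what belongs in the recipe's `sha256` field in place of the zeros.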

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -11,7 +11,7 @@ xbudget expects budgets which have a Left-Hand Side (LHS) equal to a Right-Hand
 
 where :math:`\lambda` is the property density (or tracer concentration), :math:`\mathbf{u}` is the flow velocity, and :math:`\mathbf{F}_{\lambda}` is the sum of all non-advective fluxes of :math:`\lambda`.
 
-xbudget ingests an `xgcm.Grid` object containing the budget diagnostics and uses structured metadata, in the form of a nested dictionary (or `.yaml` file), to close such budgets. While this may seem trivial for use cases in which there is a single flux to keep track of, total non-advective fluxes in general circulation models can be composed of dozens of contributing processes. Since budget diagnostics are often not output as volume-integrated tendencies, xbudget allows for terms to be derived as sums, products, or differences (or some combination of these). For example, ocean heat tendency due to air-sea heat fluxes might be derived from the difference between vertical heat fluxes across depth interfaces, summed over longwave, shortwave, sensible, and latent components of the flux, and multiplied by the ocean cell area.
+xbudget ingests an `xgcm.Grid` object containing the budget diagnostics and uses structured metadata, in the form of a nested dictionary (or `.yaml` file), to close such budgets. While this may seem trivial for use cases in which there is a single flux to keep track of, total non-advective fluxes in general circulation models can be composed of dozens of contributing processes. Since budget diagnostics are often not output as volume-integrated tendencies, xbudget allows for terms to be derived as sums, products, differences, reciprocals, or lateral flux divergences (or some combination of these), including on face-connected grids such as the ECCO LLC90 tiles. For example, ocean heat tendency due to air-sea heat fluxes might be derived from the difference between vertical heat fluxes across depth interfaces, summed over longwave, shortwave, sensible, and latent components of the flux, and multiplied by the ocean cell area.
 
 While drafting a `.yaml` file from scratch for a new model can be daunting, it only needs to be done once -- then closing budgets is a breeze!
 
@@ -22,3 +22,4 @@ While drafting a `.yaml` file from scratch for a new model can be daunting, it o
    installation
    examples/MOM6_budget_examples_mass_heat_salt
    examples/eccov4r4_budget_examples_mass_heat_salt
+   examples/eccov4r4_heat_budget_decomposition
diff --git a/examples/MOM6_budget_examples_mass_heat_salt.ipynb b/examples/MOM6_budget_examples_mass_heat_salt.ipynb
@@ -286,13 +286,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": null,
    "id": "527f1b10",
    "metadata": {},
    "outputs": [],
-   "source": [
-    "xbudget.collect_budgets(grid, xbudget_dict)"
-   ]
+   "source": "# name_scheme=\"legacy\" reproduces the historical variable names (e.g.\n# \"heat_rhs_sum_diffusion_sum_lateral\") and fills the recipe dict in place,\n# which the get_vars/aggregate helpers used below rely on. The default\n# name_scheme=\"v1\" instead uses simplified names (e.g. \"heat_rhs_diffusion_lateral\")\n# and leaves the recipe dict untouched; see the migration notes in CHANGELOG.md.\nxbudget.collect_budgets(grid, xbudget_dict, name_scheme=\"legacy\")"
   },
   {
    "cell_type": "markdown",
@@ -2086,4 +2084,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}