Skip to content

externalize observation weights from Store layer (closes #28)#40

Merged
schroedk merged 6 commits into
mainfrom
refactor/issue-28-externalize-weights
May 11, 2026
Merged

externalize observation weights from Store layer (closes #28)#40
schroedk merged 6 commits into
mainfrom
refactor/issue-28-externalize-weights

Conversation

@schroedk
Copy link
Copy Markdown
Collaborator

Summary

Resolves #28. Weights were fused into the storage layer alongside factor levels, which forced every layer above (Design, Operator, preconditioner, solver) to thread an is_unweighted() predicate through hot paths. This PR moves weights out of Store and into the Solver, then separates Design (data + layout) from DesignOperator (linear map).

Three bisectable commits, each compiling and passing the full test suite independently:

  1. Externalize weights from Store — remove weight() / is_unweighted() from the trait, delete ObservationWeights, drop the weights field from FactorMajorStore / ArrayStore. Weights flow as Option<&[f64]> through builders and live owned at the Solver layer as Option<Vec<f64>>. DesignOperator::new(design, weights) builds sqrt_weights once; apply / apply_adjoint match on &self.sqrt_weights so the unweighted hot path stays branch-free. The dead gramian_diagonal method and its five tests are deleted (no production callers).
  2. Mechanical renamesObservationStoreStore, WeightedDesignDesign, WeightedDesignOperatorDesignOperator. No semantic change.
  3. Strip operator methods from Designmatvec_d, rmatvec_dt, rmatvec_wdt, uid_weight and the internal scatter/gather helpers move from domain.rs to operator.rs as module-private functions. Design is now pure data + layout: store, factors, n_rows, n_dofs, from_store, n_factors. Solver::solve's residual diagnostic computes D^T W v as apply_adjoint(weighted_rhs(v)) — bit-identical via the D^T W = D^T W^{1/2} W^{1/2} algebraic identity.

Python API surface unchangedwithin.solve, within.solve_batch, within.Solver all preserve their existing signatures. Breaking changes are confined to the Rust crate (callers who construct Store / Design / operators directly, or implement the Store trait). CHANGELOG updated.

Net diff: 27 files, +671 / −997.

Test plan

  • cargo build --workspace — clean
  • cargo test --workspace — 181 passed, 0 failed
  • cargo test -p within --release -- --ignored --nocapture — 1 passed
  • pixi run develop && pixi run test — 97 passed (Python signatures preserved)
  • cargo bench -p within --bench fixest -- --quick — no regression on the unweighted hot path
  • Targeted weighted-correctness invariants from the plan: test_uniform_weights_matches_unweighted, test_high_level_solve_weighted, test_solve_batch_weighted, prop_weighted_adjoint_property all pass
  • cargo fmt --all + cargo clippy --workspace --all-targets — clean

schroedk added 6 commits May 11, 2026 11:09
Remove weight(obs) and is_unweighted() from the ObservationStore trait,
delete the ObservationWeights enum, and drop the weights field from
FactorMajorStore and ArrayStore. Weights now flow as Option<&[f64]>
through the operator/preconditioner builders and live owned at the
Solver layer as Option<Vec<f64>>.

- WeightedDesignOperator::new(design, weights) takes the borrowed slice
  and computes sqrt_weights once; apply/apply_adjoint match on
  &self.sqrt_weights so the unweighted hot path stays branch-free.
- WeightedDesign::rmatvec_wdt(weights, r, x) and the static
  WeightedDesign::uid_weight(weights, uid) helper replace the prior
  store-driven branches. gramian_diagonal and its five tests are deleted
  as dead code (no production callers).
- build_preconditioner, build_local_domains, CrossTab::build_for_pair*,
  and the accumulate_cross_block family all thread Option<&[f64]>.
- Solver gains a weights: Option<Vec<f64>> field; from_design and
  from_design_with_preconditioner take it. orchestrate::solve /
  solve_batch and the PyO3 bridge build the owned vector from the
  optional numpy weights input. Python signatures unchanged.
…perator→DesignOperator

Mechanical identifier rename across the workspace plus README and module
docstrings. No semantic change; every call site, trait bound, and lib.rs
re-export is updated to the new names.
Design is now pure data + layout: store, factors, n_rows, n_dofs,
from_store, n_factors. The matvec helpers (gather_add, scatter_add,
scatter_add_single_factor, level_from_column_or_store, ScatterStrategy,
PAR_THRESHOLD, SCATTER_LOCAL_THRESHOLD, factor_columns) move into
operator.rs as module-private free functions taking &Design. The
DesignOperator::apply / apply_adjoint methods now call them directly.

- Removed Design::matvec_d, rmatvec_dt, rmatvec_wdt, uid_weight (and the
  internal gather_add/scatter_add methods). The uid_weight read in
  cross_tab.rs inlines to weights.map_or(1.0, |w| w[uid]).
- Solver::solve diagnostic computes D^T W v via apply_adjoint(W^{1/2} v)
  instead of the dedicated rmatvec_wdt method; results are bit-identical
  because D^T W = D^T W^{1/2} W^{1/2} algebraically.
- Tests/examples/benches that called design.matvec_d(...) etc. now use
  DesignOperator::new(&design, None).apply(...) (and likewise for the
  adjoint). The internal tests in src/domain.rs and src/operator/tests.rs
  are pruned to remove redundant matvec coverage that DesignOperator
  already exercises.
…gather

Adopt the cleaner apply/apply_adjoint pattern explored on the closed
refactor/store-design-operator branch:

- Replace gather_add + scatter_add with gather_apply + scatter_apply,
  both taking a closure parameter. gather_apply folds the optional
  W^{1/2} multiply into the LAST factor's pass, so apply does exactly Q
  sweeps over dst regardless of weights (no trailing scale loop).
  scatter_apply computes sw[i] * x[i] inline through a closure, so
  apply_adjoint allocates no per-call scratch and DesignOperator no
  longer holds a Mutex<Vec<f64>>.
- Add length assertion in DesignOperator::new: panics on mismatched
  weights length. Solver entry points keep their fallible
  validate_weights check, so callers going through Solver / solve()
  never trigger the panic.
- ScatterStrategy::pick replaces the ad-hoc match in scatter_add.
- Rename a few stale test/bench labels that still said matvec_d /
  rmatvec_dt: test_matvec_d -> test_apply_unweighted_values;
  test_rmatvec_dt -> test_apply_adjoint_unweighted_values;
  test_large_design_adjoint_property_matvec_d_rmatvec_dt -> drop the
  method-name suffix; matvec_weighted_design bench group ->
  design_operator_apply.
build_preconditioner is a public entry point that takes Option<&[f64]>
weights without going through Solver::from_design. Direct callers could
pass mismatched-length weights and trigger an index-out-of-bounds panic
deep inside accumulate_cross_block. Add validate_weights at the top so
the failure surfaces as the usual WithinError::WeightCountMismatch.

Drop residual matvec_d / rmatvec_dt / rmatvec_wdt references in
tests/domain.rs doc comments and test names.
…er validation

Two behavioral additions from the codex-review follow-ups (4f1410f, 75573ab)
weren't captured in the #28 changelog entries:

- DesignOperator::new now asserts weight length matches design.n_rows,
  the Mutex<Vec<f64>> scratch field is removed, and weighted apply fuses
  the sqrt(W) multiply into the last gather pass.
- build_preconditioner now returns WithinError::WeightCountMismatch
  instead of panicking on out-of-bounds access in CrossTab assembly when
  weights are mis-sized.
@schroedk schroedk merged commit 264586c into main May 11, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Externalize observation weights; clean store / design / operator split

1 participant