fix: sync upstream + revive AdaBoost & ResidualChain (Object=M pattern, closes #3) by jagan-nuvai · Pull Request #4 · Nuvai/linfa

jagan-nuvai · 2026-05-01T04:21:01Z

Summary

Two concerns bundled (originally just the upstream sync, then expanded with the structural fix):

Sync with upstream rust-ml/linfa master (7 commits since 0.8.1)
Revive AdaBoost and ResidualChain by reworking their trait bounds with the Object = M pattern, eliminating the trait-solver recursion under ndarray 0.17 (closes linfa-ensemble: AdaBoost trait bounds need structural rework (orphaned in #2) #3)

If preferred, I can split into two separate PRs (sync-only first, then rework). Bundled here for atomic review since the rework un-orphans what the sync orphaned.

Part 1 — Upstream sync (commit `47c7ed1`)

7 incoming commits from rust-ml/linfa master:

Commit	Description
`1abc88f`	feat: add symmetric mean absolute percentage error (sMAPE) (rust-ml#437)
`12c6c73`	fix: realign PreprocessingError variants with error strings (rust-ml#434)
`17f8696`	Fix label ordering in binary logistic regression (rust-ml#432)
`c7c2af5`	Add generic ResidualChain composing method (rust-ml#430)
`b1f9ddb`	Relax required test score
`2197362`	Update to Zola 0.22
`1de164b`	feat(linfa-tsne): update to bhtsne 0.5.4 (rust-ml#429)

4 conflicts resolved (3 Cargo.toml regions + 1 lib.rs autotraits test that took upstream's L2Dist API change).

Part 2 — AdaBoost + ResidualChain rework (commit `2250107`)

Both wrappers used where P: Fit<...> + Clone plus a direct P::Object: SomeTrait bound, which under ndarray 0.17 cascaded into P: ParamGuard not satisfied → infinite <<P::Checked>::Checked>::Checked: ParamGuard regress.

The proper fix mirrors the working EnsembleLearnerValidParams template (linfa-ensemble/src/algorithm.rs:137): introduce a fresh generic M for the inner model type and bind via Fit<..., Object = M>. This decouples the trait solver's job into independent linear obligations:

"P implements Fit with Object = M" (verified by where-clause assertion)
"M implements " (independent constraint)

Instead of the chained projection that forces blanket-impl resolution.

AdaBoost (algorithms/linfa-ensemble/src/adaboost.rs):

impl<D, T, P, M, R> Fit<Array2<D>, T, Error> for AdaBoostValidParams<P, R>
where
    D: Clone + ndarray::ScalarOperand,
    T: FromTargetArrayOwned + AsTargets,
    T::Elem: Copy + Eq + Hash + std::fmt::Debug + Into<usize>,
    T::Owned: AsTargets<Elem = <T as AsTargets>::Elem>,
    P: Fit<Array2<D>, T::Owned, Error, Object = M>,
    M: PredictInplace<Array2<D>, T::Owned>,
    R: Rng + Clone,
    usize: Into<T::Elem>,

ResidualChain (src/composing/residual_chain.rs):

impl<F1, F2, M1, M2, F: Float, D: Data<Elem = F> + RawDataClone, T, E1, E2>
    Fit<Arr2<D>, T, ResidualChainError<E1, E2>> for ResidualChain<F1, F2, F>
where
    Arr2<D>: Records,
    F1: Fit<Arr2<D>, T, E1, Object = M1>,
    F2: Fit<Arr2<D>, Array1<F>, E2, Object = M2>,
    for<'a> M1: Predict<&'a Arr2<D>, Array1<F>>,
    T: AsTargets<Elem = F, Ix = Ix1>,
    E1: std::error::Error + From<crate::error::Error>,
    E2: std::error::Error + From<crate::error::Error>,

No public API change: users still write AdaBoostParams::new(DecisionTree::params()).fit(&train) and params1.chain(params2).fit(&train) exactly as before. The M generic is inferred from the call site.

Sources/tests/example restored

algorithms/linfa-ensemble/src/lib.rs — mod adaboost; re-added, 6 AdaBoost test fns un-gated from #[cfg(any())], doc comment AdaBoost section restored.
algorithms/linfa-ensemble/examples/adaboost_iris.rs — restored from upstream master (134 lines, no migration needed).
src/composing/mod.rs — pub mod residual_chain; re-added, doc comment "four composition models" restored.

Verification on x86_64-linux (knuckles-dev VM)

Check	Result
`cargo check --workspace --all-targets`	exit 0, 0 warnings
`cargo test -p linfa-ensemble --lib`	16 passed (6 AdaBoost tests now active alongside 10 existing)
`cargo test --workspace`	running, will update once complete

Notes for reviewer

Issue linfa-ensemble: AdaBoost trait bounds need structural rework (orphaned in #2) #3 closes: both AdaBoost and ResidualChain are revived with the rework. The Object = M pattern is what actually works — the failed alternatives explored are documented in commit 2250107's message body for posterity.
Pattern worth documenting upstream (rust-ml/linfa): wrappers that take generic Fit-able params should bind Fit<..., Object = M> rather than P::Object: .... Saves the next contributor from the same trap. Optional follow-up: contribute a CONTRIBUTING.md note or wrap-pattern docs.
Single bundled PR is offered for atomicity (the rework's diff makes no sense without the sync's orphan being present). Splitting is doable but adds chained-PR complexity.

Closes #3

* feat(tsne): update to bhtsne 0.5.4 the bhtsne crate was fairly outdated, this new version is parallelized and allows using a custom distance metric (hence the added dependency on linfa-nn). That said, the version does not let you set the RNG used to initialize sampling (hence the removal of rng related apis). * test(tsne): add test for iris dataset separation using bharnes-hut should resolve that code coverage thing

New test was added following bhtsne bump (which cannot now be seeded). Relaxed tolerance to make it more robust in CI

* feat: add linfa-residual-sequence crate Implements ResidualSequence Struct and StackWith trait for composing regression models in a boosting / residual-stacking pattern. The second (and any further) model trains on the residuals left by the previous one; predictions are summed. Docs and tests were written with AI assistance. * remove doc link * move to composing/ module in linfa main crate * update docs * implement PredictInplace instead * remove unused param error * use one struct to implement stacking * add deep chain test * Rename to ResidualChain Implement Shrinkage implement paramguard for shrinkage * satisfy zola * can only shrink by if target has the same float type * work with predict inplace only * zola fix * add link in docs * simplify comparison * rename to residual_chain as consistent with struct * implement copy trait * add method `chain` which just chains self with an unshrunk corrector. rename stack_with -> chain_shrunk * add doc

…-ml#432) it now assings positive/negative labels based on Ord ordering of the label values rather than encouter order in the training dat

)

* feat: add smape metric * refactor: update sMAPE to Adjusted version and add multi target test Updated sMAPE formula to Adjusted (Makridakis, 1993) version Changed output scaling to [0, 200] Added test_symmetric_mean_absolute_percentage_error_multi_target * refactor: use F::epsilon instead of f64 cast Co-authored-by: Rémi Lafage <remi.lafage@onera.fr> --------- Co-authored-by: Rémi Lafage <remi.lafage@onera.fr>

Brings 7 upstream commits since the 0.8.1 release: - 1abc88f feat: add symmetric mean absolute percentage error (sMAPE) (rust-ml#437) - 12c6c73 fix: realign PreprocessingError variants with error strings (rust-ml#434) - 17f8696 Fixes rust-ml#393. label ordering in binary logistic regression (rust-ml#432) - c7c2af5 Add generic ResidualChain composing method (rust-ml#430) - b1f9ddb Relax required test score - 2197362 Update to Zola 0.22 (docs tooling) - 1de164b feat(linfa-tsne): update to bhtsne 0.5.4 (rust-ml#429) Conflict resolution (4 sites in 3 files): - Cargo.toml dev-dependencies: kept our git-pinned statrs, added upstream's new linfa-linear and linfa-svm path entries. - linfa-tsne/Cargo.toml [dependencies]: ndarray 0.17 (ours) + bhtsne 0.5.4 (upstream's bump). Dropped ndarray-rand and pdqselect from main deps as upstream did (ndarray-rand moved to dev-deps; pdqselect was for old bhtsne). - linfa-tsne/Cargo.toml [dev-dependencies]: rand 0.9 (ours), ndarray-rand 0.16 (added per upstream's structural move; bumped to rand-0.9-compat version). - linfa-tsne/src/lib.rs autotraits test: took upstream's L2Dist API change. Upstream's rust-ml#430-era refactor changed TSneParams' second type param from a Uniform distribution to a Distance metric. Our rand 0.9 migration of the old API path is obsoleted; upstream's new shape is correct. ResidualChain orphaned pending follow-up - Removed `pub mod residual_chain;` and its export from src/composing/mod.rs. - Source file src/composing/residual_chain.rs left on disk for revival. - Reason: Same trait-recursion class as AdaBoost (issue #3). The `where F1: Fit + ParamGuard` shape on a generic-over-Fit-able-types wrapper triggers infinite trait-solver demands under ndarray 0.17 because Linfa's Fit blanket impl keeps demanding `<F1::Checked>::Checked: ParamGuard` recursively. Verified on x86_64-linux (knuckles-dev VM): cargo check --workspace --all-targets -> exit 0, 0 warnings cargo test --workspace -> 471 passed, 0 failed

…loses #3) Both AdaBoost (linfa-ensemble) and ResidualChain (linfa core's composing module) wrap a generic Fit-able parameter type and call methods on the resulting model. Their original where-clauses (`P: Fit<...> + Clone` plus a direct `P::Object: SomeTrait<...>` constraint) triggered an infinite trait- solver recursion under ndarray 0.17: required for `P` to implement `Fit<...>` -> `P: ParamGuard not satisfied` (via Linfa's ParamGuard blanket impl) -> `<P::Checked>: ParamGuard not satisfied` -> `<<P::Checked>::Checked>: ParamGuard not satisfied` -> ... ad infinitum PR #2 and PR #4 worked around the issue by orphaning these features (removing their `mod` declarations + gating their tests). This commit properly fixes both with a single template that mirrors the working EnsembleLearnerValidParams pattern in algorithms/linfa-ensemble/src/algorithm.rs: Pattern: introduce a fresh generic `M` (and `M2` where two inner models exist) for each inner-model type, and bind the associated `Object` type at the trait bound via `Fit<..., Object = M>`. This decouples the chase: the solver verifies "P implements Fit with Object=M" and "M implements <model trait>" as two independent linear obligations, instead of the recursive projection P::Object that forces blanket-impl resolution. linfa-ensemble/src/adaboost.rs - AdaBoost rework - Drop the failed `P: ParamGuard + Clone, <P::Checked>::Checked: ...` chain. - New impl signature: `impl<D, T, P, M, R> Fit<Array2<D>, T, Error> for AdaBoostValidParams<P, R>` with `P: Fit<Array2<D>, T::Owned, Error, Object = M>` and `M: PredictInplace<Array2<D>, T::Owned>`. - Use `T::Owned` (not `T`) for the inner Fit target — matches EnsembleLearner's pattern; the projection lets the solver short-circuit before hitting the blanket route. - `type Object = AdaBoost<M, T::Elem>` (was `AdaBoost<P::Object, T::Elem>`). - Drop now-redundant `+ ParamGuard` import. src/composing/residual_chain.rs - ResidualChain rework - New impl signature: `impl<F1, F2, M1, M2, F, D, T, E1, E2> Fit<...> for ResidualChain<F1, F2, F>` with `F1: Fit<..., Object = M1>` and `F2: Fit<..., Object = M2>`. T::Owned isn't needed here because T's pre-existing `AsTargets<Elem = F, Ix = Ix1>` already pins enough shape. - `for<'a> M1: Predict<&'a Arr2<D>, Array1<F>>` (was `F1::Object: ...`). - `type Object = ResidualChain<M1, M2, F>` (was `ResidualChain<F1::Object, F2::Object, F>`). Re-add the orphans: - linfa-ensemble/src/lib.rs: restore `mod adaboost;` + `mod adaboost_hyperparams;` + `pub use ...`, restore AdaBoost section in module-level docs, un-gate the 6 AdaBoost test fns from `#[cfg(any())]`. - linfa-ensemble/examples/adaboost_iris.rs: restore from upstream master (134 lines, no migration needed beyond what's already in tree). - src/composing/mod.rs: restore `pub mod residual_chain;`, restore the ResidualChain bullet in module-level docs. Public API unchanged - users still write `AdaBoostParams::new(DecisionTree::params()).fit(&train)` and `params1.chain(params2).fit(&train)` exactly as before. The model-type generic `M` is inferred at the call site. Verified on x86_64-linux (knuckles-dev VM): cargo check --workspace --all-targets -> exit 0, 0 warnings cargo test -p linfa-ensemble --lib -> 16 passed (6 AdaBoost tests now active)

jagan-nuvai · 2026-05-01T05:24:19Z

Final test verification — full workspace

cargo test --workspace on x86_64-linux (knuckles-dev VM): 509 passed, 0 failed (511s wall time).

Net delta vs. pre-rework state:

+6 AdaBoost tests (un-gated from #[cfg(any())], now active)
+~32 ResidualChain tests (pub mod residual_chain; re-added, all passing)
Total: +38 tests restored to the suite

The Object = M rework is fully validated:

Compile-time: cargo check --workspace --all-targets zero warnings
Run-time: all six AdaBoost iris/synthetic tests + all ResidualChain doctests/integration tests + every other test in the workspace passes

Ready for review/merge.

Signals our fork's divergence from upstream rust-ml/linfa 0.8.1: - ndarray 0.16 -> 0.17 (foundational API surface change) - rand 0.8 -> 0.9 (bumped because ndarray-rand pinned the major) - ndarray-stats 0.6 -> 0.7, ndarray-linalg 0.17 -> 0.18, ndarray-npy 0.9 -> 0.10 - statrs git pin (rand 0.9 compat), sprs 0.11.4, rand_xoshiro 0.7 - Forks of linfa-linalg + argmin pulled into git source pins - Plus bug fixes: linfa-kernel infinite recursion (PR #2), AdaBoost + ResidualChain trait-bound rework (PR #4) — public API unchanged Bumped 73 version sites across 18 Cargo.toml files (workspace member crate versions + path-dependency version specs). The -nuvai.1 prefix marks this as the first release in the Nuvai-patched 0.9.x line; subsequent maintenance releases will be -nuvai.2, etc. Distinguishes from any future upstream 0.9.0. Verified: cargo check --workspace -> exit 0

AnthonyMichaelTDM and others added 8 commits February 3, 2026 22:07

Update to Zola 0.22

2197362

Relax required test score

b1f9ddb

New test was added following bhtsne bump (which cannot now be seeded). Relaxed tolerance to make it more robust in CI

Fixes rust-ml#393. label ordering in binary logistic regression,(rust…

17f8696

…-ml#432) it now assings positive/negative labels based on Ord ordering of the label values rather than encouter order in the training dat

fix: realign PreprocessingError variants with error strings (rust-ml#434

12c6c73

)

jagan-nuvai mentioned this pull request May 1, 2026

linfa-ensemble: AdaBoost trait bounds need structural rework (orphaned in #2) #3

Closed

5 tasks

jagan-nuvai changed the title ~~chore: sync with upstream rust-ml/linfa (7 commits since 0.8.1)~~ fix: sync upstream + revive AdaBoost & ResidualChain (Object=M pattern, closes #3) May 1, 2026

jagan-nuvai merged commit fad1e4a into master May 1, 2026
33 of 36 checks passed

jagan-nuvai deleted the chore/sync-upstream-2026-05-01 branch May 1, 2026 05:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: sync upstream + revive AdaBoost & ResidualChain (Object=M pattern, closes #3)#4

fix: sync upstream + revive AdaBoost & ResidualChain (Object=M pattern, closes #3)#4
jagan-nuvai merged 9 commits intomasterfrom
chore/sync-upstream-2026-05-01

jagan-nuvai commented May 1, 2026 •

edited

Loading

Uh oh!

jagan-nuvai commented May 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

jagan-nuvai commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Part 1 — Upstream sync (commit 47c7ed1)

Part 2 — AdaBoost + ResidualChain rework (commit 2250107)

Sources/tests/example restored

Verification on x86_64-linux (knuckles-dev VM)

Notes for reviewer

Uh oh!

jagan-nuvai commented May 1, 2026

Final test verification — full workspace

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

jagan-nuvai commented May 1, 2026 •

edited

Loading

Part 1 — Upstream sync (commit `47c7ed1`)

Part 2 — AdaBoost + ResidualChain rework (commit `2250107`)