Add matrix-free mode to SparseMatrixAssembler; drop dead H/D buffers #295
Open
The default path is unchanged. Two changes reduce memory:

1. Drop `damping_storage` and `hessian_storage` from the struct. No code in this repository wrote either; both sat preallocated at ~700 MB each on a 530 k-DOF mesh. `hessian(asm)` now aliases `stiffness(asm)`; if a future caller needs `H ≠ K`, add the field back and specialize.
2. Add `matrix_free::Bool=false` to `SparseMatrixAssembler(dof; ...)`. When true, the sparse pattern is constructed empty and the mass / stiffness value buffers are zero-length, saving the full ~7 GB of matrix-side preallocation that an integrator like central difference never touches. `assemble_matrix!`/`assemble_mass!`/`assemble_stiffness!` error with a clear message on a matrix-free assembler; `mass(asm)`/`stiffness(asm)` return a zero sparse matrix of the correct shape.

GPU paths use the default `matrix_free=false` and are bit-for-bit unchanged.

`update_dofs!` now skips the matrix-pattern rebuild on matrix-free assemblers; without the guard, `_update_dofs!` would silently reconstruct the pattern from scratch, flipping the assembler back into the matrix-bearing mode.

All 18065 FEC tests pass.
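The behaviour described above can be sketched with a toy assembler. This is a minimal illustration, not the code in this PR: `ToyAssembler`, its fields, and the tridiagonal pattern are hypothetical stand-ins, and only the `matrix_free` keyword, the erroring `assemble_stiffness!`, and the zero-matrix `stiffness` accessor mirror what the description states.

```julia
using SparseArrays

# Hypothetical stand-in for the real assembler; all names here are illustrative.
struct ToyAssembler
    n_dofs::Int
    rows::Vector{Int}                    # pattern row indices; empty when matrix-free
    cols::Vector{Int}                    # pattern column indices; empty when matrix-free
    stiffness_storage::Vector{Float64}   # value buffer; zero-length when matrix-free
    residual_storage::Vector{Float64}    # vector side is always allocated
    matrix_free::Bool
end

# Default: build a tridiagonal pattern and a matching value buffer.
# matrix_free=true: skip every matrix-side allocation.
function ToyAssembler(n_dofs::Int; matrix_free::Bool=false)
    rows, cols = Int[], Int[]
    if !matrix_free
        for i in 1:n_dofs, j in max(1, i - 1):min(n_dofs, i + 1)
            push!(rows, i)
            push!(cols, j)
        end
    end
    vals = zeros(length(rows))
    return ToyAssembler(n_dofs, rows, cols, vals, zeros(n_dofs), matrix_free)
end

# Matrix assembly is an error on a matrix-free assembler.
function assemble_stiffness!(asm::ToyAssembler)
    asm.matrix_free && error("assemble_stiffness! called on a matrix-free assembler; " *
                             "construct it with matrix_free=false")
    fill!(asm.stiffness_storage, 1.0)    # placeholder for real element integration
    return asm
end

# The accessor returns an all-zero sparse matrix of the right shape when matrix-free.
function stiffness(asm::ToyAssembler)
    asm.matrix_free && return spzeros(asm.n_dofs, asm.n_dofs)
    return sparse(asm.rows, asm.cols, asm.stiffness_storage, asm.n_dofs, asm.n_dofs)
end
```

In this sketch an explicit integrator would construct `ToyAssembler(n; matrix_free=true)` and touch only `residual_storage`, while an implicit solver keeps the default and pays for the pattern and value buffer up front.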
Codecov Report
❌ Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #295      +/-   ##
==========================================
- Coverage   66.77%   66.63%   -0.14%
==========================================
  Files          55       55
  Lines        4758     4888     +130
==========================================
+ Hits         3177     3257      +80
- Misses       1581     1631      +50
Summary
- Drop `damping_storage` and `hessian_storage` from `SparseMatrixAssembler`; both were preallocated but never written. `hessian(asm)` now aliases `stiffness(asm)`.
- Add `matrix_free::Bool=false` to `SparseMatrixAssembler(dof; ...)`. When true, the sparse pattern is empty and the mass/stiffness value buffers are zero-length.
- `assemble_matrix!`/`assemble_mass!`/`assemble_stiffness!` error with a clear message; `mass(asm)`/`stiffness(asm)` return a zero sparse matrix of the right shape.
- `update_dofs!` skips the matrix-pattern rebuild on matrix-free assemblers; without the guard, `_update_dofs!` would silently flip the assembler back into matrix-bearing mode.

Why
For a downstream solver (Carina.jl) running an explicit central-difference integrator on a 530 k-DOF mesh, profiling showed the assembler preallocating ~7.4 GB it never touches:
- `matrix_pattern`
- `damping_storage`
- `hessian_storage`
- `mass_storage`
- `stiffness_storage`
- `vector_pattern`
- `residual_storage`

After this change a matrix-free assembler costs only a few MB. Implicit/Newton paths keep the existing eager allocation by default and recover ~1.4 GB from the dropped damping/hessian fields.
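As a rough cross-check of the figures quoted in this PR, the arithmetic below relates the ~700 MB per value buffer to the ~1.4 GB recovered. It assumes `Float64` values (8 bytes each) and decimal megabytes; neither is stated in the description, so treat the per-row estimate as an order-of-magnitude sketch only.

```julia
# Figures quoted in the PR description.
n_dofs = 530_000
bytes_per_buffer = 700e6                 # "~700 MB each"

# Assumption: each value buffer holds Float64 entries.
bytes_per_value = sizeof(Float64)        # 8

# Implied number of stored entries per buffer, and the average per matrix row.
nnz_estimate = bytes_per_buffer / bytes_per_value
nnz_per_dof = nnz_estimate / n_dofs

# Two dropped buffers (damping and hessian) at ~700 MB each.
recovered_gb = 2 * bytes_per_buffer / 1e9

println("stored entries per buffer ≈ ", nnz_estimate)
println("entries per DOF ≈ ", round(Int, nnz_per_dof))
println("recovered by dropping two buffers ≈ ", recovered_gb, " GB")
```

Under those assumptions each buffer holds about 87.5 million entries, or roughly 165 per DOF, and dropping two of them accounts for the ~1.4 GB quoted above.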
GPU compatibility
Default `matrix_free=false` is bit-for-bit unchanged: every existing GPU path stays on that default and was not touched. The matrix-free branch only allocates empty CPU vectors at construction and survives `Adapt.adapt_structure` cleanly (empty `CuArray`/`ROCArray` after adapt).

API change
`hessian(asm)` previously returned a sparse matrix backed by an independent `hessian_storage`. After this PR it returns whatever `stiffness(asm)` returns. No code in this repository writes a separate Hessian; if a future caller needs `H ≠ K`, add the field back and specialize the accessor.
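A minimal sketch of the accessor change, and of one way a future caller could reintroduce a distinct Hessian. `MiniAssembler` and `HessianAssembler` are hypothetical types invented for illustration; the real `SparseMatrixAssembler` has more fields and may be specialized differently.

```julia
using SparseArrays

# Hypothetical minimal assembler that stores only a stiffness matrix.
struct MiniAssembler
    K::SparseMatrixCSC{Float64,Int}
end

stiffness(asm::MiniAssembler) = asm.K
# The Hessian accessor forwards to stiffness; no separate buffer exists.
hessian(asm::MiniAssembler) = stiffness(asm)

# Hypothetical extension for a caller that needs H ≠ K:
# bring the field back on a dedicated type and specialize the accessor.
struct HessianAssembler
    K::SparseMatrixCSC{Float64,Int}
    H::SparseMatrixCSC{Float64,Int}
end

stiffness(asm::HessianAssembler) = asm.K
hessian(asm::HessianAssembler) = asm.H

asm = MiniAssembler(sparse([1, 2], [1, 2], [2.0, 3.0]))
# Both accessors hand back the same underlying storage.
println(hessian(asm) === stiffness(asm))
```

Because the two accessors share storage in the aliased case, a caller that mutates the result of `hessian(asm)` in place also changes what `stiffness(asm)` returns; code that relied on the old independent buffer would need the specialized variant.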