Skip to content

Speed up sparsity-pattern sort with packed integer keys#301

Merged
lxmota merged 1 commit into
mainfrom
fix-sparsity-pattern-sort
May 21, 2026
Merged

Speed up sparsity-pattern sort with packed integer keys#301
lxmota merged 1 commit into
mainfrom
fix-sparsity-pattern-sort

Conversation

@lxmota
Copy link
Copy Markdown
Contributor

@lxmota lxmota commented May 21, 2026

Building the COO->CSC permutation sorted a Vector{Tuple{Int64,Int64}} of all (row, col) index pairs (~92M for a 530k-DOF mesh) with sortperm, which takes an O(N log N) tuple comparison sort. Profiling a Carina torsion run showed this was ~half the runtime of both _update_dofs! and the SparseMatrixPattern constructor (the existing TODO comment already flagged it).

Pack each (i, j) into a single Int64 key, (Int64(i) << 32) | Int64(j), so sortperm takes the integer radix-sort path instead. DOF indices are well below 2^32, so the packed key preserves (i, j) lexicographic order; sortperm being stable, the resulting permutation is identical.

Applied at both sites in SparsityPatterns.jl (SparseMatrixPattern construction and _update_dofs!). On a 530k-DOF torsion problem this cut assembler build 5.0s -> 3.4s and parameter build 7.4s -> 3.9s. Carina test suite: 106/106 pass, all results bit-identical.

Building the COO->CSC permutation sorted a Vector{Tuple{Int64,Int64}} of
all (row, col) index pairs (~92M for a 530k-DOF mesh) with sortperm,
which takes an O(N log N) tuple comparison sort. Profiling a Carina
torsion run showed this was ~half the runtime of both _update_dofs! and
the SparseMatrixPattern constructor (the existing TODO comment already
flagged it).

Pack each (i, j) into a single Int64 key, (Int64(i) << 32) | Int64(j),
so sortperm takes the integer radix-sort path instead. DOF indices are
well below 2^32, so the packed key preserves (i, j) lexicographic order;
sortperm being stable, the resulting permutation is identical.

Applied at both sites in SparsityPatterns.jl (SparseMatrixPattern
construction and _update_dofs!). On a 530k-DOF torsion problem this cut
assembler build 5.0s -> 3.4s and parameter build 7.4s -> 3.9s. Carina
test suite: 106/106 pass, all results bit-identical.
@lxmota lxmota requested a review from cmhamel May 21, 2026 13:01
@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.00%. Comparing base (c80965f) to head (7abe9a0).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #301   +/-   ##
=======================================
  Coverage   55.00%   55.00%           
=======================================
  Files          55       55           
  Lines        4801     4801           
=======================================
  Hits         2641     2641           
  Misses       2160     2160           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cmhamel
Copy link
Copy Markdown
Contributor

cmhamel commented May 21, 2026

@lxmota if things passed on the GPU for your carina tests and these tests passed locally feel free to merge.

I'm out of town and my gpu runners are offline right now.

@lxmota
Copy link
Copy Markdown
Contributor Author

lxmota commented May 21, 2026

Merging as Carina's GPU tests pass and at the direction of @cmhamel

@lxmota lxmota merged commit d1fec2d into main May 21, 2026
9 of 13 checks passed
@cmhamel cmhamel deleted the fix-sparsity-pattern-sort branch May 21, 2026 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants