Conversation

@vyavdoshenko (Contributor) commented Nov 10, 2025

Relevant to: #6016

Improves SPOP performance on sparse hash tables (created by Reserve()) through three key optimizations:

  1. thread_local RNG: Use a thread_local absl::InsecureBitGen instead of
     creating a new BitGen on each call, eliminating construction overhead

  2. Iterator increment: Replace the modulo operation with an iterator
     increment and a wrap-around check, avoiding expensive division in the
     hot loop

  3. IsEmpty() early check: Test IsEmpty() before calling ExpireIfNeeded()
     to skip expensive expiration checks on empty buckets
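Taken together, the three changes can be sketched as a small standalone loop. This is only an illustrative sketch, not the actual DenseSet code: Bucket and PickRandomNonEmptyBucket are made-up names, and std::mt19937 stands in for absl::InsecureBitGen so the snippet compiles with the standard library alone:

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Placeholder for a DenseSet bucket; only the emptiness check matters here.
struct Bucket {
  bool occupied = false;
  bool IsEmpty() const { return !occupied; }
};

// Returns the index of a random non-empty bucket, or entries.size() if the
// table is entirely empty.
std::size_t PickRandomNonEmptyBucket(const std::vector<Bucket>& entries) {
  if (entries.empty()) return 0;

  // (1) thread_local generator: constructed once per thread, not per call.
  thread_local std::mt19937 gen{std::random_device{}()};

  std::size_t start =
      std::uniform_int_distribution<std::size_t>(0, entries.size() - 1)(gen);
  std::size_t i = start;
  do {
    // (3) cheap emptiness test first, so per-bucket work such as the
    // expiration check is skipped entirely for empty buckets.
    if (!entries[i].IsEmpty()) return i;
    // (2) increment with a wrap-around check instead of (start + k) % size,
    // avoiding an integer division on every probe.
    if (++i == entries.size()) i = 0;
  } while (i != start);
  return entries.size();
}
```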

Comparison:
Main branch:

BM_Spop1000/sparseness:1                    5964193 ns      5963687 ns          117
BM_Spop1000/sparseness:4                    6365861 ns      6365321 ns          110
BM_Spop1000/sparseness:10                   7633700 ns      7633658 ns           91
BM_Spop1000/sparseness:40                  13036086 ns     13035934 ns           54
BM_Spop1000/sparseness:100                 20192890 ns     20192937 ns           35

Fixed version (current):

BM_Spop1000/sparseness:1                    1232039 ns      1232019 ns          568
BM_Spop1000/sparseness:4                    1590988 ns      1590901 ns          440
BM_Spop1000/sparseness:10                   2773856 ns      2773784 ns          252
BM_Spop1000/sparseness:40                   7754844 ns      7754870 ns           90
BM_Spop1000/sparseness:100                 14499281 ns     14499131 ns           48

Previous approach (rejection sampling, wrong direction):

BM_Spop1000/sparseness:1                    1351481 ns      1351497 ns          518
BM_Spop1000/sparseness:4                    2342621 ns      2342612 ns          299
BM_Spop1000/sparseness:10                   5463768 ns      5463703 ns          129
BM_Spop1000/sparseness:40                  13164771 ns     13164710 ns           53
BM_Spop1000/sparseness:100                 21711212 ns     21711052 ns           32

The comparison above is based on the fixed benchmark: #6038

@dranikpg (Contributor) previously approved these changes Nov 10, 2025, leaving a comment:

Clever. Please change the bit gen in GetRandomIterator as well

Comment on lines 680 to 681:

// Use thread-local generator to avoid repeated construction overhead
thread_local absl::BitGen gen;

Contributor:

Maybe just using absl::InsecureBitGen is as cheap?

@vyavdoshenko (Contributor Author):

Fixed

@BorysTheDev (Contributor):

"For sparse tables (6% util): ~87% success rate with 32 probes -> avoids expensive linear scan"

This sounds really strange: why would a linear scan be more expensive than probing?

@romange (Collaborator) commented Nov 11, 2025

Please open a separate issue for implementing DenseSet::Shrink, as it is needed anyway for sorted sets, sets, and hash maps

@romange (Collaborator) commented Nov 11, 2025

Clever indeed.
$\text{success} = 1 - \text{fail} = 1 - \text{sparsity}^{32}$
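With the ~6% utilization mentioned earlier, sparsity ≈ 0.94, and this formula indeed reproduces the quoted ~87% success rate. A tiny check (ProbeSuccess is an illustrative helper, not a DenseSet function):

```cpp
#include <cmath>

// Probability that at least one of `probes` independent uniform probes
// lands on a non-empty bucket, where `sparsity` is the fraction of empty
// buckets. For sparsity 0.94 and 32 probes this comes out to roughly 0.86.
double ProbeSuccess(double sparsity, int probes) {
  return 1.0 - std::pow(sparsity, probes);
}
```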

@romange previously approved these changes Nov 11, 2025

@BorysTheDev (Contributor):

If we have a uniform distribution and we start from a random point, we should get the same complexity for a linear scan as for probing, but the linear scan should be faster because of spatial memory locality

@BorysTheDev (Contributor) left a comment:

random probing can't solve this issue

@dranikpg (Contributor):

> random probing can't solve this issue

With string hashes the distribution should be close to uniform. Thinking about it purely theoretically, for a single operation, probing one more cell with a linear scan shouldn't be different from doing random probes. Except that if we always probe linearly, we create "islands" of values, since there is more blank space to land in and we only bite off the leftmost value of the next "island". So maybe with linear probing a large hole is created that only grows on average. At least this is my theory.

Any full solution is just shrinking the backing array, as there is simply not enough information for efficiently searching a few leftover entries in a large backing array.

@BorysTheDev (Contributor):

> random probing can't solve this issue
>
> With string hashes the distribution should be close to uniform. Thinking about it purely theoretically, for a single operation, probing one more cell with a linear scan shouldn't be different from doing random probes. Except that if we always probe linearly, we create "islands" of values, since there is more blank space to land in and we only bite off the leftmost value of the next "island". So maybe with linear probing a large hole is created that only grows on average. At least this is my theory.
>
> Any full solution is just shrinking the backing array, as there is simply not enough information for efficiently searching a few leftover entries in a large backing array.

If we start the linear scan from a random point every time, there shouldn't be any islands, and the result should be the same as with random probing. I think the problem is somewhere else: for example, our random generator could provide a non-uniform distribution, or somehow we don't remove items at the beginning, or something else. Anyway, without understanding the real problem, mathematically this solution is incorrect

@romange (Collaborator) commented Nov 11, 2025

I think having a 5000x improvement justifies the fix. Having said that, I agree that it would be nice to understand WHY the main branch is "5000x" slower.

@BorysTheDev (Contributor):

> I think having a 5000x improvement justifies the fix. Having said that, I agree that it would be nice to understand WHY the main branch is "5000x" slower.

I don't say that there is no improvement. I say that the current fix isn't optimal and uses incorrect assumptions, and the results that we see have another explanation. I think the problem was fixed by some incidental change, not by the probing.

@mkaruza (Contributor) commented Nov 11, 2025

Setting aside the discussion about randomness: @vyavdoshenko, did you also consider rewriting the function to use bit operations instead of modulo? I.e.:

  size_t modulo = (1 << capacity_log_) - 1;
  for (size_t i = offset; i < entries_.size() + offset; i++) {
    auto it = entries_.begin() + (i & modulo);
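The trick works because the table capacity is a power of two, in which case index % size equals index & (size - 1). A minimal standalone version (WrapIndex and capacity_log are illustrative names, not the actual DenseSet members):

```cpp
#include <cstddef>

// For a power-of-two table size (size == 1 << capacity_log), taking the
// index modulo the size reduces to a single AND with (size - 1), avoiding
// the integer division that `%` would perform.
std::size_t WrapIndex(std::size_t index, unsigned capacity_log) {
  std::size_t mask = (std::size_t{1} << capacity_log) - 1;
  return index & mask;
}
```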

@vyavdoshenko vyavdoshenko dismissed stale reviews from romange and dranikpg via bd49627 November 11, 2025 13:33
@vyavdoshenko (Contributor Author):

The simplest solution is the best. There were three problems:

  • Creating a random generator on each call
  • Using a modulo operation, which can be avoided
  • Executing ExpireIfNeeded for empty objects, which can be avoided

@mkaruza previously approved these changes Nov 11, 2025

Comment on the benchmark code:

state.ResumeTiming();
for (int i = 0; i < 1000; ++i) {
  src.Pop();
  tmp.Pop();

@mkaruza (Contributor), Nov 11, 2025:

Please check that we always pop 1000 elements, or that we always pop an element, whichever is easier. Otherwise LGTM.

@vyavdoshenko vyavdoshenko marked this pull request as draft November 11, 2025 13:59
@vyavdoshenko vyavdoshenko changed the title fix: Optimize GetRandomChain() in DenseSet [Do Not Review]fix: Optimize GetRandomChain() in DenseSet Nov 11, 2025
@vyavdoshenko vyavdoshenko requested a review from mkaruza November 11, 2025 15:43
@vyavdoshenko vyavdoshenko changed the title [Do Not Review]fix: Optimize GetRandomChain() in DenseSet fix: Optimize GetRandomChain() in DenseSet Nov 11, 2025
@vyavdoshenko vyavdoshenko marked this pull request as ready for review November 11, 2025 15:43
}

// Use thread-local generator to avoid construction overhead
thread_local absl::InsecureBitGen gen;
Collaborator:

let's use the same thread_local object here and at line 710

@vyavdoshenko (Contributor Author):

Fixed

@vyavdoshenko vyavdoshenko requested a review from romange November 11, 2025 16:44
@romange (Collaborator) commented Nov 11, 2025

I remember seeing different numbers for the benchmark. While this improvement is good, it does not solve the problem for oversized sets.

constexpr size_t kMinSize = 1 << kMinSizeShift;
constexpr bool kAllowDisplacements = true;

thread_local absl::InsecureBitGen tl_bit_gen;
Contributor:

Maybe creating the bitgen is not expensive

@vyavdoshenko vyavdoshenko merged commit fa0f64f into main Nov 11, 2025
10 checks passed
@vyavdoshenko vyavdoshenko deleted the bobik/dense_set_opt branch November 11, 2025 19:34