Make block_size flexible #235

razdoburdin · 2025-12-01T12:49:24Z

This small PR adds the ability to set the block_size value when building an index. If the parameter isn't set, the default value of 2^30 is used.

ahuber21

Great! This should do what we want it to do. I'd refine the tests a bit and maybe improve the syntax. We can brainstorm together what the cleanest implementation would look like.

ahuber21 · 2025-12-01T12:56:44Z

bindings/cpp/src/dynamic_vamana_index_impl.h

    ) {
        auto threadpool = default_threadpool();

+        auto blocksize_bytes = svs::lib::PowerOfTwo(blocksize_exp);


Since this is a runtime value, we'd probably need some validation. Do you think it would be a good idea to create a struct similar to PowerOfTwo and use that as the function argument rather than a plain int? We could perform some boundary checks in the constructor.

Yes, that makes sense.

What exponent values are acceptable? To the best of my understanding, blocksize_bytes should be in the range [4 KB, 1 GB], based on the possible page sizes (see the line referenced below), right?

ScalableVectorSearch/include/svs/core/allocator.h, line 95 in d88aec8:

    HugepageX86Parameters{1 << 30, MAP_HUGETLB | MAP_HUGE_1GB},

Yes, that sounds reasonable. @mihaic did you ever experiment with even bigger hugepages?

Since Xeon only goes to 1 GiB, we can have that as the limit.
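The validated-argument idea discussed above could look roughly like the following sketch. `BlockSizeExp` is a hypothetical name that does not exist in SVS, and the bounds (2^12 = 4 KiB to 2^30 = 1 GiB) are taken from the range proposed in this thread rather than from the library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper (not part of SVS): a block-size exponent that is
// range-checked once, at construction.
class BlockSizeExp {
  public:
    // Assumed bounds from the discussion: 4 KiB pages up to 1 GiB hugepages.
    static constexpr int min_exp = 12;
    static constexpr int max_exp = 30;

    // Defaults to the largest allowed block size (2^30 bytes).
    explicit BlockSizeExp(int exp = max_exp)
        : exp_{exp} {
        if (exp < min_exp || exp > max_exp) {
            throw std::invalid_argument(
                "block size exponent " + std::to_string(exp) + " is outside [" +
                std::to_string(min_exp) + ", " + std::to_string(max_exp) + "]"
            );
        }
    }

    int exponent() const noexcept { return exp_; }

    // Block size in bytes, i.e. 2^exponent.
    std::size_t bytes() const noexcept { return std::size_t{1} << exp_; }

  private:
    int exp_;
};
```

Because the runtime interface methods are declared noexcept and return a Status, a caller would have to construct such an object outside the noexcept boundary, or catch the exception and convert it into an error Status.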

ahuber21 · 2025-12-01T13:00:26Z

bindings/cpp/tests/runtime_test.cpp

+    std::vector<size_t> labels(test_n);
+    std::iota(labels.begin(), labels.end(), 0);
+
+    int block_size_exp = 17; // block_size = 2^block_size_exp


Beyond testing that passing the parameter works, we should also test that the block size actually changed.
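One way to make such a check observable is to assert on a property that depends on the block size, such as the reported block size or the number of blocks allocated for a fixed number of vectors. The sketch below uses a toy stand-in, `ToyBlockedStorage`, which is purely illustrative and is not the SVS blocked dataset. Whether the real storage exposes an equivalent accessor is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a blocked dataset (hypothetical, not an SVS type).
// Assumes 0 < element_bytes <= 2^block_size_exp.
class ToyBlockedStorage {
  public:
    ToyBlockedStorage(std::size_t element_bytes, int block_size_exp)
        : element_bytes_{element_bytes}
        , block_bytes_{std::size_t{1} << block_size_exp} {}

    std::size_t block_size_bytes() const { return block_bytes_; }
    std::size_t elements_per_block() const { return block_bytes_ / element_bytes_; }
    std::size_t num_blocks() const { return blocks_.size(); }

    // Grow or shrink to hold n elements, allocating whole blocks only.
    void resize(std::size_t n) {
        const std::size_t per_block = elements_per_block();
        const std::size_t needed = (n + per_block - 1) / per_block; // ceiling division
        blocks_.resize(needed, std::vector<unsigned char>(block_bytes_));
    }

  private:
    std::size_t element_bytes_;
    std::size_t block_bytes_;
    std::vector<std::vector<unsigned char>> blocks_;
};
```

With 256-byte elements, 1000 elements need 63 blocks at exponent 12 and 2 blocks at exponent 17, so a test comparing block counts for two exponents would fail if the parameter were silently ignored.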

ahuber21 · 2025-12-01T13:01:44Z

bindings/cpp/include/svs/runtime/dynamic_vamana_index.h

 // Abstract interface for Dynamic Vamana-based indexes.
 struct SVS_RUNTIME_API DynamicVamanaIndex : public VamanaIndex {
-    virtual Status add(size_t n, const size_t* labels, const float* x) noexcept = 0;
+    virtual Status add(size_t n, const size_t* labels, const float* x, int blocksize_exp = 30) noexcept = 0;


I'm wondering if there's a cleaner way to communicate this value to init_impl(). Maybe @rfsaliev has an opinion?
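As one possible direction, the block size could be fixed when the index object is created and read later by the lazy initialization, leaving add() with its original signature. The sketch below is only an illustration: `DynamicBuildOptions` and `ToyDynamicIndex` are hypothetical names, and the real `init_impl()` takes arguments not shown here. A side benefit is that it avoids a default argument on a virtual function, which C++ resolves from the static type of the call expression rather than from the overrider.

```cpp
#include <cstddef>

// Hypothetical build-time options (illustrative, not the SVS runtime API).
struct DynamicBuildOptions {
    int blocksize_exp = 30; // block size = 2^blocksize_exp bytes
};

// Toy index that defers initialization to the first add(), mirroring the
// lazy-init pattern discussed in the review.
class ToyDynamicIndex {
  public:
    explicit ToyDynamicIndex(DynamicBuildOptions options = {})
        : options_{options} {}

    // add() keeps its three-argument shape; returns false on invalid input.
    bool add(std::size_t n, const std::size_t* labels, const float* x) noexcept {
        if (n == 0 || labels == nullptr || x == nullptr) {
            return false;
        }
        if (!initialized_) {
            init_impl();
        }
        size_ += n;
        return true;
    }

    // Zero until the first successful add().
    std::size_t block_size_bytes() const noexcept { return block_bytes_; }
    std::size_t size() const noexcept { return size_; }

  private:
    // Reads the stored option instead of receiving it through add().
    void init_impl() noexcept {
        block_bytes_ = std::size_t{1} << options_.blocksize_exp;
        initialized_ = true;
    }

    DynamicBuildOptions options_;
    std::size_t block_bytes_ = 0;
    std::size_t size_ = 0;
    bool initialized_ = false;
};
```

An options object of this kind could in principle also be handed to a load path, though how the block size should be handled for indexes loaded from a stream is left open here.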

rfsaliev

Thank you for the PR. Some functionality still needs to be implemented.

rfsaliev · 2025-12-01T13:07:30Z

bindings/cpp/include/svs/runtime/dynamic_vamana_index.h

 struct SVS_RUNTIME_API DynamicVamanaIndex : public VamanaIndex {
-    virtual Status add(size_t n, const size_t* labels, const float* x) noexcept = 0;
+    virtual Status
+    add(size_t n, const size_t* labels, const float* x, int blocksize_exp = 30


It seems this approach manages the block size only for an index created from scratch.
How will we manage the block size for an index loaded from a stream?

rfsaliev · 2025-12-01T13:13:19Z

bindings/cpp/src/svs_runtime_utils.h

+    static StorageType init(
+        const svs::data::ConstSimpleDataView<float>& data,
+        Pool& pool,
+        svs::lib::PowerOfTwo SVS_UNUSED(blocksize_bytes)


SQDataset<>::compress() accepts a custom allocator via its allocator argument...

rfsaliev · 2025-12-01T13:15:08Z

bindings/cpp/src/svs_runtime_utils.h

+    static StorageType init(
+        const svs::data::ConstSimpleDataView<float>& data,
+        Pool& pool,
+        svs::lib::PowerOfTwo SVS_UNUSED(blocksize_bytes)


LVQDataset<>::compress() accepts a custom allocator via its allocator argument...

rfsaliev · 2025-12-01T13:16:28Z

bindings/cpp/src/svs_runtime_utils.h

    static StorageType init(
        const svs::data::ConstSimpleDataView<float>& data,
        Pool& pool,
+        svs::lib::PowerOfTwo SVS_UNUSED(blocksize_bytes),


LeanDataset<>::compress() accepts a custom allocator via its allocator argument...

make block_size flexible

dcdfb71

razdoburdin requested review from ahuber21, ethanglaser and ibhati as code owners December 1, 2025 12:49

linting

a1766d6

ahuber21 requested changes Dec 1, 2025

rfsaliev requested changes Dec 1, 2025
