[BUG] Vulnerability in index deserialization #2042

@cjnolet

cuVS Index Deserialization: Integer Overflow, Type Confusion, and Allocation Bombs Across All Index Types

Description
Seven execution-verified bugs in the cuVS index serialization/deserialization code:
(1) Heap buffer overflow in buffered_ofstream::write when size exceeds the buffer capacity.
(2) A missing break in a switch causes fallthrough from kSerializeStridedDataset to kSerializeVPQDataset, which leads to type confusion on an invalid cudaDataType_t.
(3) Integer overflow in CAGRA deserialize: the n_rows * graph_degree product overflows, causing a small allocation followed by a large read. The same pattern affects ALL index deserializers (brute_force, IVF-Flat, IVF-PQ, CAGRA).
(4) Enum values are deserialized without validation; invalid DistanceType/codebook_gen/list_layout values are UB in C++.
(5) OOM via n_lists=0xFFFFFFFF in IVF-PQ/IVF-Flat deserialize.
(6) rows * dim * sizeof(T) overflow plus size_t-to-int64_t narrowing in brute_force deserialize.
All were confirmed by dynamic analysis with PoC files.
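Items (4) and (5) both come down to trusting header fields read straight from the file. One possible mitigation, sketched here under assumptions, is to read each field as a raw integer and accept it only if it passes an explicit check. The `Metric` enum, `parse_metric`, `n_lists_ok`, and the `kMaxLists` cap are hypothetical names for illustration and are not part of the cuVS API.

```cpp
#include <cstdint>

// Hypothetical stand-in for a serialized enum; NOT the real cuVS DistanceType.
enum class Metric : std::uint32_t { L2 = 0, InnerProduct = 1, Cosine = 2 };

// Map a raw on-disk integer to the enum only if it names a known value.
// Returns false (and leaves `out` untouched) for anything else.
inline bool parse_metric(std::uint32_t raw, Metric& out) {
  switch (raw) {
    case 0: out = Metric::L2; return true;
    case 1: out = Metric::InnerProduct; return true;
    case 2: out = Metric::Cosine; return true;
    default: return false;  // unknown value: reject the file
  }
}

// Hypothetical upper bound on the number of lists; a real limit would
// depend on the deployment and on available memory.
constexpr std::uint32_t kMaxLists = 1u << 24;

// Reject list counts that are zero or large enough to act as an allocation bomb.
inline bool n_lists_ok(std::uint32_t n_lists) {
  return n_lists > 0 && n_lists <= kMaxLists;
}
```

A deserializer following this pattern would call both checks before allocating anything and fail the load when either returns false.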

Exploit Scenario
A vector search service loads user-uploaded index files. An attacker crafts an index with n_rows=0xFFFFFFFFFFFFFFFF and dim=2. The multiplication overflows to a small value, allocating a tiny buffer. The subsequent stream read writes the full (large) payload into the undersized buffer, achieving heap corruption and potentially arbitrary code execution.
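The usual defense against this class of bug is to compute the byte count with overflow-checked multiplication and to reject the file before allocating. The sketch below is a minimal illustration, not the cuVS fix; `checked_mul` and `matrix_bytes` are hypothetical helpers. As a hand-checkable illustration (values chosen for this example, not taken from the PoC files), with 64-bit unsigned arithmetic 0x4000000000000001 * 2 * 4 wraps to 8, so an unchecked size computation would request 8 bytes for an enormous payload.

```cpp
#include <cstdint>
#include <limits>

// Multiply two unsigned 64-bit values; return false if the product would wrap.
inline bool checked_mul(std::uint64_t a, std::uint64_t b, std::uint64_t& out) {
  if (a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a) return false;
  out = a * b;
  return true;
}

// Byte count for an n_rows x dim matrix with elem_size-byte elements.
// Returns false if any intermediate product overflows.
inline bool matrix_bytes(std::uint64_t n_rows, std::uint64_t dim,
                         std::uint64_t elem_size, std::uint64_t& out) {
  std::uint64_t elems = 0;
  if (!checked_mul(n_rows, dim, elems)) return false;
  return checked_mul(elems, elem_size, out);
}
```

With this check in place, both the header values from the scenario above and the wrapping example are rejected, while an ordinary shape such as 1000 x 128 floats yields 512000 bytes. A real deserializer would likely also compare the result against the remaining file size before reading.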

Metadata

Labels: bug

Status: Done