
Conversation

@Xuanwo
Collaborator

@Xuanwo Xuanwo commented Dec 18, 2025

This PR provides an alternative implementation of memtest. It introduces memtrace as a new feature in Pylance, which users can enable via the memtrace feature flag. It offers an API similar to memtest's, but eliminates the need to hook or inject dynamic libraries, making it easier to use and test.

Before:

maturin develop
make -C ../memtest build-release
LIB_PATH=$(lance-memtest)
LD_PRELOAD=$LIB_PATH pytest python/ci_benchmarks

After:

maturin develop --features memtrace
pytest python/ci_benchmarks

Parts of this PR were drafted with assistance from Codex (with gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.
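
For readers unfamiliar with the technique, a feature-gated counting allocator in Rust looks roughly like the sketch below. It is illustrative only: CountingAlloc, reset_peak, and peak_bytes are made-up names rather than the code in this PR; only the memtrace feature name comes from the PR itself.

    use std::alloc::{GlobalAlloc, Layout, System};
    use std::sync::atomic::{AtomicUsize, Ordering};

    static ALLOCATED: AtomicUsize = AtomicUsize::new(0);
    static PEAK: AtomicUsize = AtomicUsize::new(0);

    struct CountingAlloc;

    unsafe impl GlobalAlloc for CountingAlloc {
        unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
            let ptr = System.alloc(layout);
            if !ptr.is_null() {
                // Track current and peak bytes handed out by this allocator.
                let now = ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
                PEAK.fetch_max(now, Ordering::Relaxed);
            }
            ptr
        }

        unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
            System.dealloc(ptr, layout);
            ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
        }
    }

    // Only installed when the crate is built with `--features memtrace`.
    #[cfg(feature = "memtrace")]
    #[global_allocator]
    static GLOBAL: CountingAlloc = CountingAlloc;

    /// Start a new measurement window (illustrative helper).
    pub fn reset_peak() {
        PEAK.store(ALLOCATED.load(Ordering::Relaxed), Ordering::Relaxed);
    }

    /// Peak bytes seen since the last `reset_peak` (illustrative helper).
    pub fn peak_bytes() -> usize {
        PEAK.load(Ordering::Relaxed)
    }

Because the counters live in process-global atomics, the statistics can be exposed to Python through the bindings without any LD_PRELOAD or library injection, which is the usability gain described above.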

@github-actions github-actions bot added enhancement New feature or request python labels Dec 18, 2025
Contributor

@wjones127 wjones127 left a comment


This approach only captures allocations made in our Rust code, which means it doesn't work for tests like the insert one in test_memory.py. In that test, we create the input data with PyArrow, so those allocations won't be captured. The point of that test is to show we don't buffer or collect too much data into memory.

memtest was the fourth approach I tried, so I might as well share all four approaches and why I went with this one:

  1. First I tried to implement a solution that uses tracing subscribers to capture allocations in Rust. This would have been cool as it would have worked in Rust unit tests even if there were tests running concurrently. However, each time we called tokio::spawn, we needed to make sure we passed down the span so tracing would continue to capture them. This ended up being too much work.
  2. Next I tried implementing a custom allocator in Rust, similar to this PR, but using it in Rust tests (see the sketch below). That was much simpler and caught all allocations. However, it would not work if tests were running concurrently in the same process, and there wasn't an easy way to force serial execution in a Rust test. You could always pass cargo test -- --test-threads=1, but that would be annoying. We could use cargo nextest run, which uses a separate process for each test and is generally faster. But (a) that library crashes on my Mac and (b) new contributors might call cargo test and get confused by the failures.
  3. Next I implemented basically what is in this PR. The idea was that I could solve the multi-threading issue by just using Python, which by default runs only one test at a time in a process. However, I found that it wasn't that useful for things like write tests if it didn't capture allocations made outside of our Rust code.
    a. I tried to get around the limitation by also using allocation stats from other libraries in Python. PyArrow has some stats on its global memory pool, but most of those stats can't be reset like ours can, so there wasn't a clear way to combine them.
  4. What I finally settled on was memtest, using the LD_PRELOAD trick with a Python library. This captures all allocations reliably, and because it's run from Python it doesn't need to worry about concurrency.

Those are all the attempts I can think of. If you have any new ideas, I'd be glad to hear them.
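
For illustration, approach (2) would have looked something like the test below. reset_peak, peak_bytes, and write_one_gigabyte_in_batches are hypothetical names (the first two from the allocator sketch earlier in this thread), and the 64 MiB limit is arbitrary. Because the counters are process-global, a second test running on another thread in the same process inflates the peak and makes the assertion flaky, which is the concurrency problem described in that item.

    #[test]
    fn insert_does_not_buffer_whole_input() {
        // Hypothetical helper from a counting global allocator: start a window.
        reset_peak();

        // Hypothetical operation under test: stream ~1 GiB of batches through a write.
        write_one_gigabyte_in_batches();

        // Expect a small, bounded peak rather than the full input size.
        assert!(peak_bytes() < 64 * 1024 * 1024);
    }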

@Xuanwo
Collaborator Author

Xuanwo commented Dec 19, 2025

The memtest one is the fourth approach I tried, so I might share the four approaches I tried and why I went with this:

Thank you very much for this! I initially assumed that most of our workload is handled within the Rust core, so the Python part wouldn't require much attention. However, it seems my assumption was incorrect.

Could you elaborate on why we need to consider Python's memory usage as well? From my current understanding, if we're building an online service around lancedb, operations like building the index should be handled server-side in Rust, while users would primarily use Python on the client side.

@wjones127
Contributor

Could you elaborate on why we need to consider Python's memory usage as well? From my current understanding, if we're building an online service around lancedb, operations like building the index should be handled server-side in Rust, while users would primarily use Python on the client side.

This is the Lance library, not LanceDB.

Some operations are handled entirely in Rust, like indexing and queries. But for writes, the data comes from outside. Being able to stream writes properly is one of the main things we want to test. That's why I gave the example of insert earlier: it checks that we can take a stream of data and write it out without collecting it all into memory.
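
In code terms, the property that insert test is after is roughly the difference between the two functions below. This is a generic sketch using the futures crate; Batch and write_batch are stand-ins, not Lance's actual API.

    use futures::{Stream, StreamExt};

    // Stand-in types for this sketch only.
    struct Batch(Vec<u8>);

    async fn write_batch(_batch: Batch) {
        // flush the batch to storage; it is dropped afterwards
    }

    // What we want: each batch is written and freed before the next one arrives,
    // so peak memory stays around the size of one batch.
    async fn streaming_write(mut batches: impl Stream<Item = Batch> + Unpin) {
        while let Some(batch) = batches.next().await {
            write_batch(batch).await;
        }
    }

    // What the test guards against: collecting first makes peak memory
    // proportional to the whole input.
    async fn buffering_write(batches: impl Stream<Item = Batch> + Unpin) {
        let all: Vec<Batch> = batches.collect().await;
        for batch in all {
            write_batch(batch).await;
        }
    }

Since the input batches are produced by PyArrow on the Python side, only an allocator that sees the whole process (like the LD_PRELOAD approach) can confirm the streaming behavior end to end.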

@Xuanwo
Collaborator Author

Xuanwo commented Dec 19, 2025

Thanks! That thorough answer addressed my questions. I'm going to close this PR, since the current approach does seem to be the only way.

But for writes, the data comes from outside.

One last question: would it be a good idea to measure that Lance's own memory usage doesn't grow during writes? That way we could still measure the Rust side instead.

@Xuanwo
Collaborator Author

Xuanwo commented Jan 4, 2026

Let's close.

@Xuanwo Xuanwo closed this Jan 4, 2026
@Xuanwo Xuanwo deleted the xuanwo/mem-usage-measure branch January 4, 2026 10:53