feat: introduce memtrace to track the memory usage about lance #5526
Conversation
wjones127 left a comment
This approach only captures allocations made in our Rust code, so it doesn't work for tests like the insert test in test_memory.py. In that test, we create the input data with PyArrow, which won't be captured. The point of that test is to show we don't buffer or collect too much data in memory.
The memtest approach was the fourth one I tried, so let me share all four approaches and why I went with it:
- First I tried to implement a solution that uses tracing subscribers to capture allocations in Rust. This would have been cool, as it would have worked in Rust unit tests even if there were tests running concurrently. However, each time we called `tokio::spawn`, we needed to make sure we passed down the span so tracing would continue to capture the allocations. This ended up being too much work.
- Next I tried implementing a custom allocator in Rust, similar to this PR, but using it in Rust tests. That was much simpler and caught all allocations. However, it would not work if tests were running concurrently in the same process, and there wasn't an easy way to prevent that in a Rust test. You could always pass `cargo test -- --test-threads=1`, but that would be annoying. We could use `cargo nextest run`, which runs each test in a separate process and is generally faster. But (a) that tool crashes on my Mac and (b) new contributors might call `cargo test` and get confused by the failures.
- Next I implemented basically what is in this PR. The idea was that I could solve the multi-threading issue by just using Python, which by default runs only one test at a time in a process. However, I found that it wasn't very useful for things like write tests, since it didn't capture allocations made outside of our Rust code.
  a. I tried to get around that limitation by also using allocation stats from other libraries in Python. PyArrow has some stats on its global memory pool, but most of those stats can't be reset like ours can, so there wasn't a clear way to combine them.
- What I finally settled on was memtest, using the `LD_PRELOAD` trick with a Python library. This captures all allocations reliably, and because it runs in Python it doesn't need to worry about concurrency.
Those are all the attempts I can think of. If you have any new ideas, I'd be glad to hear them.
Thank you very much for this! I initially assumed that most of our workload is handled within the Rust core, so the Python part wouldn't require much attention. However, it seems my assumption was incorrect. Could you elaborate on why we need to consider Python's memory usage as well? From my current understanding, if we're building an online service around lancedb, operations like building the index should be handled server-side in Rust, while users would primarily use Python on the client side.
This is the Lance library, not LanceDB. Some operations are handled entirely in Rust, like indexing and queries. But for writes, the data comes from outside. Testing that we can stream writes properly is one of the main things we want to test. That's why I gave the example of insert earlier: it tests that we can take a stream of data and write it out without collecting it all into memory.
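To make the streaming-write point concrete, here is a schematic sketch (not the actual test in test_memory.py; all helper names are hypothetical) contrasting a writer that consumes one batch at a time with one that buffers the whole stream. The peak allocation of the streaming path stays near a single batch, while the buffering path scales with the whole input:

```python
import tracemalloc

def batch_stream(num_batches, batch_bytes):
    # Simulate an input stream: each batch is produced lazily.
    for _ in range(num_batches):
        yield b"x" * batch_bytes

def streaming_write(batches):
    # A well-behaved writer consumes one batch at a time and drops it.
    total = 0
    for batch in batches:
        total += len(batch)  # "write" the batch, then release it
    return total

def collecting_write(batches):
    # A buffering writer materializes the entire stream first.
    data = list(batches)
    return sum(len(b) for b in data)

def peak_of(func, *args):
    # Measure peak Python-level allocations while func runs.
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
        return peak
    finally:
        tracemalloc.stop()

streaming_peak = peak_of(streaming_write, batch_stream(100, 64 * 1024))
collecting_peak = peak_of(collecting_write, batch_stream(100, 64 * 1024))
# The streaming path peaks near one batch; the collecting path near all 100.
```

The real test's input comes from PyArrow rather than Python bytes, which is exactly why a tracker that only sees Rust allocations misses the interesting part.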
Thanks! That great answer addressed my questions. I'm going to close this PR, as the current way does seem to be the only way.
One last question: would it be a good idea to measure that Lance's memory usage doesn't grow during writes? That way we could still measure the Rust side.
Let's close.
This PR provides an alternative implementation to memtest. In this version, we introduce `memtrace` as a new feature in Pylance. Users can enable it via the `memtrace` feature flag. It offers a similar API to memtest, but eliminates the need for users to hook or inject dynamic libraries, making it easier to use and test.

Before:
After:
Parts of this PR were drafted with assistance from Codex (with `gpt-5.2`) and fully reviewed and edited by me. I take full responsibility for all changes.