Refactor KV Cache Manager with a new torch exportable KVCache backend by geoffreyQiu · Pull Request #400 · NVIDIA/recsys-examples

geoffreyQiu · 2026-05-20T03:56:19Z

Description

Refactor KV Cache Manager with a new torch exportable KVCache backend

Introduce KVCacheBackend and split the default Python backend from the public KVCacheManager facade.
Implement KVCacheManagerContext: creation, registry, and reference.
Implement the exportable lookup_kvcache and allocate_kvcache.
Implement the exportable onboard_kvcache_launch and onboard_kvcache_wait.
Implement the exportable offload_kvcache_launch and offload_kvcache_reap_completed (replacing offload_try_wait).
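
For reviewers who want a mental model of the split listed above, here is a minimal pure-Python sketch. It is not the code in this PR. The backend method names, the integer-handle registry, the thread-pool "offload", and the `PythonKVCacheBackend` class are all assumptions made for illustration. Only the names `KVCacheBackend`, `KVCacheManager`, `KVCacheManagerContext`, `lookup_kvcache`, `allocate_kvcache`, `offload_kvcache_launch`, and `offload_kvcache_reap_completed` come from the description.

```python
import itertools
from abc import ABC, abstractmethod
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Optional


class KVCacheBackend(ABC):
    """Hypothetical backend interface; method names are assumptions."""

    @abstractmethod
    def lookup(self, user_ids: List[int]) -> List[int]: ...

    @abstractmethod
    def allocate(self, user_ids: List[int], lengths: List[int]) -> None: ...

    @abstractmethod
    def offload_launch(self, user_ids: List[int]) -> None: ...

    @abstractmethod
    def offload_reap_completed(self) -> List[int]: ...


class PythonKVCacheBackend(KVCacheBackend):
    """Toy default backend that only tracks cached lengths per user."""

    def __init__(self) -> None:
        self._device: Dict[int, int] = {}      # user id -> length cached on device
        self._host: Dict[int, int] = {}        # user id -> length offloaded to host
        self._pending: Dict[int, Future] = {}  # user id -> in-flight offload
        self._pool = ThreadPoolExecutor(max_workers=1)

    def lookup(self, user_ids: List[int]) -> List[int]:
        # A length of 0 means the user has no cached entries.
        return [self._device.get(u, 0) for u in user_ids]

    def allocate(self, user_ids: List[int], lengths: List[int]) -> None:
        for u, n in zip(user_ids, lengths):
            self._device[u] = self._device.get(u, 0) + n

    def offload_launch(self, user_ids: List[int]) -> None:
        for u in user_ids:
            # Snapshot the length now; the stand-in "copy" runs in the background.
            self._pending[u] = self._pool.submit(lambda n=self._device.get(u, 0): n)

    def offload_reap_completed(self) -> List[int]:
        # Non-blocking: collect only the offloads that have already finished.
        done = sorted(u for u, f in self._pending.items() if f.done())
        for u in done:
            self._host[u] = self._pending.pop(u).result()
        return done


class KVCacheManagerContext:
    """Hypothetical registry mapping integer handles to live backends."""

    _registry: Dict[int, KVCacheBackend] = {}
    _ids = itertools.count()

    @classmethod
    def create(cls, backend: KVCacheBackend) -> int:
        handle = next(cls._ids)
        cls._registry[handle] = backend
        return handle

    @classmethod
    def get(cls, handle: int) -> KVCacheBackend:
        return cls._registry[handle]


class KVCacheManager:
    """Public facade that delegates every call to the registered backend."""

    def __init__(self, backend: Optional[KVCacheBackend] = None) -> None:
        self.handle = KVCacheManagerContext.create(backend or PythonKVCacheBackend())

    @property
    def backend(self) -> KVCacheBackend:
        return KVCacheManagerContext.get(self.handle)

    def lookup_kvcache(self, user_ids: List[int]) -> List[int]:
        return self.backend.lookup(user_ids)

    def allocate_kvcache(self, user_ids: List[int], lengths: List[int]) -> None:
        self.backend.allocate(user_ids, lengths)

    def offload_kvcache_launch(self, user_ids: List[int]) -> None:
        self.backend.offload_launch(user_ids)

    def offload_kvcache_reap_completed(self) -> List[int]:
        return self.backend.offload_reap_completed()
```

In this sketch the facade keeps only an integer handle. The assumption is that an exported op can carry a plain integer more easily than a Python object reference. The reap call returns whatever has finished instead of blocking, which is one plausible reading of why it replaces offload_try_wait.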

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

geoffreyQiu added 3 commits May 19, 2026 09:31

Refactor KV cache backend split

f94bcc0

Introduce KVCacheBackend, split the default Python backend from the public KVCacheManager facade, and add the initial export lookup path.

Add export kvcache design outlines (temp)

c633495

Support AOTI-compatible recsys kvcache ops

3b56dea

geoffreyQiu wants to merge 3 commits into NVIDIA:main from geoffreyQiu:aoti_kvcache
