Refactor KV Cache Manager with a new torch exportable KVCache backend#400

Draft
geoffreyQiu wants to merge 3 commits into NVIDIA:main from geoffreyQiu:aoti_kvcache

geoffreyQiu (Collaborator) commented May 20, 2026

Description

Refactor KV Cache Manager with a new torch exportable KVCache backend

  • Introduce KVCacheBackend and split the default Python backend from the public KVCacheManager facade.
  • Implement KVCacheManagerContext (creation, registry, reference).
  • Implement the exportable lookup_kvcache and allocate_kvcache.
  • Implement the exportable onboard_kvcache_launch and onboard_kvcache_wait.
  • Implement the exportable offload_kvcache_launch and offload_kvcache_reap_completed (replacing offload_try_wait).
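
For readers unfamiliar with the facade/backend split, the shape of the change can be sketched roughly as follows. This is a toy illustration, not the code in this PR: the method names come from the bullets above, but every signature, the integer "ticket" returned by the launch call, and the `PythonKVCacheBackend` class name are assumptions made for the example. No torch export or real memory transfer is involved.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class KVCacheBackend(ABC):
    """Hypothetical backend interface; signatures are assumed, not taken from the PR."""

    @abstractmethod
    def lookup_kvcache(self, user_ids: List[int]) -> List[int]:
        """Return the cached length per user (0 when nothing is cached)."""

    @abstractmethod
    def allocate_kvcache(self, user_ids: List[int], lengths: List[int]) -> None:
        """Reserve room for `lengths[i]` more tokens for `user_ids[i]`."""

    @abstractmethod
    def offload_kvcache_launch(self, user_ids: List[int]) -> int:
        """Start an offload and return a ticket identifying it."""

    @abstractmethod
    def offload_kvcache_reap_completed(self) -> List[int]:
        """Return tickets of finished offloads without blocking."""


class PythonKVCacheBackend(KVCacheBackend):
    """Toy default backend that only tracks bookkeeping in Python dicts."""

    def __init__(self) -> None:
        self._cached_len: Dict[int, int] = {}
        self._in_flight: List[int] = []
        self._next_ticket = 0

    def lookup_kvcache(self, user_ids: List[int]) -> List[int]:
        return [self._cached_len.get(u, 0) for u in user_ids]

    def allocate_kvcache(self, user_ids: List[int], lengths: List[int]) -> None:
        for user, extra in zip(user_ids, lengths):
            self._cached_len[user] = self._cached_len.get(user, 0) + extra

    def offload_kvcache_launch(self, user_ids: List[int]) -> int:
        ticket = self._next_ticket
        self._next_ticket += 1
        self._in_flight.append(ticket)
        return ticket

    def offload_kvcache_reap_completed(self) -> List[int]:
        # In this toy every launched offload counts as finished by the next reap.
        done, self._in_flight = self._in_flight, []
        return done


class KVCacheManager:
    """Public facade: callers see one stable API, the backend is swappable."""

    def __init__(self, backend: Optional[KVCacheBackend] = None) -> None:
        self._backend = backend if backend is not None else PythonKVCacheBackend()

    def lookup_kvcache(self, user_ids: List[int]) -> List[int]:
        return self._backend.lookup_kvcache(user_ids)

    def allocate_kvcache(self, user_ids: List[int], lengths: List[int]) -> None:
        self._backend.allocate_kvcache(user_ids, lengths)

    def offload_kvcache_launch(self, user_ids: List[int]) -> int:
        return self._backend.offload_kvcache_launch(user_ids)

    def offload_kvcache_reap_completed(self) -> List[int]:
        return self._backend.offload_kvcache_reap_completed()
```

In this sketch an exportable backend would be passed as `KVCacheManager(backend=...)` without changing any caller. The launch/reap pair is modeled as non-blocking: the caller launches, continues with other work, and later collects whatever has finished, which is one plausible reading of why a "reap completed" call would replace a "try wait" call.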

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Introduce KVCacheBackend, split the default Python backend from the public KVCacheManager facade, and add the initial export lookup path.