[draft](eplb): add per-layer expert-load statistics monitor for EP path #1210

Draft: JiaoliangYu wants to merge 1 commit into ROCm:main from JiaoliangYu:feat/eplb-expert-load-pass

@JiaoliangYu

Contributor

Collect per-layer, per-expert token counts from the MORI EP dispatch output (dispatch_recv_token_num) into a windowed ExpertLoadMonitor. The monitor logs avg/max/balancedness and can emit a one-shot offline rebalance plan (hot/cold experts) via the offline_eplb_rebalance utility command.

Scope is statistics only: counts are kept per rank, with no cross-rank all-reduce and no actual expert weight remap/transfer yet. Everything is gated behind ATOM_ENABLE_EPLB_LOAD_STATS (default off).
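
For reviewers who want a concrete picture of the statistics described above, here is a minimal sketch of a windowed per-layer load tracker. It is not the code in this PR: the class `WindowedExpertLoad`, its methods, and the `load_stats_enabled` helper are hypothetical names, the way the environment flag is parsed is an assumption, and balancedness is assumed to mean average load divided by maximum load (1.0 is perfectly balanced). How per-expert counts are derived from dispatch_recv_token_num is not shown; the sketch takes plain lists of counts.

```python
import os
from collections import deque


def load_stats_enabled() -> bool:
    # Assumption: the gate is read as a "1"/"true" style environment variable.
    return os.environ.get("ATOM_ENABLE_EPLB_LOAD_STATS", "0").lower() in ("1", "true")


class WindowedExpertLoad:
    """Hypothetical stand-in for the monitor: one sliding window of
    per-expert token counts per layer, local to a single rank."""

    def __init__(self, num_layers, num_experts, window=4):
        self.num_experts = num_experts
        # deque(maxlen=...) silently drops the oldest step once the window is full.
        self.windows = [deque(maxlen=window) for _ in range(num_layers)]

    def record(self, layer, counts):
        if len(counts) != self.num_experts:
            raise ValueError("expected one count per expert")
        self.windows[layer].append(list(counts))

    def totals(self, layer):
        # Sum each expert's token count over the steps still in the window.
        totals = [0] * self.num_experts
        for step in self.windows[layer]:
            for expert, count in enumerate(step):
                totals[expert] += count
        return totals

    def summary(self, layer):
        totals = self.totals(layer)
        avg = sum(totals) / self.num_experts
        peak = max(totals)
        # Assumed definition: avg / max, so 1.0 means perfectly even load.
        balancedness = avg / peak if peak > 0 else 1.0
        return {"avg": avg, "max": peak, "balancedness": balancedness}

    def hot_cold(self, layer, k=1):
        # Rank experts by windowed load; the k busiest are "hot", the k idlest "cold".
        totals = self.totals(layer)
        order = sorted(range(self.num_experts), key=lambda e: totals[e])
        return {"hot": order[::-1][:k], "cold": order[:k]}
```

In a forward pass one would call `record(layer, counts)` only when `load_stats_enabled()` is true, then log `summary(layer)` periodically. A one-shot offline plan could start from `hot_cold(layer)`, though the actual plan format emitted by offline_eplb_rebalance is not specified here.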

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Collect per-layer, per-expert token counts from the MORI EP dispatch
output (dispatch_recv_token_num) into a windowed ExpertLoadMonitor.
Logs avg/max/balancedness and can emit a one-shot offline rebalance
plan (hot/cold experts) via the offline_eplb_rebalance utility command.

Scope is statistics only: per-rank, no cross-rank all-reduce, and no
actual expert weight remap/transfer yet. All gated behind
ATOM_ENABLE_EPLB_LOAD_STATS (default off).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>