enh(hotspot_analyzer): add --kernel filter for CSV metadata matching #657
Draft
Arist12 wants to merge 1 commit into
The existing CSV row-selection heuristic matches by comparing the dispatch
directory basename against Kernel_Name in the kernel trace CSV. This works
for rocprofv3's timestamped output (e.g. 20240101_120000_pa_decode_kernel),
but fails completely for the ui_output_agent_<N>_dispatch_<id> layout
produced by rocprofv3's ATT decode step — the basename carries no kernel
name, only agent and dispatch numbers.
When metadata lookup fails the analyzer falls back to ISA-estimated register
counts and prints a warning, silently under-reporting VGPR, SGPR, LDS, and
occupancy for every ui_output_agent_* trace.
Fix by adding a --kernel SUBSTR option that enables an explicit row-selection
path:
1. Substring-matches Kernel_Name against the supplied filter.
2. If the CSV has a Dispatch_Id column and the directory name encodes
dispatch_<id>, also requires the row's Dispatch_Id to match — avoiding
false matches when a PyTorch reference kernel shares the same name prefix.
3. Falls back gracefully to kernel-name-only matching when Dispatch_Id is
absent from the CSV.
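The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `dispatch_id_from_dir` and `select_row`, the `VGPR_Count` column, and the `_reference` kernel name are assumptions made for the example. Only `Kernel_Name`, `Dispatch_Id`, and the `dispatch_<id>` directory pattern come from the description.

```python
import csv
import io
import re


def dispatch_id_from_dir(dir_name):
    """Return the id encoded as dispatch_<id> in dir_name, or None if absent."""
    match = re.search(r"dispatch_(\d+)", dir_name)
    return match.group(1) if match else None


def select_row(rows, dir_name, kernel_filter):
    """Hypothetical explicit row selection for a --kernel filter."""
    wanted_id = dispatch_id_from_dir(dir_name)
    for row in rows:
        # Step 1: substring match on Kernel_Name.
        if kernel_filter not in (row.get("Kernel_Name") or ""):
            continue
        # Step 2: enforce the dispatch id only when both sides provide one.
        if wanted_id is not None and "Dispatch_Id" in row:
            if (row["Dispatch_Id"] or "").strip() != wanted_id:
                continue
        # Step 3: with no Dispatch_Id column, the name match alone decides.
        return row
    return {}


# Made-up CSV content; real kernel trace files carry more columns.
CSV_TEXT = (
    "Kernel_Name,Dispatch_Id,VGPR_Count\n"
    "pa_mqa_logits_fp4_kernel_0_reference,101,32\n"
    "pa_mqa_logits_fp4_kernel_0,223,128\n"
)
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
matched = select_row(
    rows, "ui_output_agent_15249_dispatch_223", "pa_mqa_logits_fp4_kernel_0"
)

# Same names without a Dispatch_Id column: the first name match wins.
name_only_rows = [
    {"Kernel_Name": "pa_mqa_logits_fp4_kernel_0_reference"},
    {"Kernel_Name": "pa_mqa_logits_fp4_kernel_0"},
]
name_only = select_row(
    name_only_rows, "ui_output_agent_15249_dispatch_223", "pa_mqa_logits_fp4_kernel_0"
)
```

In this sketch the first CSV row also contains the filter as a substring, and it is the dispatch id check that rules it out. Without the column, the name-only fallback returns the first matching row, so a more specific filter may be needed.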
The legacy heuristic is unchanged and still used when --kernel is not given,
so existing timestamped-dir workflows are unaffected.
Update the "not matched" warning to tell users about --kernel so the fix is
discoverable without reading source.
Example:
python hotspot_analyzer.py ui_output_agent_15249_dispatch_223 \
--topk 8 --mode src --detail \
--kernel pa_mqa_logits_fp4_kernel_0
Problem
hotspot_analyzer.py reads authoritative VGPR/SGPR/LDS/occupancy data from the
*_kernel_trace.csv file written by rocprofv3 --kernel-trace. To select the
correct row it tries to match Kernel_Name against the dispatch directory
basename.
This heuristic works for timestamped output directories
(20240101_120000_pa_decode_kernel) but fails completely for the
ui_output_agent_<N>_dispatch_<id> layout produced by rocprofv3's ATT decode
step. In that layout the directory basename carries only an agent number and
a dispatch counter, with no kernel name, so every kernel name comparison
returns false and the metadata lookup silently returns {}.
The result is that the "Register Pressure & Occupancy" section uses ISA
estimates instead of the real CSV values for all ui_output_agent_* traces,
and the warning message gives no hint about how to fix it.
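To make the failure concrete, here is a small sketch of a bidirectional substring check between the directory basename and Kernel_Name. The function name `legacy_match` and the `out/` paths are hypothetical, and the analyzer's real heuristic may normalize names differently. The sketch only shows why a basename without a kernel name can never match.

```python
import os


def legacy_match(dispatch_dir, kernel_name):
    """Assumed form of the legacy check: substring match in either direction."""
    base = os.path.basename(os.path.normpath(dispatch_dir))
    return kernel_name in base or base in kernel_name


# Timestamped layout: the basename embeds the kernel name, so the check passes.
timestamped_ok = legacy_match(
    "out/20240101_120000_pa_decode_kernel", "pa_decode_kernel"
)

# ATT decode layout: only agent and dispatch numbers, so no kernel name can match.
att_ok = legacy_match(
    "out/ui_output_agent_15249_dispatch_223", "pa_mqa_logits_fp4_kernel_0"
)
```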
Solution
Add --kernel SUBSTR (optional, default ""):
1. When given, rows are selected by substring-matching the filter against
Kernel_Name instead of the dir-name heuristic.
2. If *_kernel_trace.csv has a Dispatch_Id column and the directory name
encodes dispatch_<id>, the row must also match on dispatch id. This prevents
false matches when a PyTorch reference kernel shares the same name prefix as
the target kernel and runs in the same profiling session.
3. Falls back to kernel-name-only matching when the CSV has no Dispatch_Id
column.
The legacy heuristic (dir basename vs Kernel_Name bidirectional substring) is
unchanged and still used when --kernel is not given, so existing
timestamped-dir workflows are unaffected.
The "not matched" warning now mentions --kernel so users can discover the
fix without reading source.
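As a rough illustration of how an optional flag with an empty default can switch between the two paths, the sketch below uses the standard argparse module. The names `build_parser` and `selection_path`, the positional argument name, and the help text are assumptions for the example. The PR's actual parser also takes other options (such as --topk, --mode, and --detail) that are omitted here.

```python
import argparse


def build_parser():
    """Hypothetical parser showing only the positional dir and --kernel."""
    parser = argparse.ArgumentParser(prog="hotspot_analyzer.py")
    parser.add_argument("dispatch_dir", help="dispatch output directory")
    parser.add_argument(
        "--kernel",
        metavar="SUBSTR",
        default="",
        help="select the CSV row whose Kernel_Name contains SUBSTR",
    )
    return parser


def selection_path(args):
    """An empty filter keeps the legacy heuristic; a non-empty one opts in."""
    return "explicit" if args.kernel else "legacy"


with_filter = build_parser().parse_args(
    ["ui_output_agent_15249_dispatch_223", "--kernel", "pa_mqa_logits_fp4_kernel_0"]
)
without_filter = build_parser().parse_args(["20240101_120000_pa_decode_kernel"])
```

Using an empty string as the default means existing invocations parse exactly as before, which fits the stated goal of leaving timestamped-dir workflows untouched.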
Before / after
Testing
Five unit tests covering:
- ui_output_agent_* dir without --kernel returns {} (expected).
- --kernel + Dispatch_Id column selects the correct CSV row.
- --kernel without Dispatch_Id column falls back to name-only match.
- argparse wires --kernel through to read_kernel_metadata.
All five pass.