[ATOM SGL] update fp8 prefill argument passing by ZhiweiYan-96 · Pull Request #1211 · ROCm/ATOM

ZhiweiYan-96 · 2026-06-15T03:33:42Z

Motivation

FP8 prefill fails with the new aiter due to an API mismatch (exposed in https://github.com/ROCm/ATOM/actions/runs/27469822009/job/81199035902#logs). This PR fixes the issue by using the new API.

Test Result

export AITER_QUICK_REDUCE_QUANTIZATION=INT4
export SGLANG_USE_AITER=1
export ATOM_ENABLE_DS_QKNORM_QUANT_FUSION=1
export SGLANG_EXTERNAL_MODEL_PACKAGE=atom.plugin.sglang.models
export SGLANG_ENABLE_TORCH_COMPILE=1
export SGLANG_AITER_FP8_PREFILL_ATTN=1
python3 -m sglang.launch_server \
  --model-path /workspace/shared/data/amd_int/models/deepseek-ai/DeepSeek-R1-0528-MXFP4-v2 \
  --host localhost --port 8000 \
  --trust-remote-code \
  --tensor-parallel-size 4 \
  --attention-backend aiter \
  --kv-cache-dtype fp8_e4m3 \
  --mem-fraction-static 0.85 \
  --page-size 1 \
  --disable-radix-cache

gsm8k results

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Copilot

Pull request overview

Updates ATOM’s SGLang full-attention FP8 prefill path to pass an additional num_kv_splits argument into the MLA reduction kernel, deriving it from the reduce indptr metadata.

Changes:

Add _max_reduce_group_size(reduce_indptr) helper to compute the maximum per-output reduce group size.
Compute num_kv_splits from ForwardMetadata.reduce_indptr during FP8 prefill.
Pass num_kv_splits into mla_reduce_v1(...) to match the updated FP8 prefill reduction calling convention.
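
The diff itself is not shown in this conversation, so the following is only a minimal sketch of what a helper like _max_reduce_group_size might compute. It assumes reduce_indptr is a CSR-style offset array in which consecutive entries delimit the partial results reduced into one output. The name max_reduce_group_size, the plain-list input, and the empty-input behaviour are illustrative assumptions; the real helper presumably operates on a tensor from ForwardMetadata.

```python
from typing import Sequence


def max_reduce_group_size(reduce_indptr: Sequence[int]) -> int:
    """Return the largest gap between consecutive indptr entries.

    Assumption (not confirmed by the PR text): entries i and i + 1 of
    reduce_indptr delimit the partial results reduced into output i.
    """
    if len(reduce_indptr) < 2:
        # No complete group is described, so report an empty group size.
        return 0
    return max(
        end - start for start, end in zip(reduce_indptr, reduce_indptr[1:])
    )


# Hypothetical metadata: three outputs with 2, 1 and 4 partial results.
reduce_indptr = [0, 2, 3, 7]
num_kv_splits = max_reduce_group_size(reduce_indptr)
print(num_kv_splits)  # 4
```

Under this reading, num_kv_splits is an upper bound on how many partial results any single output has to combine, which is the value the overview says is passed on to mla_reduce_v1(...).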

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

[ATOM SGL] update fp8 prefill argument passing

8aa1242

ZhiweiYan-96 marked this pull request as ready for review June 15, 2026 03:33

Copilot AI review requested due to automatic review settings June 15, 2026 03:33

Copilot started reviewing on behalf of ZhiweiYan-96 June 15, 2026 03:34

Copilot AI reviewed Jun 15, 2026

ZhiweiYan-96 added 2 commits June 15, 2026 09:00

use simpler setting

b9914e1

precheckin

555581a

Copilot AI review requested due to automatic review settings June 15, 2026 13:45

Copilot started reviewing on behalf of ZhiweiYan-96 June 15, 2026 13:47

Copilot AI reviewed Jun 15, 2026

valarLip approved these changes Jun 15, 2026

zhuyuhua-v merged commit 50187a0 into ROCm:main Jun 16, 2026
21 of 29 checks passed

[ATOM SGL] update fp8 prefill argument passing #1211
zhuyuhua-v merged 3 commits into ROCm:main from ZhiweiYan-96:zhiwei/fp8_prefill_aiter_update

4 participants