Contributor

@LuFinch LuFinch commented Nov 27, 2025

I missed an else when launching the kernel, so the kernel was launched twice on PVC.

Copilot AI review requested due to automatic review settings November 27, 2025 06:29
Contributor

Copilot AI left a comment

Pull request overview

Fixes a performance regression on PVC by correcting a missing else statement that caused a kernel to be launched twice. The PR also adds missing CUTLASS_DEVICE annotations to device functions.

  • Fixed conditional branching for kernel launch based on subgroup size
  • Added CUTLASS_DEVICE annotations to device-only functions
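For readers unfamiliar with this class of bug, here is a minimal host-side sketch of how a missing else can cause a duplicate launch. It is not the code from mha_fwd.cpp: `LaunchCounter`, `launch_kernel`, `dispatch_buggy`, `dispatch_fixed`, and `count_launches` are hypothetical names, the subgroup sizes 16 and 32 are illustrative assumptions, and the "kernel" only increments a counter.

```cpp
// Hypothetical stand-in for a device queue; it only counts launches.
struct LaunchCounter {
    int launches = 0;
};

// Hypothetical launcher templated on subgroup size (assumption: not the real API).
template <int SubgroupSize>
void launch_kernel(LaunchCounter& counter) {
    ++counter.launches;  // a real implementation would submit a kernel here
}

// Buggy shape: without `else`, the second launch runs unconditionally.
void dispatch_buggy(int subgroup_size, LaunchCounter& counter) {
    if (subgroup_size == 16) {
        launch_kernel<16>(counter);
    }
    launch_kernel<32>(counter);  // also runs when subgroup_size == 16
}

// Fixed shape: exactly one branch launches.
void dispatch_fixed(int subgroup_size, LaunchCounter& counter) {
    if (subgroup_size == 16) {
        launch_kernel<16>(counter);
    } else {
        launch_kernel<32>(counter);
    }
}

// Helper that reports how many launches a dispatch function performs.
int count_launches(void (*dispatch)(int, LaunchCounter&), int subgroup_size) {
    LaunchCounter counter;
    dispatch(subgroup_size, counter);
    return counter.launches;
}
```

In this sketch, `count_launches(dispatch_buggy, 16)` reports two launches while `count_launches(dispatch_fixed, 16)` reports one, which mirrors the kind of duplicate work the PR describes. Because both launches compute the same result, a bug of this shape tends to show up as a slowdown rather than as wrong output.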

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/ATen/native/transformers/xpu/flash_attn/sycltla/mha_fwd.cpp | Fixed missing else statement that was causing duplicate kernel launches |
| src/ATen/native/transformers/xpu/flash_attn/sycltla/kernel/xe_sdpa_fwd_bshd.h | Added CUTLASS_DEVICE annotations to device functions |

@LuFinch LuFinch changed the title from "[SYCLTLA] Fix performance on PVC" to "[SYCLTLA] Fix FlashAttention FWD performance on PVC" Nov 27, 2025
@LuFinch LuFinch requested a review from EikanWang November 27, 2025 06:29
Contributor Author

LuFinch commented Nov 28, 2025

Can we merge this PR?

@EikanWang
Contributor

Sure.
