Fix full-attention layer id mapping for Hybrid models (e.g. Qwen3-Next/Qwen3.5) by wanzhenchn · Pull Request #249 · zejunchen-zejun/sglang

wanzhenchn · 2026-04-14T02:39:57Z

Motivation

In hybrid models (e.g. Qwen3-Next/Qwen3.5), the KV pool exists only for full-attention layers, not for layer 0. To get the v_head_dim, pick any mapped full-attention layer id instead of assuming layer 0.
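
The idea can be illustrated with a minimal, self-contained sketch. It is not the code changed in this PR: `ToyHybridKVPool`, `get_value_buffer_shape`, `infer_v_head_dim`, and the buffer-shape layout are hypothetical names and assumptions chosen for illustration, and the mapping attribute name is assumed as well. The sketch also assumes that all full-attention layers share the same v_head_dim.

```python
from dataclasses import dataclass


@dataclass
class ToyHybridKVPool:
    """Hypothetical stand-in for a hybrid KV pool (not sglang's actual class)."""

    # Global layer id -> index into the per-layer buffers.
    # Only full-attention layers appear here (assumed attribute name).
    full_attention_layer_id_mapping: dict[int, int]
    # One value-buffer shape per full-attention layer:
    # (num_tokens, num_kv_heads, v_head_dim). Assumed layout.
    v_buffer_shapes: list[tuple[int, int, int]]

    def get_value_buffer_shape(self, layer_id: int) -> tuple[int, int, int]:
        # Layers without full attention have no KV buffers at all.
        if layer_id not in self.full_attention_layer_id_mapping:
            raise ValueError(f"layer {layer_id} has no KV buffers in this pool")
        return self.v_buffer_shapes[self.full_attention_layer_id_mapping[layer_id]]


def infer_v_head_dim(pool: ToyHybridKVPool) -> int:
    # Do not hard-code layer 0: in a hybrid model it may not be a
    # full-attention layer. Any mapped layer id works under the assumption
    # that v_head_dim is identical across full-attention layers.
    any_full_attn_layer_id = next(iter(pool.full_attention_layer_id_mapping))
    return pool.get_value_buffer_shape(any_full_attn_layer_id)[-1]


# Toy configuration: only global layers 3 and 7 use full attention.
pool = ToyHybridKVPool(
    full_attention_layer_id_mapping={3: 0, 7: 1},
    v_buffer_shapes=[(16, 2, 128), (16, 2, 128)],
)
v_head_dim = infer_v_head_dim(pool)
```

In this toy setup, querying layer 0 directly raises an error because it has no entry in the mapping, while the lookup through a mapped layer id succeeds.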

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.

wanzhenchn requested a review from qichu-yun April 14, 2026 02:40

[Fix] fix full-attention layer id mapping for Hybrid models (e.g. Qwen3-Next/Qwen3.5)

10d1179

wanzhenchn force-pushed the fix-attention_layer_id_mapping branch from 8dc5a2e to 10d1179 Compare April 14, 2026 02:51

wanzhenchn requested a review from yixionghuo April 14, 2026 02:53

wanzhenchn wants to merge 1 commit into Qwen3.5_v0.5.9 from fix-attention_layer_id_mapping
