Fix full-attention layer id mapping for Hybrid models (e.g. Qwen3-Next/Qwen3.5)#249

Open
wanzhenchn wants to merge 1 commit into Qwen3.5_v0.5.9 from fix-attention_layer_id_mapping
Conversation

@wanzhenchn

Collaborator

Motivation

In hybrid models (e.g. Qwen3-Next/Qwen3.5), the KV pool exists only for the full-attention layers, so layer 0 has no KV pool entry. To get the v_head_dim, pick any mapped full-attention layer id instead of layer 0.
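
The lookup described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the class `HybridKVPool`, its `layer_id_mapping` dict, `get_value_buffer`, and the helper `resolve_v_head_dim` are all hypothetical names, and the layer ids are made up.

```python
from dataclasses import dataclass


@dataclass
class ValueBuffer:
    # Stand-in for a per-layer V cache tensor; only the head dim matters here.
    v_head_dim: int


class HybridKVPool:
    """Hypothetical pool that allocates KV buffers only for full-attention layers."""

    def __init__(self, full_attention_layer_ids, v_head_dim):
        # Map global layer id -> index into the compact buffer list.
        self.layer_id_mapping = {
            layer_id: idx for idx, layer_id in enumerate(full_attention_layer_ids)
        }
        self.v_buffers = [ValueBuffer(v_head_dim) for _ in full_attention_layer_ids]

    def get_value_buffer(self, layer_id):
        # Raises KeyError for layers without a KV pool (e.g. layer 0 in this sketch).
        return self.v_buffers[self.layer_id_mapping[layer_id]]


def resolve_v_head_dim(pool):
    """Read v_head_dim from any mapped full-attention layer instead of layer 0."""
    if not pool.layer_id_mapping:
        raise ValueError("KV pool has no full-attention layers")
    any_layer_id = next(iter(pool.layer_id_mapping))
    return pool.get_value_buffer(any_layer_id).v_head_dim


# Assumed layout for illustration: full attention at layers 3, 7 and 11 only.
pool = HybridKVPool(full_attention_layer_ids=[3, 7, 11], v_head_dim=128)
head_dim = resolve_v_head_dim(pool)
```

Because every full-attention layer in this sketch shares the same `v_head_dim`, it does not matter which mapped id is chosen; the only requirement is that the id is present in the mapping.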

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@wanzhenchn wanzhenchn requested a review from qichu-yun April 14, 2026 02:40
@wanzhenchn wanzhenchn force-pushed the fix-attention_layer_id_mapping branch from 8dc5a2e to 10d1179 Compare April 14, 2026 02:51
@wanzhenchn wanzhenchn requested a review from yixionghuo April 14, 2026 02:53