Summary
The longlive pipeline crashes with a PEFT LoRA matrix-multiplication shape mismatch (768×1536 vs. 5120×32) during chunk inference. The failure occurs in the self-attention QKV projection when a LoRA adapter is active, indicating that the adapter was built for a different projection input dimension than the current model uses.
cc @mjh1 @emranemran
Error Messages
Error in block: (denoise, DenoiseBlock)
Error details: mat1 and mat2 shapes cannot be multiplied (768x1536 and 5120x32)
scope.server.pipeline_processor - ERROR - [067a55be] Error processing chunk for longlive: mat1 and mat2 shapes cannot be multiplied (768x1536 and 5120x32)
Stack Trace
File "/app/src/scope/server/pipeline_processor.py", line 475, in process_chunk
output_dict = self.pipeline(**call_params)
File "/app/src/scope/core/pipelines/longlive/pipeline.py", line 209, in __call__
return self._generate(**kwargs)
File "/app/src/scope/core/pipelines/longlive/pipeline.py", line 250, in _generate
_, self.state = self.blocks(self.components, self.state)
File "/app/.venv/lib/python3.12/site-packages/diffusers/modular_pipelines/modular_pipeline.py", line 932, in __call__
pipeline, state = block(pipeline, state)
File "/app/src/scope/core/pipelines/wan2_1/blocks/denoise.py", line 185, in __call__
_, denoised_pred = components.generator(...)
File "/app/src/scope/core/pipelines/wan2_1/components/generator.py", line 207, in _call_model
return self.model(*args, **accepted)
File "/app/src/scope/core/pipelines/longlive/modules/causal_model.py", line 1425, in forward
File "/app/src/scope/core/pipelines/longlive/modules/causal_model.py", line 1206, in _forward_inference
File "/app/src/scope/core/pipelines/longlive/modules/causal_model.py", line 508, in forward
self_attn_result = self.self_attn(...)
File "/app/src/scope/core/pipelines/longlive/modules/causal_model.py", line 132, in forward
File "/app/src/scope/core/pipelines/longlive/modules/causal_model.py", line 127, in qkv_fn
File "/app/.venv/lib/python3.12/site-packages/peft/tuners/lora/layer.py", line 807, in forward
File "/app/.venv/lib/python3.12/site-packages/torch/nn/modules/linear.py", line 134, in forward
RuntimeError: mat1 and mat2 shapes cannot be multiplied (768x1536 and 5120x32)
Root Cause Analysis
The error originates in peft/tuners/lora/layer.py: the PEFT LoRA down-projection (lora_A) linear layer receives an input of shape 768×1536, but its weight, as it appears transposed in the multiplication, is 5120×32 (rank 32, input dimension 5120). This means:
- The model's QKV projection input dimension is 1536 (e.g. heads × head_dim from the LongLive/Wan2.1-1.3B architecture).
- The LoRA was trained/configured expecting an input dimension of 5120, which matches the hidden size of the Wan2.1 14B architecture rather than a 5B model.
The rank-32 LoRA adapter was therefore most likely trained for the larger 14B variant but is being loaded into the 1.3B model. Although the LoRA file loads successfully, the adapter dimensions are incompatible at runtime.
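The shape arithmetic behind the error can be sketched in plain Python. This is only an illustration: linear_matmul_error is a hypothetical helper, not part of PEFT or torch. It mirrors the fact that a linear layer stores its weight as (out_features, in_features) and multiplies the input by the transposed weight, which is why the message reports 5120x32 for a rank-32 down-projection.

```python
def linear_matmul_error(input_shape, weight_shape):
    """Return None if x @ W.T is shape-valid, else a torch-style error message.

    input_shape  : (rows, features) of the activation x
    weight_shape : (out_features, in_features), the layout a linear layer stores
    """
    rows, features = input_shape
    out_features, in_features = weight_shape
    if features == in_features:
        return None
    # The layer computes x @ W.T, so the message shows the transposed weight shape.
    return (
        "mat1 and mat2 shapes cannot be multiplied "
        f"({rows}x{features} and {in_features}x{out_features})"
    )


# Shapes taken from the error message in this report.
activation = (768, 1536)      # 768 rows, 1536 features per row
lora_a_weight = (32, 5120)    # rank 32, expects 5120 input features

print(linear_matmul_error(activation, lora_a_weight))
# -> mat1 and mat2 shapes cannot be multiplied (768x1536 and 5120x32)
print(linear_matmul_error(activation, (32, 1536)))
# -> None
```

A rank-32 adapter whose down-projection expects 1536 input features would pass the same check, so the rank itself is not the problem; the input dimension is.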
Session Context
Session 067a55be had loaded params:
{
"loras": [{"path": "/tmp/.daydream-scope/assets/lora/SUPERSUISH_LoRA_V1_000000750.safetensors", "scale": 2, "merge_mode": "permanent_merge"}],
"lora_merge_mode": "permanent_merge"
}
The LoRA loaded successfully (log: "load_adapter: Loaded adapter 'SUPERSUISH_LoRA_V1_000000750' in 0.407s") but then failed on the first inference call.
Frequency (last 12h, 2026-04-12 06:09 – 18:09 UTC)
- ~156 occurrences in session 067a55be
- Time window: 14:41–14:55 UTC
- App: github_f1lhgmk5v76a0ev1w0u378by-scope-app--prod
Impact
The pipeline produces no output for the duration of the session while continuing to consume GPU resources.
Suggested Fix
- Dimension validation at LoRA load time: Check that the LoRA's down-projection input dimension matches the model's hidden dim. If mismatched, reject with a user-friendly error instead of loading and failing at inference.
- Architecture detection: The LoRA loader (peft_lora.py) should detect which model variant the LoRA was trained for (1.3B vs. a larger variant) and refuse incompatible adapters.
- User-facing message: Surface something like "LoRA 'SUPERSUISH_LoRA_V1_000000750' is incompatible with the selected model size" rather than a cryptic runtime crash.
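A load-time check along these lines could look like the sketch below. It uses only the standard library and relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header), so tensor shapes can be inspected without loading any weights. Everything else is an assumption for illustration: the names IncompatibleLoRAError, read_safetensors_shapes, and validate_lora_dims are hypothetical, the allowed_in_dims parameter stands in for whatever the real loader derives from the model, and the "lora_A" / "lora_down" key matching is a guess at common naming conventions, not the actual behavior of peft_lora.py.

```python
import json
import struct


class IncompatibleLoRAError(ValueError):
    """Raised when a LoRA's input dimensions do not match the target model."""


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The file starts with an unsigned little-endian 64-bit header length,
        # followed by that many bytes of JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


def validate_lora_dims(path, allowed_in_dims, adapter_name):
    """Reject the adapter if any down-projection expects an unsupported input dim.

    allowed_in_dims is a hypothetical parameter: the input sizes of the model's
    LoRA-targeted linear layers, which a real loader would read from the model.
    """
    for name, shape in read_safetensors_shapes(path).items():
        # Assumed key naming: "lora_A" (PEFT style) or "lora_down" (kohya style).
        is_down = "lora_A" in name or "lora_down" in name
        # Down-projection weights are stored as (rank, in_features).
        if is_down and len(shape) == 2 and shape[1] not in allowed_in_dims:
            raise IncompatibleLoRAError(
                f"LoRA '{adapter_name}' is incompatible with the selected model size: "
                f"{name} expects input dim {shape[1]}, "
                f"model provides {sorted(allowed_in_dims)}"
            )
```

Called before the adapter is attached, for example as validate_lora_dims(lora_path, {1536}, "SUPERSUISH_LoRA_V1_000000750"), this would fail once at load time with a readable message instead of raising on every chunk during inference.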