Eval bug: Qwen3-VL MoE tool calls during reasoning are often not parsed properly

### Name and Version

Commit 86a3f0fad8b153ac9396e1ac18e790e4179c53f2 with https://github.com/ggml-org/llama.cpp/pull/17750 applied on top

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

NVIDIA L40S

### Models

[unsloth/Qwen3-VL-30B-A3B-Thinking-1M-GGUF:IQ3_XXS](https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Thinking-1M-GGUF/blob/main/Qwen3-VL-30B-A3B-Thinking-1M-UD-IQ3_XXS.gguf)

### Problem description & steps to reproduce

Here is an example payload that can trigger the bug. The failure is intermittent, so you may need to resend the request several times before it occurs.
[request_body(14).json](https://github.com/user-attachments/files/24099763/request_body.14.json)
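
Since the failure is intermittent, resending the payload in a loop and counting outcomes may make it easier to reproduce. The following is a minimal sketch, not part of the original report. It assumes a server exposing the OpenAI-compatible `/v1/chat/completions` endpoint at `http://localhost:8080`, the attachment saved locally as `request_body.14.json`, and that an unparsed call shows up as a literal `<tool_call>` marker left in `content` or `reasoning_content`. The address, file name, attempt count, and marker are all assumptions to adjust for your setup.

```python
import json
import urllib.request

# Assumptions (not from the report): server address, local file name,
# attempt count, and the marker used to spot an unparsed tool call.
SERVER_URL = "http://localhost:8080/v1/chat/completions"
PAYLOAD_PATH = "request_body.14.json"
ATTEMPTS = 20
RAW_MARKER = "<tool_call>"


def classify(response: dict) -> str:
    """Return 'parsed', 'leaked', or 'none' for one chat completion response."""
    message = response["choices"][0]["message"]
    if message.get("tool_calls"):
        return "parsed"
    # No structured tool call: check whether raw tool-call text was left behind.
    text = (message.get("content") or "") + (message.get("reasoning_content") or "")
    return "leaked" if RAW_MARKER in text else "none"


def send(payload: dict) -> dict:
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def main() -> None:
    with open(PAYLOAD_PATH, encoding="utf-8") as handle:
        payload = json.load(handle)
    payload["stream"] = False  # classify() expects a single non-streamed body
    counts = {"parsed": 0, "leaked": 0, "none": 0}
    for _ in range(ATTEMPTS):
        counts[classify(send(payload))] += 1
    print(counts)

# Call main() once the server is running and the attachment is saved locally.
```

A non-zero `leaked` count would suggest the bug was hit on that attempt. The check is deliberately crude: a response that contains both a parsed call and leftover raw text is counted as `parsed`.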


### First Bad Commit

_No response_

### Relevant log output

```shell
N/A
```

Eval bug: Qwen3-VL MoE tool calls during reasoning are often not parsed properly #17932

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Eval bug: Qwen3-VL MoE tool calls during reasoning are often not parsed properly #17932

Description

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions