What should the tool-call-parser be set to when running the model based on vllm?

#6
by wolfzr - opened

I tested it and found that it does not match the common tool-call parsing formats

Multilingual-Multimodal-NLP org

Hi, you can try setting --tool-call-parser qwen3_xml. This should work for the tool call parsing format used by this model.

Example:

vllm serve /path/to/your/model \
    --port 8080 \
    --tensor-parallel-size 1 \
    --data-parallel-size 8 \
    --served-model-name InCoder-32B \
    --disable-log-requests \
    --max-model-len 131072 \
    --gpu-memory-utilization 0.9 \
    --trust-remote-code \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_xml
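
Once the server is up, clients talk to it through vLLM's OpenAI-compatible /v1/chat/completions endpoint; the qwen3_xml parser converts the model's XML-style tool calls into standard tool_calls fields in the response. Below is a minimal sketch of the request payload a client would POST (e.g. to http://localhost:8080/v1/chat/completions). The get_weather tool and its schema are hypothetical, purely for illustration; the model name matches the --served-model-name flag above.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible function-calling
# schema that vLLM's chat completions endpoint accepts.
payload = {
    "model": "InCoder-32B",  # must match --served-model-name
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # "auto" is honored because the server was started with
    # --enable-auto-tool-choice.
    "tool_choice": "auto",
}

# Serialize for the HTTP POST body; send with any HTTP client
# (curl, requests, or the openai Python SDK pointed at the server).
body = json.dumps(payload)
print(len(body) > 0)
```

With --tool-call-parser qwen3_xml set, a tool invocation comes back in choices[0].message.tool_calls rather than as raw XML text in the message content.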
