
chat template is broken

#12
by grapevine-AI - opened

Hello.
I'm excited to see StepFun's official GGUF.
However, when I load the GGUF normally, the model behaves strangely.

For example:

  • The model doesn't think
  • The model speaks unnatural Japanese
  • The model doesn't stop generating

However, if I load the GGUF with a chat template file (--chat-template-file "D:\Step-3.5-Flash\chat_template.jinja"), the model behaves normally.
Is the chat template embedded in this GGUF broken?

Interesting. Where did you find that chat_template.jinja?

Oh, sorry.
I got chat_template.jinja from StepFun's official Hugging Face repository.
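For anyone else hitting this, a sketch of the workaround (the local paths and the quant filename are placeholders; adjust them to your setup):

```shell
# Fetch the upstream chat template from the official repo
huggingface-cli download stepfun-ai/Step-3.5-Flash chat_template.jinja --local-dir .

# Start llama-server with the external template instead of the one
# embedded in the GGUF (model filename is a placeholder)
llama-server -m ./Step-3.5-Flash-Q4_K_M.gguf \
    --jinja \
    --chat-template-file ./chat_template.jinja
```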

Thank you!

Without --chat-template-file, I get no reasoning at all, or only a little reasoning without proper tags:

Screenshot 2026-02-07 at 16.08.22

And with --chat-template-file - I get correct reasoning:

Screenshot 2026-02-07 at 16.11.27

Sometimes I get too much reasoning, but usually it's OK.

  1. With this jinja template, I get this exception when calling tools:
srv    operator(): got exception: {"error":{"code":500,"message":"\n------------\nWhile executing FilterExpression at line 55, column 63 in source:\n...- for args_name, args_value in arguments|items %}↵                        {{- '<...\n                                           ^\nError: Unknown (built-in) filter 'items' for type String","type":"server_error"}}
  2. Looks like the jinja template has a bug; I worked around it with this fixed jinja template:
diff --git a/jinja_template_for_arch_step35.jinja b/jinja_template_for_arch_step35.jinja
index c09ea497d..ca3817b2d 100644
--- a/jinja_template_for_arch_step35.jinja
+++ b/jinja_template_for_arch_step35.jinja
@@ -51,13 +51,17 @@
                 {%- endif %}
                 {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
                 {%- if tool_call.arguments is defined %}
-                    {%- set arguments = tool_call.arguments %}
-                    {%- for args_name, args_value in arguments|items %}
+                    {%- if tool_call.arguments is mapping %}
+                        {%- for args_name, args_value in tool_call.arguments|items %}
                         {{- '<parameter=' + args_name + '>\n' }}
                         {%- set args_value = args_value | tojson(ensure_ascii=False) | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
                         {{- args_value }}
                         {{- '\n</parameter>\n' }}
-                    {%- endfor %}
+                        {%- endfor %}
+                    {%- else %}
+                        {#- arguments is string (JSON from server) - output as single parameter block #}
+                        {{- '<parameter=arguments>\n' + (tool_call.arguments | string) + '\n</parameter>\n' }}
+                    {%- endif %}
                 {%- endif %}
                 {{- '</function>\n</tool_call>' }}
             {%- endfor %}
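The root cause of the `Unknown (built-in) filter 'items' for type String` error is that the server hands the template tool-call arguments as a JSON string rather than a mapping, so `arguments|items` has nothing to iterate. A plain-Python sketch of the guard the patched template adds (function name and sample values are made up for illustration):

```python
import json

# What llama.cpp hands the template: a JSON *string*, not a dict
args_from_server = '{"city": "Tokyo"}'
assert isinstance(args_from_server, str)
assert not hasattr(args_from_server, "items")  # no mapping interface -> filter error

def render_parameters(arguments):
    """Mirror the patched template: iterate mappings, pass strings through."""
    if isinstance(arguments, dict):  # template: `is mapping` branch
        return "".join(
            f"<parameter={k}>\n{v}\n</parameter>\n" for k, v in arguments.items()
        )
    # String fallback: emit the raw JSON as a single parameter block
    return f"<parameter=arguments>\n{arguments}\n</parameter>\n"

print(render_parameters(json.loads(args_from_server)))  # per-key blocks
print(render_parameters(args_from_server))              # single raw block
```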
{% macro render_content(content) %}{% if content is none %}{{- '' }}{% elif content is string %}{{- content }}{% elif content is mapping %}{{- content['value'] if 'value' in content else content['text'] }}{% elif content is iterable %}{% for item in content %}{% if item.type == 'text' %}{{- item['value'] if 'value' in item else item['text'] }}{% elif item.type == 'image' %}<im_patch>{% endif %}{% endfor %}{% endif %}{% endmacro %}
{{bos_token}}{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- render_content(messages[0].content) + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou have access to the following functions in JSONSchema format:\n\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson(ensure_ascii=False) }}
    {%- endfor %}
    {{- "\n</tools>\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...>\n...\n</function> block must be nested within <tool_call>\n...\n</tool_call> XML tags\n- Required parameters MUST be specified\n</IMPORTANT><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + render_content(messages[0].content) + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and render_content(message.content) is string and not(render_content(message.content).startswith('<tool_response>') and render_content(message.content).endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- set content = render_content(message.content) %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {%- set role_name = 'observation' if (message.role == "system" and not loop.first and message.name == 'observation') else message.role %}
        {{- '<|im_start|>' + role_name + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = render_content(message.reasoning_content) %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- else %}
                {%- set reasoning_content = '' %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n' + content }}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if tool_call.function is defined %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
                {%- if tool_call.arguments is defined %}
                    {%- set arguments = tool_call.arguments %}
                    {%- for args_name, args_value in arguments|items %}
                        {{- '<parameter=' + args_name + '>\n' }}
                        {%- set args_value = args_value | tojson(ensure_ascii=False) | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
                        {{- args_value }}
                        {{- '\n</parameter>\n' }}
                    {%- endfor %}
                {%- endif %}
                {{- '</function>\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>tool_response\n' }}
        {%- endif %}
        {{- '<tool_response>' }}
        {{- content }}
        {{- '</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
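For reference, the reasoning/content split in the template's assistant branch (the `split('</think>')` logic) is equivalent to this plain-Python sketch; the sample message is made up:

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Separate the <think>...</think> block from the visible answer,
    mirroring the template's split/strip chain."""
    if '</think>' in content:
        reasoning = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n')
        answer = content.split('</think>')[-1].lstrip('\n')
        return reasoning, answer
    return '', content  # no reasoning block present

reasoning, answer = split_reasoning("<think>\nThe user greets me.\n</think>\nHello!")
print(reasoning)  # The user greets me.
print(answer)     # Hello!
```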

Turns out we can make use of the autoparser branch: https://github.com/pwilkin/llama.cpp/tree/autoparser

I also get this "FilterExpression at line 55, column 63" error when tools are called in llama.cpp. Using the patch from @exxocism did not help: the model fails to generate output at all. Unclear why. I'll dig into it a bit more, but it's clear there is something wrong with the currently distributed template.

The patch from @exxocism does work; updating to llama.cpp build b7972 fixed the freezing. Other people are experiencing the same problem; this patch fixes the error but adds a lot of noise to the output when tools are called: https://huggingface.co/ubergarm/Step-3.5-Flash-GGUF/discussions/1#69878ca7ae66ac235fc2ca95

StepFun org

Sorry, I accidentally specified the wrong minja template when converting the model earlier. I've re-uploaded it; the model can now think.

@apohelios where can we find the updated chat template? This one is broken with tools: https://huggingface.co/stepfun-ai/Step-3.5-Flash/blob/main/chat_template.jinja

@Qnibbles The tool-calling issues here aren't caused by the chat template; it's a known issue on llama.cpp mainline. You can try this PR, which fixes the tool-call problems:
https://github.com/ggml-org/llama.cpp/pull/18675
