
chat template is broken

#12
by grapevine-AI - opened

Hello.
I'm excited to see StepFun's official GGUF.
However, when I load the GGUF normally, the model behaves strangely.

For example:

  • The model doesn't think
  • The model speaks unnatural Japanese
  • The model doesn't stop generating

However, if I load the GGUF with a chat template file (--chat-template-file "D:\Step-3.5-Flash\chat_template.jinja"), the model behaves normally.
Is the chat template embedded in this GGUF broken?

Interesting. Where did you find that chat_template.jinja?

Oh, sorry.
I got chat_template.jinja from StepFun's official Hugging Face repository.
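For anyone else hitting this, a sketch of the workaround (the local paths and the quant filename are placeholders; adjust them to your setup):

```shell
# Fetch the upstream chat template from the official repo
huggingface-cli download stepfun-ai/Step-3.5-Flash chat_template.jinja --local-dir .

# Start llama-server with the external template instead of the one
# embedded in the GGUF (model filename is a placeholder)
llama-server -m ./Step-3.5-Flash-Q4_K_M.gguf \
    --jinja \
    --chat-template-file ./chat_template.jinja
```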

Thank you!

Without --chat-template-file, I get no reasoning at all, or only a little reasoning without proper tags:

Screenshot 2026-02-07 at 16.08.22

And with --chat-template-file - I get correct reasoning:

Screenshot 2026-02-07 at 16.11.27

Sometimes I get too much reasoning, but usually it's OK.

  1. With this jinja template, I get this exception when calling tools:
srv    operator(): got exception: {"error":{"code":500,"message":"\n------------\nWhile executing FilterExpression at line 55, column 63 in source:\n...- for args_name, args_value in arguments|items %}↵                        {{- '<...\n                                           ^\nError: Unknown (built-in) filter 'items' for type String","type":"server_error"}}
  2. Looks like the jinja template has a bug; I worked around it with this fixed jinja template:
diff --git a/jinja_template_for_arch_step35.jinja b/jinja_template_for_arch_step35.jinja
index c09ea497d..ca3817b2d 100644
--- a/jinja_template_for_arch_step35.jinja
+++ b/jinja_template_for_arch_step35.jinja
@@ -51,13 +51,17 @@
                 {%- endif %}
                 {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
                 {%- if tool_call.arguments is defined %}
-                    {%- set arguments = tool_call.arguments %}
-                    {%- for args_name, args_value in arguments|items %}
+                    {%- if tool_call.arguments is mapping %}
+                        {%- for args_name, args_value in tool_call.arguments|items %}
                         {{- '<parameter=' + args_name + '>\n' }}
                         {%- set args_value = args_value | tojson(ensure_ascii=False) | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
                         {{- args_value }}
                         {{- '\n</parameter>\n' }}
-                    {%- endfor %}
+                        {%- endfor %}
+                    {%- else %}
+                        {#- arguments is string (JSON from server) - output as single parameter block #}
+                        {{- '<parameter=arguments>\n' + (tool_call.arguments | string) + '\n</parameter>\n' }}
+                    {%- endif %}
                 {%- endif %}
                 {{- '</function>\n</tool_call>' }}
             {%- endfor %}
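The root cause of the `Unknown (built-in) filter 'items' for type String` error is that the server hands the template tool-call arguments as a JSON string rather than a mapping, so `arguments|items` has nothing to iterate. A plain-Python sketch of the guard the patched template adds (function name and sample values are made up for illustration):

```python
import json

# What llama.cpp hands the template: a JSON *string*, not a dict
args_from_server = '{"city": "Tokyo"}'
assert isinstance(args_from_server, str)
assert not hasattr(args_from_server, "items")  # no mapping interface -> filter error

def render_parameters(arguments):
    """Mirror the patched template: iterate mappings, pass strings through."""
    if isinstance(arguments, dict):  # template: `is mapping` branch
        return "".join(
            f"<parameter={k}>\n{v}\n</parameter>\n" for k, v in arguments.items()
        )
    # String fallback: emit the raw JSON as a single parameter block
    return f"<parameter=arguments>\n{arguments}\n</parameter>\n"

print(render_parameters(json.loads(args_from_server)))  # per-key blocks
print(render_parameters(args_from_server))              # single raw block
```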
{% macro render_content(content) %}{% if content is none %}{{- '' }}{% elif content is string %}{{- content }}{% elif content is mapping %}{{- content['value'] if 'value' in content else content['text'] }}{% elif content is iterable %}{% for item in content %}{% if item.type == 'text' %}{{- item['value'] if 'value' in item else item['text'] }}{% elif item.type == 'image' %}<im_patch>{% endif %}{% endfor %}{% endif %}{% endmacro %}
{{bos_token}}{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- render_content(messages[0].content) + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou have access to the following functions in JSONSchema format:\n\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson(ensure_ascii=False) }}
    {%- endfor %}
    {{- "\n</tools>\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...>\n...\n</function> block must be nested within <tool_call>\n...\n</tool_call> XML tags\n- Required parameters MUST be specified\n</IMPORTANT><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + render_content(messages[0].content) + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and render_content(message.content) is string and not(render_content(message.content).startswith('<tool_response>') and render_content(message.content).endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- set content = render_content(message.content) %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {%- set role_name = 'observation' if (message.role == "system" and not loop.first and message.name == 'observation') else message.role %}
        {{- '<|im_start|>' + role_name + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = render_content(message.reasoning_content) %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- else %}
                {%- set reasoning_content = '' %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n' + content }}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if tool_call.function is defined %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
                {%- if tool_call.arguments is defined %}
                    {%- set arguments = tool_call.arguments %}
                    {%- for args_name, args_value in arguments|items %}
                        {{- '<parameter=' + args_name + '>\n' }}
                        {%- set args_value = args_value | tojson(ensure_ascii=False) | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
                        {{- args_value }}
                        {{- '\n</parameter>\n' }}
                    {%- endfor %}
                {%- endif %}
                {{- '</function>\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>tool_response\n' }}
        {%- endif %}
        {{- '<tool_response>' }}
        {{- content }}
        {{- '</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
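For reference, the reasoning/content split in the template's assistant branch (the `split('</think>')` logic) is equivalent to this plain-Python sketch; the sample message is made up:

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Separate the <think>...</think> block from the visible answer,
    mirroring the template's split/strip chain."""
    if '</think>' in content:
        reasoning = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n')
        answer = content.split('</think>')[-1].lstrip('\n')
        return reasoning, answer
    return '', content  # no reasoning block present

reasoning, answer = split_reasoning("<think>\nThe user greets me.\n</think>\nHello!")
print(reasoning)  # The user greets me.
print(answer)     # Hello!
```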

Turns out we can make use of the autoparser branch: https://github.com/pwilkin/llama.cpp/tree/autoparser

I also get this "FilterExpression at line 55, column 63" error when tools are called in llama.cpp. Using the patch from @exxocism did not help: the model fails to generate output at all. Unclear why. I'll dig into it a bit more, but it's clear there is something wrong with the currently distributed template.

The patch from @exxocism does work; updating to llama.cpp build b7972 fixed the freezing. Other people are experiencing the same problem; this patch fixes the error but adds a lot of noise to the output when tools are called: https://huggingface.co/ubergarm/Step-3.5-Flash-GGUF/discussions/1#69878ca7ae66ac235fc2ca95

StepFun org

Sorry, I accidentally specified the wrong minja template when converting the model earlier. I've re-uploaded it; the model can now think.

@apohelios where can we find the updated chat template? This one is broken with tools: https://huggingface.co/stepfun-ai/Step-3.5-Flash/blob/main/chat_template.jinja

@Qnibbles The tool-calling issues here aren't caused by the chat template; it's a known issue on llama.cpp mainline. You can try this PR, which fixes the tool-call problems:
https://github.com/ggml-org/llama.cpp/pull/18675
