---
language:
  - en
license: gemma
library_name: transformers
base_model: google/gemma-3n-E2B-it
tags:
  - function-calling
  - tool-use
  - on-device
  - mobile
  - gemma
  - litertlm
---

# Agent Gemma: Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of `google/gemma-3n-E2B-it` trained for on-device function calling using Google's FunctionGemma technique.

## What's Different from Stock Gemma 3n

### Fixed: `format_function_declaration` Template Error

The stock Gemma 3n chat template calls `format_function_declaration()`, a custom Jinja2 function available in Google's Python tokenizer but not supported by LiteRT-LM's on-device template engine. This causes:

```
Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)
```

This model replaces the stock template with a LiteRT-LM-compatible template that uses only standard Jinja2 features (the `tojson` filter and `<start_function_declaration>` / `<end_function_declaration>` markers). The template is embedded in both `tokenizer_config.json` and `chat_template.jinja`.
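For illustration, a declaration loop built from standard Jinja2 only might look like the following hypothetical fragment; the `tools` variable name and the `tool.function` access are assumptions, not the shipped template:

```jinja
{%- for tool in tools %}
<start_function_declaration>{{ tool.function | tojson }}<end_function_declaration>
{%- endfor %}
```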

## Function Calling Format

The model uses the FunctionGemma markup format:

```
<start_function_call>call:function_name{param:<escape>value<escape>}<end_function_call>
```

Tool declarations are formatted as:

```
<start_function_declaration>{"name": "get_weather", "parameters": {...}}<end_function_declaration>
```
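As a sketch of how an application might consume this markup, here is a minimal Python parser for the call format above. The single-call, flat-parameter layout is an assumption; nested values or multiple calls in one response would need extra handling.

```python
import re

# Matches the FunctionGemma call markup described above.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# Matches individual param:<escape>value<escape> pairs inside the braces.
PARAM_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)

def parse_function_call(text: str):
    """Return (name, params) for the first function call in `text`, or None."""
    m = CALL_RE.search(text)
    if m is None:
        return None
    name, body = m.group(1), m.group(2)
    return name, dict(PARAM_RE.findall(body))

out = "<start_function_call>call:get_weather{location:<escape>Tokyo<escape>}<end_function_call>"
print(parse_function_call(out))  # ('get_weather', {'location': 'Tokyo'})
```

The same function can be applied to the decoded output of `model.generate` to decide whether the model answered in prose or requested a tool call.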

## Training Details

- **Base model:** `google/gemma-3n-E2B-it` (5.4B parameters)
- **Method:** QLoRA (rank=16, alpha=32); 22.9M trainable parameters (0.42%)
- **Dataset:** `google/mobile-actions` (8,693 training samples)
- **Training:** 500 steps, `batch_size=1`, `max_seq_length=512`, `learning_rate=2e-4`
- **Precision:** bfloat16
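The trainable fraction follows directly from the two parameter counts above:

```python
# Sanity-check the trainable-parameter fraction quoted above.
trainable = 22.9e6  # LoRA parameters (rank=16, alpha=32)
total = 5.4e9       # Gemma 3n E2B parameter count
fraction = trainable / total
print(f"{fraction:.2%}")  # 0.42%
```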

## Usage

### With LiteRT-LM on Android (Kotlin)

```kotlin
// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet())  // @Tool-annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }
```

### With Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

tools = [{"function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}}]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
```

## Chat Template

The custom chat template (embedded in `tokenizer_config.json` and `chat_template.jinja`) supports these roles:

- `developer` / `system`: system instructions and tool declarations
- `user`: user messages
- `model` / `assistant`: model responses, including `tool_calls`
- `tool`: tool execution results
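A conversation touching all four roles might look like the sketch below. The exact `tool_calls` and tool-result payload shapes here are illustrative assumptions, not verified against the shipped template:

```python
# Hypothetical multi-turn message list exercising the four supported roles.
# The tool_calls / tool-result field names are illustrative assumptions.
messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"location": "Tokyo"}}}
    ]},
    {"role": "tool", "content": '{"temp_c": 21, "condition": "clear"}'},
]

print([m["role"] for m in messages])
```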

## Converting to `.litertlm`

Use the LiteRT-LM conversion tools to package for on-device deployment:

```shell
# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
  --model_dir kontextdev/agent-gemma \
  --output agent-gemma.litertlm
```

## Files

- `model-*.safetensors`: merged model weights (bfloat16)
- `tokenizer_config.json`: tokenizer config with embedded chat template
- `chat_template.jinja`: standalone chat template file
- `config.json`: model architecture config
- `checkpoint-*`: training checkpoints (LoRA)

## License

This model inherits the Gemma license from the base model.