---
language:
  - en
license: gemma
library_name: transformers
base_model: google/gemma-3n-E2B-it
tags:
  - function-calling
  - tool-use
  - on-device
  - mobile
  - gemma
  - litertlm
---

# Agent Gemma: Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of `google/gemma-3n-E2B-it` trained for on-device function calling using Google's FunctionGemma technique.

## What's Different from Stock Gemma 3n

### Fixed: `format_function_declaration` Template Error

The stock Gemma 3n chat template calls `format_function_declaration()`, a custom Jinja2 function available in Google's Python tokenizer but not supported by LiteRT-LM's on-device template engine. This causes:

```
Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)
```

This model replaces the stock template with a LiteRT-LM-compatible template that uses only standard Jinja2 features (the `tojson` filter and `<start_function_declaration>` / `<end_function_declaration>` markers). The template is embedded in both `tokenizer_config.json` and `chat_template.jinja`.
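For illustration, a declaration loop built from standard Jinja2 only might look like the following hypothetical fragment; the `tools` variable name and the `tool.function` access are assumptions, not the shipped template:

```jinja
{%- for tool in tools %}
<start_function_declaration>{{ tool.function | tojson }}<end_function_declaration>
{%- endfor %}
```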

## Function Calling Format

The model uses the FunctionGemma markup format:

```
<start_function_call>call:function_name{param:<escape>value<escape>}<end_function_call>
```

Tool declarations are formatted as:

```
<start_function_declaration>{"name": "get_weather", "parameters": {...}}<end_function_declaration>
```
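As a sketch of how an application might consume this markup, here is a minimal Python parser for the call format above. The single-call, flat-parameter layout is an assumption; nested values or multiple calls in one response would need extra handling.

```python
import re

# Matches the FunctionGemma call markup described above.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# Matches individual param:<escape>value<escape> pairs inside the braces.
PARAM_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)

def parse_function_call(text: str):
    """Return (name, params) for the first function call in `text`, or None."""
    m = CALL_RE.search(text)
    if m is None:
        return None
    name, body = m.group(1), m.group(2)
    return name, dict(PARAM_RE.findall(body))

out = "<start_function_call>call:get_weather{location:<escape>Tokyo<escape>}<end_function_call>"
print(parse_function_call(out))  # ('get_weather', {'location': 'Tokyo'})
```

The same function can be applied to the decoded output of `model.generate` to decide whether the model answered in prose or requested a tool call.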

## Training Details

- **Base model:** `google/gemma-3n-E2B-it` (5.4B parameters)
- **Method:** QLoRA (rank=16, alpha=32); 22.9M trainable parameters (0.42%)
- **Dataset:** `google/mobile-actions` (8,693 training samples)
- **Training:** 500 steps, `batch_size=1`, `max_seq_length=512`, `learning_rate=2e-4`
- **Precision:** bfloat16
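The trainable fraction follows directly from the two parameter counts above:

```python
# Sanity-check the trainable-parameter fraction quoted above.
trainable = 22.9e6  # LoRA parameters (rank=16, alpha=32)
total = 5.4e9       # Gemma 3n E2B parameter count
fraction = trainable / total
print(f"{fraction:.2%}")  # 0.42%
```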

## Usage

### With LiteRT-LM on Android (Kotlin)

```kotlin
// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet())  // @Tool-annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }
```

### With Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

tools = [{"function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}}]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
```

## Chat Template

The custom chat template (embedded in `tokenizer_config.json` and `chat_template.jinja`) supports these roles:

- `developer` / `system`: system instructions and tool declarations
- `user`: user messages
- `model` / `assistant`: model responses, including `tool_calls`
- `tool`: tool execution results
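A conversation touching all four roles might look like the sketch below. The exact `tool_calls` and tool-result payload shapes here are illustrative assumptions, not verified against the shipped template:

```python
# Hypothetical multi-turn message list exercising the four supported roles.
# The tool_calls / tool-result field names are illustrative assumptions.
messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"location": "Tokyo"}}}
    ]},
    {"role": "tool", "content": '{"temp_c": 21, "condition": "clear"}'},
]

print([m["role"] for m in messages])
```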

## Converting to `.litertlm`

Use the LiteRT-LM conversion tools to package for on-device deployment:

```shell
# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
  --model_dir kontextdev/agent-gemma \
  --output agent-gemma.litertlm
```

## Files

- `model-*.safetensors`: merged model weights (bfloat16)
- `tokenizer_config.json`: tokenizer config with embedded chat template
- `chat_template.jinja`: standalone chat template file
- `config.json`: model architecture config
- `checkpoint-*`: training checkpoints (LoRA)

## License

This model inherits the Gemma license from the base model.