---
language:
- en
license: gemma
library_name: transformers
base_model: google/gemma-3n-E2B-it
tags:
- function-calling
- tool-use
- on-device
- mobile
- gemma
- litertlm
---

# Agent Gemma: Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of [google/gemma-3n-E2B-it](https://huggingface.co/google/gemma-3n-E2B-it) trained for on-device function calling using Google's [FunctionGemma](https://ai.google.dev/gemma/docs/functiongemma/function-calling-with-hf) technique.

## What's Different from Stock Gemma 3n

### Fixed: `format_function_declaration` Template Error

The stock Gemma 3n chat template calls `format_function_declaration()`, a custom Jinja2 function available in Google's Python tokenizer but **not supported by LiteRT-LM's on-device template engine**. This causes:

```
Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)
```

This model replaces the stock template with a **LiteRT-LM compatible** template that uses only standard Jinja2 features (`tojson` filter, `<start_function_declaration>` / `<end_function_declaration>` markers). The template is embedded in both `tokenizer_config.json` and `chat_template.jinja`.

### Function Calling Format

The model uses the FunctionGemma markup format:

```
<start_function_call>call:function_name{param:<escape>value<escape>}<end_function_call>
```

Tool declarations are formatted as:
```
<start_function_declaration>{"name": "get_weather", "parameters": {...}}<end_function_declaration>
```
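The two formats above can be produced and consumed with plain string handling. The helpers below are an illustrative sketch (they are not shipped with this repo): one wraps a JSON tool declaration in the declaration markers, the other extracts function calls and their `<escape>`-delimited parameters from model output.

```python
import json
import re

def format_declaration(tool: dict) -> str:
    """Wrap a JSON tool declaration in the FunctionGemma markers."""
    return ("<start_function_declaration>"
            + json.dumps(tool)
            + "<end_function_declaration>")

# Matches call:function_name{...} between the call markers.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.S
)
# Matches param:<escape>value<escape> pairs inside the braces.
PARAM_RE = re.compile(r"(\w+):<escape>(.*?)<escape>", re.S)

def parse_calls(text: str):
    """Extract (function_name, params) pairs from model output."""
    calls = []
    for name, body in CALL_RE.findall(text):
        params = dict(PARAM_RE.findall(body))
        calls.append((name, params))
    return calls

output = ("<start_function_call>call:get_weather"
          "{location:<escape>Tokyo<escape>}<end_function_call>")
print(parse_calls(output))  # [('get_weather', {'location': 'Tokyo'})]
```

In practice the chat template performs the declaration formatting for you; a parser like `parse_calls` is only needed when you decode raw generations yourself instead of going through a runtime such as LiteRT-LM.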

## Training Details

- **Base model:** google/gemma-3n-E2B-it (5.4B total parameters, ~2B effective)
- **Method:** QLoRA (rank=16, alpha=32); 22.9M trainable parameters (0.42% of total)
- **Dataset:** [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) (8,693 training samples)
- **Training:** 500 steps, batch_size=1, max_seq_length=512, learning_rate=2e-4
- **Precision:** bfloat16
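The hyperparameters above map onto a standard peft/bitsandbytes QLoRA setup along these lines. This is a sketch, not the actual training script: the NF4 quantization settings and the `target_modules` list are assumptions.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit quantized base model (the "Q" in QLoRA) -- assumed NF4 settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter matching the card: rank 16, alpha 32
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```

With rank 16 applied to the attention projections, only the small adapter matrices train while the quantized base stays frozen, which is how the trainable fraction stays at 0.42%.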

## Usage

### With LiteRT-LM on Android (Kotlin)

```kotlin
// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet())  // @Tool annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }
```

### With Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

tools = [{
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
        },
    }
}]

text = tokenizer.apply_chat_template(messages, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
```

## Chat Template

The custom chat template (in `tokenizer_config.json` and `chat_template.jinja`) supports these roles:
- `developer` / `system` – system instructions and tool declarations
- `user` – user messages
- `model` / `assistant` – model responses, including `tool_calls`
- `tool` – tool execution results
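A full round trip touches all four roles. The sketch below builds such a conversation in plain Python; the exact `tool_calls` and tool-result schema shown here is an assumption based on the standard transformers chat-template convention, so check `chat_template.jinja` before relying on it.

```python
# One turn each for developer, user, model (with a tool call), and tool.
messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {
        "role": "model",
        "tool_calls": [{
            "type": "function",
            "function": {"name": "get_weather",
                         "arguments": {"location": "Tokyo"}},
        }],
    },
    # Tool execution result fed back to the model.
    {"role": "tool", "name": "get_weather", "content": '{"temp_c": 21}'},
]

roles = [m["role"] for m in messages]
print(roles)  # ['developer', 'user', 'model', 'tool']
```

After the tool result is appended, `apply_chat_template` is called again with `add_generation_prompt=True` so the model can produce its final natural-language answer.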

## Converting to .litertlm

Use the [LiteRT-LM](https://github.com/google-ai-edge/LiteRT-LM) conversion tools to package for on-device deployment:

```bash
# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
  --model_dir kontextdev/agent-gemma \
  --output agent-gemma.litertlm
```

## Files

- `model-*.safetensors` – merged model weights (bfloat16)
- `tokenizer_config.json` – tokenizer config with embedded chat template
- `chat_template.jinja` – standalone chat template file
- `config.json` – model architecture config
- `checkpoint-*` – training checkpoints (LoRA)

## License

This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.