---
language:
- en
license: gemma
library_name: transformers
base_model: google/gemma-3n-E2B-it
tags:
- function-calling
- tool-use
- on-device
- mobile
- gemma
- litertlm
---

# Agent Gemma — Gemma 3n E2B Fine-Tuned for Function Calling

A fine-tuned version of [google/gemma-3n-E2B-it](https://huggingface.co/google/gemma-3n-E2B-it) trained for on-device function calling using Google's [FunctionGemma](https://ai.google.dev/gemma/docs/functiongemma/function-calling-with-hf) technique.

## What's Different from Stock Gemma 3n

### Fixed: `format_function_declaration` Template Error

The stock Gemma 3n chat template calls `format_function_declaration()` — a custom Jinja2 function available in Google's Python tokenizer but **not supported by LiteRT-LM's on-device template engine**. On device, this causes:

```
Failed to apply template: unknown function: format_function_declaration is unknown (in template:21)
```

This model replaces the stock template with a **LiteRT-LM compatible** template that uses only standard Jinja2 features (such as the `tojson` filter and literal text markers). The template is embedded in both `tokenizer_config.json` and `chat_template.jinja`.
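To illustrate why this works, here is a minimal sketch (not the shipped template) of rendering a tool declaration with only the standard `tojson` filter — no custom functions — assuming tools arrive in Transformers' usual `{"function": {...}}` shape:

```python
from jinja2 import Environment

# Illustration only, not the actual chat template: render tool
# declarations with the standard `tojson` filter (Jinja2 >= 2.9),
# avoiding custom functions like format_function_declaration().
env = Environment()
template = env.from_string(
    "{% for tool in tools %}"
    "{{ {'name': tool.function.name, 'parameters': tool.function.parameters} | tojson }}\n"
    "{% endfor %}"
)

tools = [{"function": {"name": "get_weather",
                       "parameters": {"type": "object",
                                      "properties": {"location": {"type": "string"}}}}}]
print(template.render(tools=tools))
```

Because every construct here is part of standard Jinja2, the same template logic can run in template engines that only implement the core language.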
### Function Calling Format

The model emits function calls in the FunctionGemma markup format:

```
call:function_name{param:value}
```

Tool declarations are rendered into the prompt as JSON:

```
{"name": "get_weather", "parameters": {...}}
```

## Training Details

- **Base model:** google/gemma-3n-E2B-it (5.4B parameters)
- **Method:** QLoRA (rank=16, alpha=32) — 22.9M trainable parameters (0.42%)
- **Dataset:** [google/mobile-actions](https://huggingface.co/datasets/google/mobile-actions) (8,693 training samples)
- **Training:** 500 steps, batch_size=1, max_seq_length=512, learning_rate=2e-4
- **Precision:** bfloat16

## Usage

### With LiteRT-LM on Android (Kotlin)

```kotlin
// After converting to .litertlm format
val engine = Engine(EngineConfig(modelPath = "agent-gemma.litertlm"))
engine.initialize()

val conversation = engine.createConversation(
    ConversationConfig(
        systemMessage = Message.of("You are a helpful assistant."),
        tools = listOf(MyToolSet()) // @Tool-annotated class
    )
)

// No format_function_declaration error!
conversation.sendMessageAsync(Message.of("What's the weather?"))
    .collect { print(it) }
```

### With Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kontextdev/agent-gemma")
tokenizer = AutoTokenizer.from_pretrained("kontextdev/agent-gemma")

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
]
tools = [{"function": {"name": "get_weather",
                       "parameters": {"type": "object",
                                      "properties": {"location": {"type": "string"}}}}}]

text = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))
```

## Chat Template

The custom chat template (in `tokenizer_config.json` and `chat_template.jinja`) supports these roles:

- `developer` / `system` — system instructions and tool declarations
- `user` — user messages
- `model` / `assistant` — model responses, including `tool_calls`
- `tool` — tool execution results

## Converting to .litertlm

Use the [LiteRT-LM](https://github.com/google-ai-edge/LiteRT-LM) conversion tools to package the model for on-device deployment:

```bash
# The chat_template.jinja is included in this repo
python scripts/convert-to-litertlm.py \
  --model_dir kontextdev/agent-gemma \
  --output agent-gemma.litertlm
```

## Files

- `model-*.safetensors` — merged model weights (bfloat16)
- `tokenizer_config.json` — tokenizer config with embedded chat template
- `chat_template.jinja` — standalone chat template file
- `config.json` — model architecture config
- `checkpoint-*` — training checkpoints (LoRA)

## License

This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.
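As a closing reference, the `call:function_name{param:value}` markup described under Function Calling Format can be extracted from model output with a few lines of Python. This sketch is illustrative only — the regex, helper name, and value handling are assumptions, not part of the model's tooling:

```python
import re

# Illustrative helper (not shipped with the model): pull the function
# name and raw argument pairs out of FunctionGemma-style markup.
CALL_RE = re.compile(r"call:(?P<name>\w+)\{(?P<args>.*?)\}")

def parse_call(text: str):
    m = CALL_RE.search(text)
    if m is None:
        return None
    # Split "param:value" pairs; values are kept as raw strings here,
    # so any type coercion is left to the caller.
    args = dict(
        pair.split(":", 1) for pair in m.group("args").split(",") if ":" in pair
    )
    return m.group("name"), args

print(parse_call("call:get_weather{location:Tokyo}"))
# → ('get_weather', {'location': 'Tokyo'})
```

A caller would dispatch on the returned name to the matching tool implementation and feed the result back as a `tool` role message.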