gemma-3-270M-Swahili-llm

Fine-tuned Gemma3-270M model specifically adapted for Swahili language instruction-following and conversation tasks.

Model Description

This model is a fine-tuned version of google/gemma-3-270m-it, trained on ~67,000 Swahili instruction-response pairs. Fine-tuning used LoRA (Low-Rank Adaptation) for parameter-efficient training, which keeps memory usage low and shortens training time.

Model Size: 270M parameters
Language: Swahili
Task: Instruction-following and conversation

Training Details

  • Training Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 128
  • Max Sequence Length: 2048
  • Batch Size: 4 per device
  • Learning Rate: 5e-5
  • Optimizer: AdamW 8-bit
  • Dataset: Swahili Instructions Dataset by alfaxadeyembe (~67,000 pairs)

This model was trained 2x faster with Unsloth.
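
The hyperparameters above correspond roughly to the following Unsloth + TRL setup. This is a hedged sketch, not the original training script: the LoRA alpha, target modules, epoch count, and dataset-preparation step are assumptions, while rank 128, the 2048 sequence length, batch size 4, the 5e-5 learning rate, and 8-bit AdamW come from this card.

from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the base instruction-tuned checkpoint.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters. Rank 128 is from the list above; the alpha value
# and target modules are assumptions.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,
    bias="none",
)

# Assumption: the Kaggle dataset has been exported locally as JSON with a
# "text" field containing chat-formatted instruction-response pairs.
dataset = load_dataset("json", data_files="swahili_instructions.json", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=4,
        learning_rate=5e-5,
        optim="adamw_8bit",
        num_train_epochs=1,   # assumption: the epoch count is not stated in this card
        output_dir="outputs",
    ),
)
trainer.train()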

Usage

Using Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "ngusadeep/gemma-3-270M-Swahili-llm"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

messages = [{"role": "user", "content": "Eleza nini maana ya uongozi."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)  # the chat template already inserts <bos>

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
    do_sample=True
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Using Unsloth (Recommended)

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from transformers import TextStreamer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ngusadeep/gemma-3-270M-Swahili-llm",
    max_seq_length=2048,
    load_in_4bit=False,
)

tokenizer = get_chat_template(tokenizer, chat_template="gemma3")

messages = [{"role": "user", "content": "Eleza nini maana ya uongozi."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True).removeprefix('<bos>')

_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=256,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
    do_sample=True,  # required: temperature/top_p/top_k are ignored under greedy decoding
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)

Example Swahili Prompts

"Eleza nini maana ya uongozi."  # Explanation
"Tunga hadithi fupi kuhusu safari."  # Creative writing
"Ni nini tofauti kati ya mchana na usiku?"  # Q&A
"Andika sentensi tano kuhusu elimu."  # Instruction following

Recommended Generation Parameters

  • temperature: 1.0
  • top_p: 0.95
  • top_k: 64
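
If you use these settings often, they can be bundled into a standard Transformers GenerationConfig and passed to generate. Note that do_sample=True must be set for the sampling parameters above to take effect:

from transformers import GenerationConfig

gen_config = GenerationConfig(
    temperature=1.0,
    top_p=0.95,
    top_k=64,
    do_sample=True,        # required: greedy decoding would ignore the values above
    max_new_tokens=256,
)

outputs = model.generate(**inputs, generation_config=gen_config)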

GGUF Format

This model has been converted to GGUF format for use with llama.cpp.

Available GGUF Files

  • gemma-3-270m-it.Q8_0.gguf - 8-bit quantization

Using with llama.cpp

This is a text-only model, so use llama-cli (the llama-mtmd-cli multimodal runner does not apply here):

./llama.cpp/llama-cli -hf ngusadeep/gemma-3-270M-Swahili-llm --jinja
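
If you have downloaded the Q8_0 file listed above, you can also run it directly from disk; the file path and prompt here are illustrative:

./llama.cpp/llama-cli -m gemma-3-270m-it.Q8_0.gguf --jinja -p "Eleza nini maana ya uongozi."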

Ollama

An Ollama Modelfile is included for easy deployment.

Note: The model's BOS token behavior was adjusted for GGUF compatibility.
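
A typical workflow with the bundled Modelfile looks like the following; the local model name gemma3-swahili and the Modelfile path are illustrative, so adjust them to match the files shipped with this repository:

ollama create gemma3-swahili -f ./Modelfile
ollama run gemma3-swahili "Eleza nini maana ya uongozi."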

Model Capabilities

After fine-tuning, the model demonstrates improved capability to:

  • Understand Swahili instructions
  • Generate appropriate responses in Swahili
  • Follow conversational patterns
  • Handle various instruction types (explanations, creative writing, Q&A, etc.)

Limitations

  • The model is fine-tuned on Swahili instruction-following tasks and may not perform as well on other languages or tasks
  • As a 270M parameter model, it has limitations in complex reasoning tasks
  • The model may occasionally generate responses that are not factually accurate

Dataset Citation

The model was fine-tuned on the Swahili Instructions dataset:

@misc{swahili-instructions-dataset,
  title={Swahili Instructions Dataset},
  author={alfaxadeyembe},
  year={2024},
  publisher={Kaggle},
  howpublished={\url{https://www.kaggle.com/datasets/alfaxadeyembe/swahili-instructions}}
}

Model Citation

If you use this model, please cite it as:

@misc{gemma3-270m-swahili-llm,
  title={Gemma3-270M Swahili Fine-tuned Model},
  author={ngusadeep},
  year={2024},
  howpublished={\url{https://huggingface.co/ngusadeep/gemma-3-270M-Swahili-llm}}
}

Acknowledgments

  • Google for the Gemma 3 270M base model (google/gemma-3-270m-it)
  • Unsloth for the efficient fine-tuning framework
  • alfaxadeyembe for the Swahili Instructions dataset

License

This model is licensed under Apache 2.0. See the LICENSE file for details.
