TRL documentation
Chat template utilities
clone_chat_template
trl.clone_chat_template
( model: PreTrainedModel, tokenizer: PythonBackend, source_tokenizer_path: str, resize_to_multiple_of: int | None = 64 ) → model (PreTrainedModel)
Parameters
- model (PreTrainedModel) — Model to update.
- tokenizer (PreTrainedTokenizer) — Tokenizer to update.
- source_tokenizer_path (str) — Path or identifier of the pretrained tokenizer to clone from.
- resize_to_multiple_of (int or None, optional, defaults to 64) — The embedding layer will be resized to the new vocabulary size. If this is not None, the new vocabulary size is rounded up to the nearest multiple of this value.
Returns
model (PreTrainedModel):
Updated model with resized token embeddings and EOS token configured.
tokenizer (PreTrainedTokenizer):
Updated tokenizer with the chat template and special tokens applied.
added_tokens (list[int]):
List of tokens that were added to the tokenizer from the source tokenizer.
Clones a chat template from a source tokenizer to the target tokenizer and updates the model accordingly.
This function:
- Copies the chat template from a source tokenizer to the target tokenizer.
- Adds any new tokens from the source tokenizer to the target tokenizer.
- Sets and synchronizes the EOS token across the tokenizer and model.
- Resizes the model’s token embeddings to match the new vocabulary size, optionally rounding it up to a multiple of a specified value. In such cases, dummy tokens are added to the tokenizer to ensure the vocabulary size matches the embedding dimensions.
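The rounding in the last step is plain integer arithmetic. The sketch below is not TRL's implementation: `padded_vocab_size` is a hypothetical helper name, and the vocabulary size is an arbitrary illustrative number rather than one read from a real tokenizer.

```python
from typing import Optional


def padded_vocab_size(vocab_size: int, multiple_of: Optional[int] = 64) -> int:
    """Round vocab_size up to the nearest multiple of multiple_of (hypothetical helper)."""
    if multiple_of is None:
        # No rounding requested: the embedding size equals the vocabulary size.
        return vocab_size
    # Integer ceiling division, then scale back up to a multiple.
    return ((vocab_size + multiple_of - 1) // multiple_of) * multiple_of


vocab_size = 128_260  # illustrative value only
new_size = padded_vocab_size(vocab_size, 64)
# Number of filler entries needed so the tokenizer matches the padded embedding size.
num_dummy_tokens = new_size - vocab_size
print(new_size, num_dummy_tokens)
```

Under these assumptions, the gap between the padded size and the real vocabulary size is the number of dummy tokens that would have to be added to the tokenizer.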
Example:
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import clone_chat_template
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model, tokenizer, added_tokens = clone_chat_template(model, tokenizer, "Qwen/Qwen3-0.6B")
is_chat_template_prefix_preserving
trl.chat_template_utils.is_chat_template_prefix_preserving
( tokenizer: PythonBackend ) → bool
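Check whether the chat template preserves prefixes when applied, that is, whether the rendering of a conversation remains a prefix of the rendering of the same conversation with further turns appended.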
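The prefix property can be illustrated without a real tokenizer. The following is a minimal sketch, assuming that "prefix-preserving" means every shorter conversation renders to a prefix of the longer one. `is_prefix_preserving`, `render_stable`, and `render_unstable` are hypothetical names with toy renderers, and the actual check in TRL may differ.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def is_prefix_preserving(render: Callable[[List[Message]], str], messages: List[Message]) -> bool:
    """True if rendering each shorter conversation yields a prefix of the full rendering."""
    full = render(messages)
    return all(full.startswith(render(messages[:k])) for k in range(1, len(messages)))


def render_stable(messages: List[Message]) -> str:
    # Toy renderer: each message is rendered the same way regardless of position.
    return "".join(f"<{m['role']}>{m['content']}</{m['role']}>\n" for m in messages)


def render_unstable(messages: List[Message]) -> str:
    # Toy renderer: only the final message gets a marker, so appending a turn
    # changes how earlier turns are rendered.
    parts = []
    for i, m in enumerate(messages):
        marker = "[last]" if i == len(messages) - 1 else ""
        parts.append(f"<{m['role']}>{marker}{m['content']}</{m['role']}>\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What color is the sky?"},
    {"role": "assistant", "content": "It is blue."},
    {"role": "user", "content": "And at night?"},
]
stable_ok = is_prefix_preserving(render_stable, messages)
unstable_ok = is_prefix_preserving(render_unstable, messages)
print(stable_ok, unstable_ok)
```

The second toy renderer fails the check for the same kind of reason a template fails it when it formats an assistant turn differently depending on whether later turns follow.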
get_training_chat_template
trl.get_training_chat_template
( tokenizer: PythonBackend ) → str or None
Get a prefix-preserving chat template for training, if needed.
If the tokenizer’s template isn’t prefix-preserving, returns a training-compatible template (currently only Qwen3 is supported). Otherwise, returns None.
Example:
>>> from trl.chat_template_utils import get_training_chat_template
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
>>> messages1 = [
... {"role": "user", "content": "What color is the sky?"},
... {"role": "assistant", "content": "It is blue."},
... ]
>>> messages2 = [
... {"role": "user", "content": "What color is the sky?"},
... {"role": "assistant", "content": "It is blue."},
... {"role": "user", "content": "And at night?"},
... ]
>>> tokenizer.apply_chat_template(messages1, tokenize=False)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n'
>>> tokenizer.apply_chat_template(messages2, tokenize=False)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\nIt is blue.<|im_end|>\n<|im_start|>user\nAnd at night?<|im_end|>\n'
>>> # Note: the <think> tags are missing from the assistant turn in the output above
>>> chat_template = get_training_chat_template(tokenizer)
>>> tokenizer.apply_chat_template(messages1, tokenize=False, chat_template=chat_template)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n'
>>> tokenizer.apply_chat_template(messages2, tokenize=False, chat_template=chat_template)
'<|im_start|>user\nWhat color is the sky?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\nIt is blue.<|im_end|>\n<|im_start|>user\nAnd at night?<|im_end|>\n'