Based on the method from *Model Stock: All we need is just a few fine-tuned models* (arXiv:2403.19522).
This is a merge of pre-trained language models created using mergekitty.
Use the Llama-3 instruct prompt format; avoid ChatML, which introduces refusals far more often.
This model was merged using the Model Stock merge method, with deepcogito/cogito-v2-preview-llama-70B as the base.
The following models were included in the merge:

- KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
- KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
The following YAML configuration was used to produce this model:
```yaml
models:
  # Part 1
  - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
  # Part 2
  - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
merge_method: model_stock
base_model: deepcogito/cogito-v2-preview-llama-70B
parameters:
  normalize: true
dtype: bfloat16
```
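For intuition, here is a minimal, hedged sketch of the Model Stock idea (arXiv:2403.19522), not mergekitty's actual implementation: per weight tensor, the fine-tuned models are averaged, then interpolated back toward the base weights with a ratio `t` derived from the angle between the fine-tuned deltas (highly agreeing deltas keep more of the average; near-orthogonal deltas fall back toward the base). Weights are flattened to plain Python lists here purely for illustration.

```python
import math

def model_stock_merge(base, finetuned):
    """Sketch of Model Stock: interpolate the average of k fine-tuned
    weight vectors toward the base weights, with the ratio t computed
    from the mean pairwise cosine of the fine-tuned deltas."""
    k = len(finetuned)
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    pairs = [cos(deltas[i], deltas[j])
             for i in range(k) for j in range(i + 1, k)]
    cos_theta = sum(pairs) / len(pairs) if pairs else 1.0
    # Closed-form ratio from the paper: t = k*cos / (1 + (k-1)*cos)
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

If both fine-tuned deltas point the same way, `t` is 1 and the result is the plain average; if they are orthogonal, `t` collapses to 0 and the base weights are returned unchanged.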