This is a merge of pre-trained language models created using mergekit.
This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method, which takes a weighted average of the models' parameters (Model Soups, Wortsman et al., 2022).
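As a sketch of what a linear merge computes, the snippet below averages two models' state dicts with the weights from the configuration (0.1 and 0.9, which sum to 1). The `linear_merge` helper is hypothetical shorthand, not mergekit's actual implementation, which also handles sharded checkpoints, weight normalization, and the tokenizer.

```python
# Minimal sketch of a linear merge (hypothetical helper, not mergekit itself):
# each parameter of the merged model is a weighted average of the inputs.
import torch

def linear_merge(state_dicts, weights):
    """Weighted average of matching tensors across model state dicts."""
    merged = {}
    for name in state_dicts[0]:
        # Accumulate in float32 for precision, then cast back to bfloat16
        # to match the `dtype: bfloat16` setting in the config below.
        acc = sum(w * sd[name].to(torch.float32) for w, sd in zip(weights, state_dicts))
        merged[name] = acc.to(torch.bfloat16)
    return merged

# e.g. merged = linear_merge([base_sd, instruct_sd], [0.1, 0.9])
```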
The following models were included in the merge:
* [paraschopra/llama-31-8b-base-all-24k-correct](https://huggingface.co/paraschopra/llama-31-8b-base-all-24k-correct)
* [meta-llama/meta-llama-3.1-8b-instruct](https://huggingface.co/meta-llama/meta-llama-3.1-8b-instruct)
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: paraschopra/llama-31-8b-base-all-24k-correct
    parameters:
      weight: 0.1
  - model: meta-llama/meta-llama-3.1-8b-instruct
    parameters:
      weight: 0.9
merge_method: linear
tokenizer_source: meta-llama/meta-llama-3.1-8b-instruct
dtype: bfloat16
```
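A configuration like this is typically run with mergekit's `mergekit-yaml` CLI (e.g. `mergekit-yaml config.yaml ./output-model-directory`). Once the merged weights are saved or pushed to the Hub, they load like any other Llama checkpoint; the repo id below is a hypothetical placeholder.

```python
# Hypothetical usage once the merge has been produced and saved;
# replace the repo id with the actual output location.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/merged-llama-3.1-8b"  # placeholder, not a real repo

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Briefly explain model merging.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```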
Base model: [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)