BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Paper: arXiv:2402.10631
| PPL | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 5.47 | 76.30 ± 0.87 | 43.34 ± 1.45 | 77.97 ± 0.97 | 69.22 ± 1.30 | 57.11 ± 0.49 | - | 64.79 |
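
The QA Avg column appears to be the unweighted mean of the five reported task accuracies, with mmlu left out because no score is given. That reading is an assumption, not something the card states. A minimal sketch that checks the arithmetic, using a hypothetical `scores` dictionary filled with the values from the row above:

```python
# Accuracies (%) copied from the results row; mmlu is "-" and is excluded.
# Assumption: QA Avg is the plain (unweighted) mean of these five tasks.
scores = {
    "arc_easy": 76.30,
    "arc_challenge": 43.34,
    "piqa": 77.97,
    "winogrande": 69.22,
    "hellaswag": 57.11,
}

qa_avg = sum(scores.values()) / len(scores)
print(f"QA Avg = {qa_avg:.2f}")
```

Under that assumption the result rounds to 64.79, which agrees with the table.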
The training method is based on the BitDistiller paper.
Base model: TinyLlama/TinyLlama_v1.1