Better Perplexity Alternative GGUFs
#13
by
ubergarm
- opened
I have run perplexity benchmarks on various GGUFs, and my impression is that better mixes are available than this official "Int4" Q4_K_S. This is even more true if you want to use ik_llama.cpp with its newer SOTA quantization types.
https://huggingface.co/ubergarm/Step-3.5-Flash-GGUF#quant-collection
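For anyone who wants to reproduce this kind of comparison, a minimal sketch using llama.cpp's `llama-perplexity` tool (the model filename and context size here are just placeholders for whichever quants you're comparing):

```shell
# Download the standard perplexity test corpus (wikitext-2 raw test split).
wget https://huggingface.co/datasets/ggml-org/ci/resolve/main/wikitext-2-raw-v1.zip
unzip wikitext-2-raw-v1.zip

# Run perplexity for each quant you want to compare; lower PPL is better.
# -c 512 is the conventional context size for comparable PPL numbers.
./llama-perplexity \
    -m Step-3.5-Flash-Q4_K_S.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    -c 512
```

Run the same command against each GGUF and compare the final PPL values; differences of a few percent between recipes at the same file size are what the tables linked above are measuring.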
I've only released the best-performing quants; the unreleased ones were used to guide the recipes.
Also keep an eye on the good mixes by @AesSedai, with more PPL/KLD research data likely available soon: https://huggingface.co/AesSedai/Step-3.5-Flash-GGUF
Thanks for this nice sized model and supporting the whole ik/llama.cpp ecosystem!!! Cheers!
I tried the IQ4_XS quant from ubergarm and it feels amazing!

