Very good quant

#1
by wonderfuldestruction - opened
llama.cpp b8121
Devstral-Small-2-24B-Instruct-2512-IQ4_XS-4.04bpw.gguf
RTX 5090

I have to give very good feedback. I've been using LM Studio's Q8_0 as my daily driver, and this quant performs identically, though it is less verbose in docs.

I've got forks of my own production environments where I test these quants, and yours completes tasks practically the same as the Q8_0, albeit faster and with a larger context window at the same settings.

I have yet to find any major flaws. Until then, great work!

ByteShape org

Thank you for the thoughtful feedback and kind words. It means a lot.
If you run into any edge cases or weaknesses, we’d really appreciate hearing about them. That kind of feedback helps us refine our datasets, datatype selection, and benchmarks for future releases.

You can see here that it did extremely well against Unsloth's Devstral Small 2 Q6_K:

https://www.reddit.com/r/LocalLLaMA/comments/1rg41ss/qwen35_27b_vs_devstral_small_2_nextjs_solidity/

ByteShape org

Thank you very much for taking the time to evaluate and compare these models and for sharing such a comprehensive report. It is very helpful.

I'm really hoping we see a ByteShape release of qwen3.5-27b.

I want to send all quanters to ByteShape University!
