llama.cpp-prismml/ggml/src/ggml-cpu
3.35 MB
OpenTransformer
perf: optimized AVX2 kernel + COM6-inspired matmul dispatch (0.2 -> 3.43 t/s)
8f4b822 verified