llama.cpp/examples/speculative
Demonstration of speculative decoding and tree-based speculative decoding techniques.
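As a rough illustration of the basic (non-tree) idea, here is a minimal Python sketch of greedy speculative decoding. It is not the code of this example or of llama.cpp: `target_next`, `draft_next`, `greedy_decode`, `speculative_decode`, and `n_draft` are hypothetical names, and the two "models" are toy deterministic functions standing in for a large target model and a small draft model. The sketch assumes greedy sampling, where a drafted token is accepted only if it equals the token the target model would have picked itself.

```python
from typing import Callable, List, Tuple

Token = int
NextFn = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    # Hypothetical toy "large" model: deterministic greedy next token.
    return (31 * sum(ctx) + 7 * len(ctx)) % 50


def draft_next(ctx: List[Token]) -> Token:
    # Hypothetical toy "small" model: agrees with the target most of the
    # time, but deliberately disagrees at every fourth context length.
    tok = target_next(ctx)
    return (tok + 1) % 50 if len(ctx) % 4 == 3 else tok


def greedy_decode(next_fn: NextFn, prompt: List[Token], n_new: int) -> List[Token]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_fn(out))
    return out


def speculative_decode(
    target: NextFn,
    draft: NextFn,
    prompt: List[Token],
    n_new: int,
    n_draft: int = 4,
) -> Tuple[List[Token], int, int]:
    out = list(prompt)
    goal = len(prompt) + n_new
    accepted = drafted = 0
    while len(out) < goal:
        # 1. The draft model proposes n_draft tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(n_draft):
            proposal.append(draft(out + proposal))
        drafted += len(proposal)

        # 2. The target model checks every drafted position. A real
        #    implementation would score all positions in one batched pass;
        #    this sketch calls the toy target once per position.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if tok != expected:
                # 3. First mismatch: keep the matching prefix, take the
                #    target's own token, and discard the rest of the draft.
                out.extend(proposal[:i])
                out.append(expected)
                accepted += i
                break
        else:
            # Every drafted token matched: keep them all, plus one extra
            # token from the target.
            out.extend(proposal)
            accepted += len(proposal)
            out.append(target(out))
    # The last round may overshoot the goal, so trim to the requested length.
    return out[:goal], accepted, drafted


if __name__ == "__main__":
    prompt = [1, 2, 3]
    tokens, accepted, drafted = speculative_decode(target_next, draft_next, prompt, 20)
    print("matches plain greedy decoding:", tokens == greedy_decode(target_next, prompt, 20))
    print(f"accepted {accepted} of {drafted} drafted tokens")
```

Under these assumptions the output is identical to decoding with the target model alone, because every kept token is one the target would have chosen. Any speedup in a real system comes from verifying several drafted tokens per target pass. The tree-based variant, which verifies several draft branches at once, is not covered by this sketch.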
More info:
https://github.com/ggml-org/llama.cpp/pull/2926
https://github.com/ggml-org/llama.cpp/pull/3624
https://github.com/ggml-org/llama.cpp/pull/5625