#!/usr/bin/env bash
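# Request an embedding for a sample sentence from an embedding server
# assumed to be listening on localhost:8080 (adjust the URL if yours differs).
# --silent suppresses curl's progress meter so only the JSON response is printed.
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello world today"}' \
    --silent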