yury-zyphra committed on
Commit 9fe44ea · verified · 1 Parent(s): a098014

Update README.md
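
The line added to the README in this commit amounts to appending two flags to the existing `vllm serve` command. As a rough sketch only, the flags visible in the hunk context plus `-dp 8 -ep` could be assembled as below. Any flags on README lines that this hunk does not show are omitted here, and the `NUM_GPUS` variable and `serve_args` array are illustrative names, not part of the README.

```bash
#!/usr/bin/env bash
# Sketch only: README flags that fall outside this diff hunk are not reproduced.
NUM_GPUS=8  # assumption: one data-parallel rank per GPU

# Keep the arguments in an array so each flag stays a separate word.
serve_args=(
  serve Zyphra/ZAYA1-8B --port 8010
  --mamba-cache-dtype float32 --dtype bfloat16
  --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser zaya_xml
  # Flags recommended by the added README line (DP=EP=NUM_GPUS):
  -dp "$NUM_GPUS" -ep
)

# Dry run: print the command for review instead of launching the server.
echo vllm "${serve_args[@]}"
```

To launch the server rather than print the command, replace the last line with `vllm "${serve_args[@]}"`.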

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -79,6 +79,7 @@ vllm serve Zyphra/ZAYA1-8B --port 8010 \
  --mamba-cache-dtype float32 --dtype bfloat16 \
  --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser zaya_xml
  ```
+ For parallel deployment, we recommend data parallelism (DP) combined with expert parallelism (EP), because tensor parallelism (TP) for CCA is not supported in the branch above. On 8 GPUs, add the flags `-dp 8 -ep` to run with DP=EP=8.
 
  Once the server is up, you can query a model with `curl` like in the following example:
  ```bash