Update README.md
#2
by zwpride-iquestlab - opened
README.md
CHANGED
@@ -137,6 +137,13 @@ outputs = model.generate(
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
+### Deployment with vLLM
+
+For production deployment, you can use vLLM to create an OpenAI-compatible API endpoint.
+
+```
+vllm serve Multilingual-Multimodal-NLP/IndustrialCoder --tensor-parallel-size 8
+```
+
 ### Fill-in-the-Middle (FIM)
 
 InCoder-32B supports FIM completion for code infilling tasks:
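Once the server started by the `vllm serve` command above is running, it exposes an OpenAI-compatible chat-completions route. A minimal sketch of the request body a client would POST to it; the endpoint path `/v1/chat/completions`, port 8000, and the example prompt are assumptions based on vLLM defaults, not part of this diff:

```python
import json

# Request body for vLLM's OpenAI-compatible chat-completions route.
# The "model" field must match the model name passed to `vllm serve`.
payload = {
    "model": "Multilingual-Multimodal-NLP/IndustrialCoder",
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

# Serialize to JSON; this is what a client would POST to
# http://localhost:8000/v1/chat/completions (assumed default host/port).
body = json.dumps(payload)
print(body)
```

The same request can be issued with any OpenAI-compatible client by pointing its base URL at the vLLM server instead of the OpenAI API.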