vLLM V1 support

#1
by vcerny - opened

Hello, is this supposed to work with vLLM v0.11.2? I am getting this error:

[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842] torch.AcceleratorError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842] Search for `cudaErrorUnsupportedPtxVersion' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[port-8003] (EngineCore_DP0 pid=573) ERROR 12-01 05:27:02 [core.py:842]
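For context: `cudaErrorUnsupportedPtxVersion` generally means the PTX in the compiled kernels targets a newer CUDA toolkit than the installed NVIDIA driver's JIT compiler supports, i.e. the vLLM/PyTorch build and the driver are mismatched. A minimal diagnostic sketch (assuming PyTorch is installed; it only reads version info and degrades gracefully if it is not):

```python
# Compare the CUDA toolkit version the installed PyTorch build (which
# vLLM's precompiled kernels are built against) reports, as a first
# step in diagnosing a PTX/driver mismatch.
try:
    import torch
    # CUDA toolkit version PyTorch was built with, e.g. "12.8";
    # None for CPU-only builds.
    built_cuda = torch.version.cuda
except ImportError:
    # PyTorch not installed in this environment.
    built_cuda = None

print("PyTorch built with CUDA:", built_cuda)
```

If the version printed here is newer than the maximum CUDA version shown in the header of `nvidia-smi`, updating the NVIDIA driver (or installing a vLLM build matching your driver's CUDA version) is the usual fix.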

Indeed. Thanks!
