---
license: other
tags:
  - wheels
  - cuda
  - pytorch
  - windows
  - linux
---

# image-server-wheels

Prebuilt Python 3.11 (`cp311`) wheels for Windows and Linux, consumed by the image server's install scripts.

## Contents

| File | OS | CUDA | Torch | Source | Notes |
|---|---|---|---|---|---|
| `ace_step-1.6.0-py3-none-any.whl` | any | — | — | built by us | Pure-Python, cross-platform |
| `block_sparse_attn-0.0.2-cp311-cp311-win_amd64.whl` | Windows x64 | 12.8 | 2.8 | built by us | Used by video pipeline |
| `q8_kernels-0.0.5-cp311-cp311-win_amd64.whl` | Windows x64 | 12.8 | 2.8 | built by us | Used by LTX video |
| `flash_attn-2.8.2+cu128torch2.8-cp311-cp311-win_amd64.whl` | Windows x64 | 12.8 | 2.8 | [mjun0812/flash-attention-prebuild-wheels](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.10) | Mirror of upstream release |
| `flash_attn-2.8.3+cu130torch2.10-cp311-cp311-win_amd64.whl` | Windows x64 | 13.0 | 2.10 | [mjun0812/flash-attention-prebuild-wheels](https://github.com/mjun0812/flash-attention-prebuild-wheels) | Mirror of upstream release |
| `flash_attn-2.8.3+cu128torch2.8-cp311-cp311-linux_x86_64.whl` | Linux x86_64 | 12.8 | 2.8 | [mjun0812/flash-attention-prebuild-wheels](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.16) | Mirror of upstream release |
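
pip only installs a compiled wheel when the interpreter's tags match the filename. A quick way to read off your own tags and compare them against the table (this helper is our own sketch, not part of the repo):

```python
# Sketch: print this interpreter's wheel tags to match against the table above.
import sys
import sysconfig

py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"           # e.g. "cp311"
plat_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")  # e.g. "win_amd64"

print(py_tag, plat_tag)
```

The compiled wheels require `cp311` plus a matching platform tag; `ace_step` (`py3-none-any`) installs on any interpreter. CUDA/Torch compatibility is not encoded in the tags, so check those columns manually.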

## Direct install

```bash
BASE=https://huggingface.co/deAPI-ai/image-server-wheels/resolve/main

# Windows (commands assume a POSIX shell such as Git Bash)
pip install $BASE/q8_kernels-0.0.5-cp311-cp311-win_amd64.whl
pip install $BASE/block_sparse_attn-0.0.2-cp311-cp311-win_amd64.whl
pip install $BASE/flash_attn-2.8.2+cu128torch2.8-cp311-cp311-win_amd64.whl
pip install --no-deps $BASE/ace_step-1.6.0-py3-none-any.whl

# Linux
pip install --no-deps $BASE/flash_attn-2.8.3+cu128torch2.8-cp311-cp311-linux_x86_64.whl
```
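
A post-install sanity check can catch a wheel that installed but fails to import (e.g. a CUDA/Torch mismatch). The import names below are assumptions derived from the wheel filenames; `ace_step`'s import name is not obvious from its filename, so verify it against the wheel's metadata before adding it:

```python
# Sketch: verify that the installed wheels actually import.
import importlib

def check(mod: str) -> str:
    """Return the module's version string, or a note if it is missing."""
    try:
        m = importlib.import_module(mod)
        return getattr(m, "__version__", "installed (no __version__)")
    except ImportError:
        return "not installed"

# Import names assumed from the wheel filenames; adjust per platform.
for name in ("flash_attn", "block_sparse_attn", "q8_kernels"):
    print(name, "->", check(name))
```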

## Credits

`flash_attn` wheels are mirrored from [mjun0812/flash-attention-prebuild-wheels](https://github.com/mjun0812/flash-attention-prebuild-wheels) — all credit for those builds goes to the upstream author. We mirror them here so the install scripts have a single source of truth and do not break if upstream release URLs change.

The remaining wheels (`ace_step`, `block_sparse_attn`, `q8_kernels`) were built in-house.