diffusion_model_quanto_Fp16_&_int8.safetensors
As the title suggests, if possible, could you please provide a version for older graphics card architectures?
Is a mixed-precision FP16 + INT8 version feasible? For example:
ltx-2.3-22b-distilled-1.1_diffusion_model_quanto_Fp16_int8.safetensors
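For what it's worth, here is a minimal sketch of how such an FP16 + INT8 mix could be produced with optimum-quanto (the library implied by "quanto" in the filename). The tiny Sequential model is just a placeholder for the real transformer, not the actual loading code:

```python
import torch
import torch.nn as nn
from optimum.quanto import quantize, freeze, qint8

# Placeholder model standing in for the real diffusion transformer.
model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))

model.to(torch.float16)         # cast first, so quantization scales end up FP16
quantize(model, weights=qint8)  # INT8 weights; everything else stays FP16
freeze(model)                   # materialize the quantized weight tensors

x = torch.randn(1, 64, dtype=torch.float16)
with torch.inference_mode():
    print(model(x).dtype)       # torch.float16
```

Casting to FP16 before quantizing matters here: the per-channel `_scale` tensors inherit the weight dtype, so they come out FP16 instead of BF16.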
It should already work with older cards, since BF16 is converted to FP16 when needed. Are you running into any issue?
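For reference, this is roughly how a PyTorch-based runtime can decide when that fallback kicks in; the helper below is only an illustration, not code from this repo:

```python
import torch

def pick_dtype() -> torch.dtype:
    # Use BF16 only where the GPU supports it natively (Ampere and newer);
    # otherwise fall back to FP16, e.g. on Volta or Turing cards.
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16

print(pick_dtype())
```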
Regarding INT8: it is supported on older card lines like Volta, but those cards may not support the BF16 format. So if the BF16 tensors could be converted to FP16, the model could fully utilize the FP16 mixed-precision Tensor throughput (250 TFLOPS) and plain FP16 half-precision throughput (125 TFLOPS) on those architectures. For example, these weights are stored in BF16 (see the conversion sketch after the list):
- transformer_blocks.0.audio_ff.net.0.proj.bias [8192] BF16
- …attn.to_k.weight._scale [2048, 1] BF16
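A one-off conversion along those lines could look like this sketch; the input and output filenames are placeholders. BF16 tensors (including the `_scale` tensors above) are cast to FP16 while any INT8 payload tensors pass through untouched:

```python
import torch
from safetensors.torch import load_file, save_file

# Placeholder paths, not actual files from this repo.
state = load_file("diffusion_model_bf16.safetensors")
state = {
    name: t.to(torch.float16) if t.dtype == torch.bfloat16 else t
    for name, t in state.items()
}
save_file(state, "diffusion_model_fp16.safetensors")
```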
In my opinion, BF16 isn't native on all the older NVIDIA architectures, so it might take some time to convert.