inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.5-bits 7B • Updated about 1 month ago • 1
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.25-bits 6B • Updated about 1 month ago • 4
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.0-bits 6B • Updated about 1 month ago
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_5.75-bits 6B • Updated about 1 month ago • 3
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_5.5-bits 6B • Updated about 1 month ago
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_5.25-bits 6B • Updated about 1 month ago • 1
inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_5.0-bits 5B • Updated about 1 month ago • 3
Mixed Precision Models Collection Collection of Mixed Precision LLaMA and Qwen Models • 7 items • Updated about 1 month ago