Quantization command

#1
by utarn - opened

Would you mind sharing how you quantized this model to NVFP4, and guiding me through the process?

This is quantized using NVIDIA ModelOpt:

```shell
python hf_ptq.py --pyt_ckpt_path <gpt-oss-120b> --qformat nvfp4 --export_path <gpt-oss-120b-nvfp4> --trust_remote_code
```

using the latest ModelOpt main branch: https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/llm_ptq
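For anyone trying to reproduce this, a rough setup sketch follows. The repo URL and the `hf_ptq.py` script come from the link above; the install step and checkpoint paths are assumptions on my part, so check the `llm_ptq` example README for the exact requirements:

```shell
# Clone the TensorRT-Model-Optimizer repo, which contains the llm_ptq example
git clone https://github.com/NVIDIA/TensorRT-Model-Optimizer.git
cd TensorRT-Model-Optimizer

# Install ModelOpt from source so the latest main branch is used
# (assumption: a release install via `pip install nvidia-modelopt` may lag behind main)
pip install -e .

# Run post-training quantization to NVFP4
# (paths below are placeholders; point them at your local checkpoint and output dir)
cd examples/llm_ptq
python hf_ptq.py \
    --pyt_ckpt_path /path/to/gpt-oss-120b \
    --qformat nvfp4 \
    --export_path /path/to/gpt-oss-120b-nvfp4 \
    --trust_remote_code
```

Note that quantizing a 120B-parameter model this way requires substantial GPU memory, so this is not something you can run on a typical workstation.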

Would you please share some performance data?
Can you still convert the model to GGUF?
Would the size bloat back to 240 GB if you convert it to GGUF?