EdgeN Collection • Quantization strategy where most weights are converted to INT4, activations remain in FP16, and sensitive layers are preserved in FP16. • 2 items • Updated about 6 hours ago
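The W4A16 scheme this collection describes can be sketched in a few lines: weights are quantized per output channel to INT4 (values in [-8, 7]) with an FP16 scale, activations and any layer flagged as sensitive stay in FP16. This is a minimal illustration with NumPy, not the EdgeN implementation; the layer names and the `sensitive` set are hypothetical.

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-output-channel INT4 quantization.
    Returns integer codes in [-8, 7] and per-channel FP16 scales."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map max |w| per row to 7
    scale = np.where(scale == 0, 1.0, scale)            # guard all-zero rows
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize(q, scale):
    # Weights are expanded back to FP16 for the matmul;
    # activations remain FP16 throughout.
    return q.astype(np.float16) * scale

# Hypothetical model: layer names mapped to FP32 weights.
rng = np.random.default_rng(0)
layers = {
    "attn.qkv": rng.standard_normal((64, 64)).astype(np.float32),
    "mlp.fc1":  rng.standard_normal((64, 64)).astype(np.float32),
    "lm_head":  rng.standard_normal((64, 64)).astype(np.float32),
}
sensitive = {"lm_head"}  # layers preserved in FP16

quantized = {}
for name, w in layers.items():
    if name in sensitive:
        quantized[name] = ("fp16", w.astype(np.float16))
    else:
        quantized[name] = ("int4", quantize_int4(w))
```

A usage check: dequantizing an INT4 layer should reproduce the FP32 weights to within one quantization step per channel, while the sensitive layer round-trips at full FP16 precision.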
Cosmos-Reason2 Collection • nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 4 items • Updated about 6 hours ago
FlashHead Collection • Efficient drop-in replacement for the classification head in language model inference. • 15 items • Updated about 6 hours ago