This is a collection of Llama- and Qwen-based models, ranging from 1.5B to 70B parameters, which are distilled from DeepSeek's new R1 models.
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  Text Generation • Updated • 2.08M • 855
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  Text Generation • Updated • 184k • 771
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  Text Generation • 2B • Updated • 677k • 1.49k
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
  Text Generation • 8B • Updated • 586k • 826