When Models Manipulate Manifolds: The Geometry of a Counting Task Paper • 2601.04480 • Published Jan 8 • 4
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 Text Generation • 18B • Updated about 20 hours ago • 252k • 100
mlx-community/Jan-v3-4B-base-instruct-4bit Text Generation • 0.6B • Updated 26 days ago • 348 • 2
mlx-community/Jan-v3-4B-base-instruct-8bit Text Generation • 1B • Updated 26 days ago • 179 • 3
Article Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models Jan 6 • 23
meituan-longcat/LongCat-Flash-Thinking-2601 Text Generation • 562B • Updated 29 days ago • 5.23k • 101
unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF Text Generation • 80B • Updated Jan 14 • 45.1k • 165
Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 180
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 120
Article Introducing swift-huggingface: The Complete Swift Client for Hugging Face Dec 5, 2025 • 43