Running Featured QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • 41 • Who needs 1T parameters? Olympiad proofs with a 4B model
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 118
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023 • 175
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025 • 120
Research & Long-Form Blog Posts Collection • In-depth technical articles and research pieces published by Hugging Face • 11 items • Updated 7 days ago • 21
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published Dec 23, 2025 • 38
Post NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! Has a 1M context window & best-in-class performance for SWE-Bench, reasoning & chat. Run the MoE model locally with 24GB RAM. GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF. Step-by-step guide: https://docs.unsloth.ai/models/nemotron-3
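A minimal sketch of one way to fetch and run the linked GGUF locally with the `hf` CLI and llama.cpp; the quantization filename is an assumption, so check the Unsloth guide above for the exact variant that fits your 24GB budget.

```shell
# Download only a 4-bit quant of the Nemotron 3 Nano GGUF (filename pattern is an assumption)
hf download unsloth/Nemotron-3-Nano-30B-A3B-GGUF \
  --include "*Q4_K_M*" \
  --local-dir ./nemotron-3-nano

# Run it with llama.cpp's CLI; adjust the path to whatever file the download produced
llama-cli -m ./nemotron-3-nano/Nemotron-3-Nano-30B-A3B-Q4_K_M.gguf \
  -p "Explain mixture-of-experts in one sentence." -n 128
```

The `--include` filter keeps the download to a single quantization instead of the full multi-file repo.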
Article Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025 • 301
Article Train 400x faster Static Embedding Models with Sentence Transformers • Jan 15, 2025 • 224
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • Jul 29, 2025 • 214
Running The Ultra-Scale Playbook • 3.7k • The ultimate guide to training LLMs on large GPU clusters
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models Paper • 2410.10733 • Published Oct 14, 2024 • 9
Article Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ • Jul 25, 2025 • 84