PaddleOCR-VL-1.5 Collection Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing • 7 items • Updated Mar 6 • 18
PaddleOCR-VL Collection Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • 5 items • Updated Feb 11 • 30
Gemma 4 Collection Gemma 4 is Google's new model family, including E2B, E4B, 26B-A4B, and 31B. • 28 items • Updated 5 days ago • 151
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 54 items • Updated 5 days ago • 114
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published 27 days ago • 62
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 793
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 513
Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 Aug 5, 2025 • 513
Article Introducing smolagents: simple agents that write actions in code. +1 Dec 31, 2024 • 1.19k
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 495