PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 15 days ago • 117
athrael-soju/colqwen3.5-4.5B-v1 Visual Document Retrieval • 5B • Updated 20 days ago • 524 • 14
Query-focused and Memory-aware Reranker for Long Context Processing Paper • 2602.12192 • Published Feb 12 • 57
jina-embeddings-v5-text: Task-Targeted Embedding Distillation Paper • 2602.15547 • Published Feb 17 • 26
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published Feb 2 • 96
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning Paper • 2511.18659 • Published Nov 24, 2025 • 25
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published Jan 26 • 35
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family Jan 19 • 89
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 26
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 156