Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models • Paper • 2603.18002 • Published 16 days ago • 13 • 3
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 255k • 1.58k
bharatgenai/Param2-17B-A2.4B-Thinking • Text Generation • 17B • Updated 2 days ago • 4.79k • 59