Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 63 items • Updated about 11 hours ago • 61
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 48
Image Arena Leaderboard 📊 Image Generation and Image Editing Arena & Leaderboard • Space • Running • Featured • 578
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14, 2025 • 57
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k Zero-Shot Image Classification • Updated Jan 22, 2025 • 53.2k • 306
VBench Leaderboard 📊 Upload video model evaluation data to update the VBench leaderboard • Space • Running • 347