Qwen2.5 VL 32B Instruct Demo: Chat with a multimodal AI using text, images, or video • Running • 164
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 126
MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces Paper • 2510.08783 • Published Oct 9, 2025 • 5