3D Aware Region Prompted Vision Language Model Paper • 2509.13317 • Published Sep 16, 2025 • 14
ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory Paper • 2509.04439 • Published Sep 4, 2025 • 1
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems Paper • 2510.12872 • Published Oct 14, 2025 • 4
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 91
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail Paper • 2511.00088 • Published Oct 30, 2025 • 4
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference Paper • 2510.17777 • Published Oct 20, 2025 • 1
NVILA Collection NVILA: Efficient Frontier Visual Language Models • 12 items • Updated 1 day ago • 17
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model Paper • 2310.15110 • Published Oct 23, 2023 • 3
Condition-Aware Neural Network for Controlled Image Generation Paper • 2404.01143 • Published Apr 1, 2024 • 13
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Paper • 2409.04429 • Published Sep 6, 2024
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Paper • 2410.10812 • Published Oct 14, 2024 • 18
NVILA: Efficient Frontier Visual Language Models Paper • 2412.04468 • Published Dec 5, 2024 • 60
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation Paper • 2507.01957 • Published Jul 2, 2025 • 23
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models Paper • 2503.22020 • Published Mar 27, 2025
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer Paper • 2507.04947 • Published Jul 7, 2025 • 1
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference Paper • 2512.01031 • Published Nov 30, 2025 • 26
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published Feb 3 • 34