Qwen3.5 Collection Qwen3.5 is Qwen's new model family, including Qwen3.5-35B-A3B, 27B, 122B-A10B, and 397B-A17B. • 20 items • Updated about 8 hours ago • 39
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI • Published 10 days ago • 471
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 17 days ago • 53
CASA Collection CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs • 6 items • Updated Dec 23, 2025 • 7
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 91
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 118
Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24, 2025 • 42
SageAttention2++: A More Efficient Implementation of SageAttention2 Paper • 2505.21136 • Published May 27, 2025 • 45
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing Paper • 2504.07964 • Published Apr 10, 2025 • 62
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11, 2025 • 130