UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^{128} for Unified Multimodal Large Language Model Paper • 2602.14178 • Published 4 days ago • 11
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs Paper • 2602.10388 • Published 9 days ago • 217
DHPLT: large-scale multilingual diachronic corpora and word representations for semantic change modelling Paper • 2602.11968 • Published 7 days ago • 1
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published 7 days ago • 20
Inference-Time Hyper-Scaling with KV Cache Compression Paper • 2506.05345 • Published Jun 5, 2025 • 30
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 10 days ago • 66
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published 14 days ago • 2
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published 13 days ago • 71
Aster: Autonomous Scientific Discovery over 20x Faster Than Existing Methods Paper • 2602.07040 • Published 16 days ago • 2
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published 14 days ago • 23
Thinking Makes LLM Agents Introverted: How Mandatory Thinking Can Backfire in User-Engaged Agents Paper • 2602.07796 • Published 12 days ago • 7
compar:IA: The French Government's LLM arena to collect French-language human prompts and preference data Paper • 2602.06669 • Published 14 days ago • 7
Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing Paper • 2602.04837 • Published 15 days ago • 8
QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals Paper • 2602.02581 • Published 19 days ago • 9
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution Paper • 2602.03075 • Published 17 days ago • 6
RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs Paper • 2602.05367 • Published 15 days ago • 7
Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers Paper • 2602.06079 • Published 16 days ago • 18
Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization Paper • 2601.23174 • Published 20 days ago • 3
Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning Paper • 2602.04998 • Published 15 days ago • 6