Papers - an Avi66 Collection

Papers

updated Sep 1, 2025

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 277
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16, 2025 • 35
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 273
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17, 2025 • 30
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9, 2025 • 55
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31, 2025 • 124
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 152
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Paper • 2501.12370 • Published Jan 21, 2025 • 11
Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval

Paper • 2410.13339 • Published Oct 17, 2024
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 6
PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15, 2024 • 60
Towards Optimal Learning of Language Models

Paper • 2402.17759 • Published Feb 27, 2024 • 18
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 189
MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12, 2024 • 77
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning

Paper • 2301.11660 • Published Jan 27, 2023 • 1
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

Paper • 2406.12824 • Published Jun 18, 2024 • 21
Scaling and evaluating sparse autoencoders

Paper • 2406.04093 • Published Jun 6, 2024 • 3
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published Dec 13, 2024 • 22
LLM Post-Training: A Deep Dive into Reasoning Large Language Models

Paper • 2502.21321 • Published Feb 28, 2025 • 1
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Paper • 2501.09620 • Published Jan 16, 2025
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 336
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

Paper • 2505.02707 • Published May 5, 2025 • 85
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 205
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Paper • 2505.23009 • Published May 29, 2025 • 18
FP4 All the Way: Fully Quantized Training of LLMs

Paper • 2505.19115 • Published May 25, 2025 • 4