gn00029914

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects

Organizations

upvoted 14 papers 4 days ago

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

Paper • 2304.06364 • Published Apr 13, 2023 • 3

Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries

Paper • 2409.12640 • Published Sep 19, 2024 • 3

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 16

ForecastPFN: Synthetically-Trained Zero-Shot Forecasting

Paper • 2311.01933 • Published Nov 3, 2023 • 1

Chronos: Learning the Language of Time Series

Paper • 2403.07815 • Published Mar 12, 2024 • 48

Chronos-2: From Univariate to Universal Forecasting

Paper • 2510.15821 • Published Oct 17, 2025 • 22

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Paper • 2505.06708 • Published May 10, 2025 • 11

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

Paper • 2001.04063 • Published Jan 13, 2020 • 1

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21, 2025 • 67

Gated Delta Networks: Improving Mamba2 with Delta Rule

Paper • 2412.06464 • Published Dec 9, 2024 • 15

upvoted a paper 6 days ago

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Paper • 2602.02361 • Published 10 days ago • 59

upvoted a collection 6 days ago

Qwen3-Coder-Next

Collection

4 items • Updated 9 days ago • 74

upvoted a paper 6 days ago

Measuring AI Ability to Complete Long Tasks

Paper • 2503.14499 • Published Mar 18, 2025 • 16

upvoted 2 papers 7 days ago

RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge

Paper • 2311.08147 • Published Nov 14, 2023 • 1

zELO: ELO-inspired Training Method for Rerankers and Embedding Models

Paper • 2509.12541 • Published Sep 16, 2025 • 6

upvoted a paper 8 days ago

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

Paper • 2601.08620 • Published about 1 month ago • 11
