Yedidia AGNIMO

YedsonUQ

AI & ML interests

[Uncertainty Quantification, "Hallucinations"] in LLMs, Federated Learning

Recent Activity

upvoted a paper about 1 month ago

Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling

upvoted a paper about 1 month ago

Reinforcement World Model Learning for LLM-based Agents

upvoted a paper about 1 month ago

Reinforced Attention Learning

Organizations

None yet

upvoted 9 papers about 1 month ago

Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling

Paper • 2601.22636 • Published Jan 30 • 22

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

Paper • 2601.22027 • Published Jan 29 • 83

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Paper • 2512.24271 • Published Dec 30, 2025 • 63

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

Paper • 2601.17058 • Published Jan 22 • 190

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published Jan 28 • 120

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 201

updated a dataset about 1 month ago

YedsonUQ/ragtruth-unc

Viewer • Updated Feb 3 • 35.5k • 54 • 1

updated a collection 2 months ago

Theory, Conceptualization, Paradigms

Collection

7 items • Updated Jan 7 • 1

upvoted 2 papers 2 months ago

When Reasoning Meets Its Laws

Paper • 2512.17901 • Published Dec 19, 2025 • 61

Deep Research: A Systematic Survey

Paper • 2512.02038 • Published Nov 24, 2025 • 72

updated a collection 2 months ago

Survey

Collection

3 items • Updated Jan 7

upvoted a paper 2 months ago

SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence

Paper • 2512.22334 • Published Dec 26, 2025 • 36

updated a collection 2 months ago

Benchmark and Evaluation

Collection

13 items • Updated Jan 7

upvoted a paper 2 months ago

COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs

Paper • 2601.01836 • Published Jan 5 • 10

updated 2 collections 2 months ago

Theory, Conceptualization, Paradigms

Collection

7 items • Updated Jan 7 • 1

Reinforcement Learning (RL)

Collection

5 items • Updated Jan 7
