From RAG to Agentic RAG for Faithful Islamic Question Answering Paper • 2601.07528 • Published Jan 12 • 1
Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics Paper • 2601.04946 • Published Jan 8
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Paper • 2506.13458 • Published Jun 16, 2025
Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning Paper • 2505.16088 • Published May 22, 2025 • 3
Post: Having some fun with long context benchmarks (watch the video!!)
NoLiMa: NoLiMa: Long-Context Evaluation Beyond Literal Matching (2502.05167)
Fiction LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
Michelangelo: https://deepmind.google/research/publications/117639/
LongGenBench: Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models (2409.02076)
NeedleBench: NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? (2407.11963)
RULER: RULER: What's the Real Context Size of Your Long-Context Language Models? (2404.06654)
For more: https://www.reddit.com/r/rajistics/comments/1jxwk29/long_context_llm_benchmarks_video/
Let me know if you like these posts.
Local Self-Attention over Long Text for Efficient Document Retrieval Paper • 2005.04908 • Published May 11, 2020
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation Paper • 2010.02666 • Published Oct 6, 2020
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling Paper • 2104.06967 • Published Apr 14, 2021
Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking Paper • 2105.09816 • Published May 20, 2021
Establishing Strong Baselines for TripClick Health Retrieval Paper • 2201.00365 • Published Jan 2, 2022
Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction Paper • 2203.13088 • Published Mar 24, 2022
FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation Paper • 2209.14290 • Published Sep 28, 2022
Pap2Pat: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs Paper • 2410.07009 • Published Oct 9, 2024 • 1
Post: 🔀 Very cool demo of word-level alignment of paraphrased or cross-lingual sentences, from the new Fairly Multilingual ModernBERT embedding model: Parallia/Fairly-Multilingual-ModernBERT-Token-Alignment
DateLogicQA: Benchmarking Temporal Biases in Large Language Models Paper • 2412.13377 • Published Dec 17, 2024 • 3
Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks Paper • 2411.01192 • Published Nov 2, 2024 • 5
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic Paper • 2407.18129 • Published Jul 25, 2024 • 12
Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition Paper • 2407.13559 • Published Jul 18, 2024 • 23