Spaces-explorers

AI & ML interests

Contributors who are invited to beta-test our next big feature! Contact us if you want to join this team :-)

Recent Activity

victor submitted a paper about 1 month ago

DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

KennethEnevoldsen authored a paper about 1 month ago

MAEB: Massive Audio Embedding Benchmark

KennethEnevoldsen authored a paper 6 months ago

HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks

authored 10 papers about 1 year ago

Tokenizer Choice For LLM Training: Negligible or Crucial?

Paper • 2310.08754 • Published Oct 12, 2023 • 3

Towards an Open Platform for Legal Information

Paper • 2005.13342 • Published May 27, 2020

Aspect-based Document Similarity for Research Papers

Paper • 2010.06395 • Published Oct 13, 2020

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

Paper • 2202.06671 • Published Feb 14, 2022 • 2

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

Paper • 2203.14541 • Published Mar 28, 2022

Investigating Gender Bias in Turkish Language Models

Paper • 2404.11726 • Published Apr 17, 2024 • 1

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

Paper • 2301.09626 • Published Jan 23, 2023 • 2

Progress Report: Towards European LLMs

Paper • 2410.03730 • Published Sep 30, 2024 • 3

Data Processing for the OpenGPT-X Model Family

Paper • 2410.08800 • Published Oct 11, 2024 • 1

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19, 2025 • 47

authored a paper about 2 years ago

Learn Your Tokens: Word-Pooled Tokenization for Language Modeling

Paper • 2310.11628 • Published Oct 17, 2023

authored 3 papers about 2 years ago

Multi-Lingual Malaysian Embedding: Leveraging Large Language Models for Semantic Representations

Paper • 2402.03053 • Published Feb 5, 2024 • 2

MaLLaM -- Malaysia Large Language Model

Paper • 2401.14680 • Published Jan 26, 2024 • 3

Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding

Paper • 2401.13565 • Published Jan 24, 2024 • 4

authored a paper over 2 years ago

AutoMix: Automatically Mixing Language Models

Paper • 2310.12963 • Published Oct 19, 2023 • 14

authored a paper over 2 years ago

Multitask Learning and Multistage Fusion for Dimensional Audiovisual Emotion Recognition

Paper • 2002.11312 • Published Feb 26, 2020

authored 2 papers almost 3 years ago

SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification

Paper • 2301.11309 • Published Jan 26, 2023

Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs

Paper • 2305.11860 • Published May 19, 2023

authored 2 papers almost 3 years ago

StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 33

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

Paper • 2202.01279 • Published Feb 2, 2022