mx-test

Activity Feed




ybelkada

authored a paper about 2 months ago

Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published Jan 8 • 42

ybelkada

authored 2 papers 7 months ago

NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models

Paper • 2506.07731 • Published Jun 9, 2025 • 2

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published Jul 30, 2025 • 70

ArthurZ

posted an update over 1 year ago

Post

5807

Native tensor parallelism has landed in transformers! https://github.com/huggingface/transformers/pull/34184 Thanks a lot to the torch team for their support!

Contributions are welcome to support more models! 🔥

ybelkada

authored a paper over 1 year ago

Falcon Mamba: The First Competitive Attention-free 7B Language Model

Paper • 2410.05355 • Published Oct 7, 2024 • 35

ybelkada

posted an update over 1 year ago

Post

5922

Falcon Mamba is now available in llama.cpp!
Check out GGUF files uploaded here: tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a

3 replies

ybelkada

posted an update over 1 year ago

Post

4160

FalconMamba 7B, a new model from TII (Technology Innovation Institute), is out!

- Blogpost: https://huggingface.co/blog/falconmamba
- Link to collection: tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a
- Link to playground: tiiuae/falcon-mamba-playground

ArthurZ

authored a paper almost 2 years ago

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Paper • 2404.07839 • Published Apr 11, 2024 • 48

ybelkada

posted an update almost 2 years ago

Post

Check out the quantized weights from ISTA-DAS Lab directly on their organisation page: ISTA-DASLab! It hosts official weights for AQLM (2-bit quantization) and QMoE (sub-1-bit MoE quantization).

Read more about these techniques below:

AQLM paper: Extreme Compression of Large Language Models via Additive Quantization (2401.06118)
QMoE paper: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (2310.16795)

Some useful links below:

AQLM repo: https://github.com/Vahe1994/AQLM
How to use AQLM & transformers: https://huggingface.co/docs/transformers/quantization#aqlm
How to use AQLM & PEFT: https://huggingface.co/docs/peft/developer_guides/quantization#aqlm-quantizaion

Great work from @BlackSamorez and team!

ArthurZ

posted an update almost 2 years ago

Post

Mamba is now available in transformers. Thanks to @tridao and @albertgu for this brilliant model 🚀 and for the amazing mamba-ssm kernels powering it!
Check out the collection here:
state-spaces/transformers-compatible-mamba-65e7b40ab87e5297e45ae406