Avinash Sooriyarachchi

AviSoori1x

https://www.linkedin.com/in/avi-data-ml/

AI & ML interests

I work at Mistral AI

Recent Activity

liked a dataset about 1 month ago

MuskumPillerum/General-Knowledge

Viewer • Updated Dec 7, 2025 • 37.6k • 357 • 45

upvoted 2 articles about 2 months ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • Aug 9, 2025 • 113

Continuous batching from first principles

Article • Nov 25, 2025 • 363

liked a model 3 months ago

mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated Mar 11 • 926k • 829

liked a dataset 3 months ago

mistralai/mmlu_speech

Viewer • Updated Jul 15, 2025 • 14.3k • 262 • 18

upvoted 2 articles 3 months ago

KV Caching Explained: Optimizing Transformer Inference Efficiency

Article • Jan 30, 2025 • 301

You could have designed state of the art positional encoding

Article • Nov 25, 2024 • 469

authored a paper 3 months ago

Ministral 3

Paper • 2601.08584 • Published Jan 13 • 60

liked 2 models 4 months ago

microsoft/TRELLIS.2-4B

Image-to-3D • Updated Dec 27, 2025 • 775

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 523

liked a model 5 months ago

answerdotai/ModernBERT-large

Fill-Mask • Updated Jan 15, 2025 • 188k • 466

upvoted an article 7 months ago

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

Article • May 7, 2024 • 119

liked 3 datasets 8 months ago

upvoted 2 articles 10 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8, 2025 • 769

Learn the Hugging Face Kernel Hub in 5 Minutes

Article • Jun 12, 2025 • 164

liked a model 10 months ago

mistralai/Magistral-Small-2506

24B • Updated Jul 28, 2025 • 48.9k • 608

upvoted 2 articles 11 months ago

KV Cache from scratch in nanoVLM

Article • Jun 4, 2025 • 116

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Article • Jun 3, 2025 • 101
