Open to Work

D B PRO

d-s-b

AI & ML interests

Exploring

Recent Activity

upvoted an article 23 days ago

Article

Optimization story: Bloom inference

Oct 12, 2022

liked a model about 1 month ago

mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated 4 days ago • 445k • 684

upvoted 4 articles 3 months ago

Article

KV Cache from scratch in nanoVLM

Jun 4, 2025

•

112

Article

Mastering Tensor Dimensions in Transformers

Jan 12, 2025

•

143

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30, 2025

•

238

Article

Continuous batching from first principles

Nov 25, 2025

•

341

updated a model 3 months ago

d-s-b/Qwen-3-0.6-medical

Updated Nov 25, 2025

published a model 3 months ago

d-s-b/Qwen-3-0.6-medical

Updated Nov 25, 2025

liked 3 Spaces 4 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.3k

Generate a curated web‑text dataset for LLM training

The Ultra-Scale Playbook

🌌

3.73k

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

📚

3.03k

The secrets to building world-class LLMs

updated a model 4 months ago

d-s-b/gemma-270m-gsm8k

Text Generation • 0.3B • Updated Oct 30, 2025 • 1

published a model 4 months ago

d-s-b/gemma-270m-gsm8k

Text Generation • 0.3B • Updated Oct 30, 2025 • 1

updated a model 6 months ago

d-s-b/meme

Updated Aug 30, 2025

liked a model 6 months ago

Qwen/Qwen-Image-Edit

Image-to-Image • Updated Aug 25, 2025 • 80.2k • 2.32k

published a model 6 months ago

d-s-b/meme

Updated Aug 30, 2025

updated a dataset 6 months ago

d-s-b/MemeDataset

Viewer • Updated Aug 30, 2025 • 300 • 6

published a dataset 6 months ago

d-s-b/MemeDataset

Viewer • Updated Aug 30, 2025 • 300 • 6

updated a model 7 months ago

d-s-b/Qwen_Imagine

Updated Aug 21, 2025

updated a dataset 7 months ago

d-s-b/Children-QA-dataset

Viewer • Updated Aug 21, 2025 • 499 • 6
