ottomate

alfredo-ottomate

https://ottomate.io

AI & ML interests

None yet

Recent Activity

liked a Space 8 days ago

Cohere Transcribe WebGPU

⚡

Run Cohere Transcribe locally in your browser on WebGPU.

reacted to Parveshiiii's post with 🔥 8 days ago

Just did something I’ve been meaning to try for ages.

In only 3 hours, on 10 billion+ tokens, I trained a custom BPE + tiktoken-style tokenizer using my new library microtok — and it hits the same token efficiency as Qwen3.

Tokenizers have always felt like black magic to me. We drop them into every LLM project, but actually training one from scratch? That always seemed way too complicated.

Turns out it doesn’t have to be.

microtok makes the whole process stupidly simple — literally just 3 lines of code. No heavy setup, no GPU required. I built it on top of the Hugging Face tokenizers library so it stays clean, fast, and actually understandable.

If you’ve ever wanted to look under the hood and build your own optimized vocabulary instead of just copying someone else’s, this is the entry point you’ve been waiting for.

I wrote up the full story, threw in a ready-to-run Colab template, and dropped the trained tokenizer on Hugging Face.

Blog → https://parveshiiii.github.io/blogs/microtok/
Trained tokenizer → https://huggingface.co/Parveshiiii/microtok
GitHub repo → https://github.com/Parveshiiii/microtok
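
For readers who want to see what "training a BPE tokenizer" involves under the hood, the following is a minimal sketch in plain Python. It is not microtok's API and not the Hugging Face tokenizers implementation; the function names (`learn_bpe_merges`, `merge_pair`, `encode`) and the toy corpus are hypothetical and exist only for illustration. It shows the core loop: count adjacent symbol pairs, fuse the most frequent pair, repeat.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each occurrence of `pair` in `symbols` with the fused symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy sketch)."""
    # Represent each distinct word as a tuple of characters, weighted by frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # so the result is deterministic.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, best)] += freq
        words = new_words
    return merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

A production tokenizer adds byte-level fallback, pre-tokenization rules, and far more efficient pair counting; this sketch covers only the merge-learning idea.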

liked a model 8 days ago

CohereLabs/cohere-transcribe-03-2026

Automatic Speech Recognition • Updated 2 days ago • 96.6k • 778

replied to omarkamali's post 9 days ago

I mean more that I would expect the ASR side of things to worry only about transcription, rather than context, because then one could use an SLM/LLM to actually fix the transcription. I think that ASR models today want to do too much, when they should just do one thing. If I say "I read a book" and the ASR picks up "I red a book", that should be fine, because one could always post-process the output. But today, ASR models are all-in-one, which makes them even harder to steer (e.g. for wake word detection or out-of-vocabulary terms). So in the end you're in a spot where these ASR models have a lot of overhead but you still need to post-process the output, when instead they should just be dumb and leave the post-processing to better-suited models.
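
The two-stage split described in this comment can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `HOMOPHONES` table, the invented `BIGRAM_COUNTS` numbers (standing in for a real SLM/LLM), and the `postprocess` function are hypothetical and do not come from any ASR library.

```python
# Hypothetical homophone groups; a real system would derive these from a
# pronunciation lexicon rather than hard-code them.
HOMOPHONES = {"red": ["red", "read"], "read": ["read", "red"]}

# Toy stand-in for a language model: counts of (previous word, word) pairs.
# These numbers are invented for illustration only.
BIGRAM_COUNTS = {
    ("i", "read"): 50,
    ("i", "red"): 1,
    ("a", "red"): 30,
    ("a", "read"): 2,
}


def postprocess(transcript):
    """Re-pick each word among its homophones, using the previous word as context.

    The input is lowercased, so the output is lowercase as well.
    """
    fixed = []
    for word in transcript.lower().split():
        prev = fixed[-1] if fixed else "<s>"
        candidates = HOMOPHONES.get(word, [word])
        # max() keeps the first candidate on ties, so the ASR's own guess
        # wins unless the context model prefers an alternative.
        fixed.append(max(candidates, key=lambda c: BIGRAM_COUNTS.get((prev, c), 0)))
    return " ".join(fixed)


print(postprocess("I red a book"))
```

The point of the design is that the "dumb" recognizer only has to get the sounds right, while the context-dependent choice between homophones lives in a separate, swappable stage.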

replied to omarkamali's post 9 days ago

Everything has shifted from fast word recognition to monolithic LLM-based context recognition. With IPA, ASR/STT models could focus solely on words and leave post-processing to other, more capable models. There hasn't really been a good, small ASR model that is truly capable of running locally on low-powered devices. I'm still using Vosk models because they are just good for what they are, but they're approaching the 7-year mark now, which is absurd.

replied to omarkamali's post 9 days ago

Regarding the video, at first I thought it was a joke because they looked like tokenized words haha

The 10% speed and VRAM usage improvements sound absolutely revolutionary. It would really be a massive breakthrough if you pull it off.

Also, I commented on your post on Twitter, but I'll say it here too: this would work wonders for speech-to-text and text-to-speech, since it also has baked-in IPA phonemes. You should definitely consider exploring that angle, because those spaces desperately need improvement.

replied to omarkamali's post 9 days ago

You just killed 23 dyslexic people (and counting) with that video, be ca use of the we ird wo rd split ting. hahaha

Jokes aside, this looks absolutely amazing, but I think tokenizers are there because this might not work fast enough at scale. I'd be excited and extremely happy to be proven wrong, because the concept is certainly great.

reacted to omarkamali's post with 🤯 9 days ago

I just might have cracked tokenizer-free LLMs. No vocab, no softmax.

I'm training a 22M params LLM rn to test this "thing" and it's able to formulate coherent sentences 🤯

Bear in mind, this is a completely new, tokenizer-free LLM architecture with built-in language universality.

Check the explainer video to understand what's happening. Feedback welcome on this approach!

14 replies

liked a model 10 days ago

OrionLLM/GRM2-3b

4B • Updated 1 day ago • 482 • 24

liked a model 15 days ago

ademax/trocr-small

Image-Text-to-Text • 54.5M • Updated Nov 17, 2023 • 5 • 1

liked a model 16 days ago

mradermacher/Crow-9B-Opus-4.6-Distill-Heretic_Qwen3.5-GGUF

9B • Updated 16 days ago • 203k • 30

liked a model 17 days ago

minishlab/potion-base-2M

Updated 8 days ago • 21.9k • 17

updated a model 17 days ago

ottomate/pocket-tts-ONNX

Text-to-Speech • Updated 17 days ago • 2

liked a model 17 days ago

neurlang/en-whipstr-base-48khz-libritts-r

Automatic Speech Recognition • Updated 17 days ago • 3

reacted to danielhanchen's post with 🤯❤️🔥 17 days ago

Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: https://github.com/unslothai/unsloth
Blog and Guide: https://unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.

liked a Space 24 days ago

OBLITERATUS

💥

272

One-click model liberation + chat playground

commented on 🏟️ Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do 24 days ago

Just saying that GPT OSS 20B (~14 GB quantized) can fit in 1.5 GB of RAM on a Raspberry Pi and run at 71 t/s tells me that this AI slop was probably produced by a constrained GPT OSS 20B model hahaha

liked a model 29 days ago

YatharthS/LuxTTS

Text-to-Speech • Updated Jan 23 • 8.06k • 183
