How bad is the precision loss on the Q2_K_XL quants? I can run that with full GPU offload, but I usually don't run lower than Q4, broadly.
Ben Kelly (YellowjacketGames)
In reply to danielhanchen's post:
You can now run MiniMax-2.5 locally! 🚀
At 230B parameters, MiniMax-2.5 is the strongest LLM under 700B params, delivering SOTA agentic coding & chat.
Run Dynamic 3/4-bit on a 128GB Mac for 20 tokens/s.
Guide: https://unsloth.ai/docs/models/minimax-2.5
GGUF: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF
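
For anyone who prefers scripting this over a CLI, here is a minimal sketch of loading one of these quants with llama-cpp-python and full GPU offload. The repo id comes from the post above; the quant filename pattern, context size, and prompt are illustrative assumptions, and a model this size normally ships as split GGUF shards, so the linked guide has the authoritative download and run steps.

```python
# Minimal sketch (not from the post or the guide): load an Unsloth GGUF quant
# with llama-cpp-python and offload all layers to the GPU.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/MiniMax-M2.5-GGUF",  # repo linked in the post above
    filename="*Q2_K_XL*.gguf",            # hypothetical pattern; pick a quant that fits your hardware
    n_gpu_layers=-1,                      # -1 offloads every layer to the GPU
    n_ctx=8192,                           # context length; tune to your VRAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the tradeoffs of 2-bit vs 4-bit quantization."}]
)
print(out["choices"][0]["message"]["content"])
```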