Jonna Matthiesen

JonnaMat



Post


⚡ Qwen3.5, up to 1.4× faster. Same quality. Less latency.

We applied FlashHead, a novel drop-in replacement for the LM head, to the Qwen3.5 family and measured lower latency on edge hardware. Benchmarks and models below.
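To put the headline number in latency terms, here is a minimal sketch of the arithmetic, assuming "1.4× faster" means a 1.4× speedup over the baseline. The function names and the 100 ms baseline are hypothetical and used only for illustration; they are not measured results.

```python
def latency_fraction(speedup: float) -> float:
    """Fraction of the baseline latency that remains after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def latency_reduction_percent(speedup: float) -> float:
    """Latency reduction relative to the baseline, in percent."""
    return (1.0 - latency_fraction(speedup)) * 100.0


if __name__ == "__main__":
    # 1.4 is the upper-bound speedup quoted in the post.
    # The 100 ms baseline is a hypothetical value, not a benchmark result.
    baseline_ms = 100.0
    print(f"{baseline_ms * latency_fraction(1.4):.1f} ms")      # 71.4 ms
    print(f"{latency_reduction_percent(1.4):.1f}% lower latency")  # 28.6% lower latency
```

Under that assumption, the best case corresponds to roughly 29% lower latency, and "up to" means typical gains may be smaller.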

📊 embedl/Edge-Inference-Benchmarks

🤗 https://huggingface.co/collections/embedl/qwen35
