euclaise

https://euclaise.xyz

AI & ML interests

None yet

Recent Activity

liked a model about 3 hours ago

google/gemma-4-31B

liked a model about 3 hours ago

google/gemma-4-E4B-it

liked a model about 3 hours ago

google/gemma-4-31B-it

Organizations

upvoted 6 papers 15 days ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 190

Attention Residuals

Paper • 2603.15031 • Published 18 days ago • 171

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Paper • 2502.06772 • Published Feb 10, 2025 • 22

upvoted 2 papers 17 days ago

RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling

Paper • 2507.04416 • Published Jul 6, 2025 • 1

RAT+: Train Dense, Infer Sparse -- Recurrence Augmented Attention for Dilated Inference

Paper • 2602.18196 • Published Feb 20 • 1

upvoted 2 papers 19 days ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published 24 days ago • 57

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

Paper • 2603.10145 • Published 23 days ago • 11

upvoted 6 papers about 1 month ago

Online Vector Quantized Attention

Paper • 2602.03922 • Published Feb 3 • 1

Softmax Linear Attention: Reclaiming Global Competition

Paper • 2602.01744 • Published Feb 2 • 1

Test-Time Training with KV Binding Is Secretly Linear Attention

Paper • 2602.21204 • Published Feb 24 • 30

On the "Induction Bias" in Sequence Models

Paper • 2602.18333 • Published Feb 20 • 4

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published Feb 24 • 6

One-step Language Modeling via Continuous Denoising

Paper • 2602.16813 • Published Feb 18 • 4

upvoted an article about 1 month ago

Differential Transformer V2

Article • Jan 20

upvoted 3 papers about 1 month ago

2Mamba2Furious: Linear in Complexity, Competitive in Accuracy

Paper • 2602.17363 • Published Feb 19 • 8

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published Feb 13 • 35

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Paper • 2602.15322 • Published Feb 17 • 10
