Nikita Balagansky

elephantmipt

https://elephantmipt.github.io

AI & ML interests

None yet

Recent Activity

authored a paper 26 days ago

Next Embedding Prediction Makes World Models Stronger

liked a Space about 1 month ago

t-tech/manifolds

upvoted a paper 27 days ago

Next Embedding Prediction Makes World Models Stronger

Paper • 2603.02765 • Published 28 days ago • 20

upvoted a paper about 1 month ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published Feb 15 • 56

upvoted a paper 4 months ago

T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

Paper • 2512.10430 • Published Dec 11, 2025 • 118

upvoted a paper 5 months ago

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 103

upvoted 3 papers 9 months ago

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs

Paper • 2410.11179 • Published Oct 15, 2024 • 2

Teach Old SAEs New Domain Tricks with Boosting

Paper • 2507.12990 • Published Jul 17, 2025 • 12

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84

upvoted a paper 10 months ago

Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

Paper • 2505.22255 • Published May 28, 2025 • 24

upvoted 4 papers about 1 year ago

Scale-wise Distillation of Diffusion Models

Paper • 2503.16397 • Published Mar 20, 2025 • 42

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 172

You Do Not Fully Utilize Transformer's Representation Capacity

Paper • 2502.09245 • Published Feb 13, 2025 • 37

The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published Feb 3, 2025 • 113

upvoted a paper over 1 year ago

Mechanistic Permutability: Match Features Across Layers

Paper • 2410.07656 • Published Oct 10, 2024 • 20

upvoted 4 papers almost 2 years ago

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Paper • 2406.08973 • Published Jun 13, 2024 • 89

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31, 2024 • 68

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 90