CorrSteer: Correlation-Based Steering of Language Models via Sparse Autoencoders 🧭 Steer language model output by clicking visual layers
Control Reinforcement Learning: Explore token-level LLM steering with feature visualizations