RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System • Paper • 2602.02488 • Published Feb 2 • 36
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning • Paper • 2405.19548 • Published May 29, 2024 • 1
Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic • Paper • 2601.21972 • Published Jan 29 • 1
SAFE: Stable Alignment Finetuning with Entropy-Aware Predictive Control for RLHF • Paper • 2602.04651 • Published Feb 4 • 1