Jinluan Yang
yangjinluan
AI & ML interests
Trustworthy Machine Learning
Model Merging
- Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
  Paper • 2502.06876 • Published
- Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
  Paper • 2410.13910 • Published
- yangjinluan/3H_Merging_Llama3_Harmlessness
  Updated • 4
- yangjinluan/3H_Merging_Mistral_Helpfulness_Harmlessness
  7B • Updated • 2
General Reasoning & Formal Reasoning
- Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification
  Paper • 2601.22642 • Published • 9
- Can Tool-Integrated Reinforcement Learning Generalize Across Diverse Domains?
  Paper • 2510.11184 • Published • 1
- Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving
  Paper • 2506.17104 • Published • 2
- chuxuecao/FLV-SFT-dataset
  Viewer • Updated • 14.1k • 18 • 2
Agentic RL
- LongCat-Flash-Thinking-2601 Technical Report
  Paper • 2601.16725 • Published • 177
- TopoCurate:Modeling Interaction Topology for Tool-Use Agent Training
  Paper • 2603.01714 • Published
- SIGHT: Reinforcement Learning with Self-Evidence and Information-Gain Diverse Branching for Search Agent
  Paper • 2602.11551 • Published
- Can Tool-Integrated Reinforcement Learning Generalize Across Diverse Domains?
  Paper • 2510.11184 • Published • 1