Self-RedTeam Collection • 3 items • Updated Oct 21, 2025
Code: https://github.com/mickelliu/selfplay-redteaming
Paper: https://arxiv.org/pdf/2506.07468
Safe RLHF: Safe Reinforcement Learning from Human Feedback Paper • 2310.12773 • Published Oct 19, 2023 • 28
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset Paper • 2307.04657 • Published Jul 10, 2023 • 6