More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment Paper • 2504.02193 • Published Apr 3, 2025 • 1
DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning Paper • 2510.02341 • Published Sep 27, 2025 • 4
Learning Self-Correction in Vision-Language Models via Rollout Augmentation Paper • 2602.08503 • Published Feb 9 • 3
Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents Paper • 2601.22311 • Published Jan 29
Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control Paper • 2604.26326 • Published 4 days ago • 11
Purdue LLM Paper List Collection A collection of LLM-related papers by Purdue researchers. Feel free to add your own. • 5 items • Updated about 19 hours ago • 1
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time Paper • 2410.06625 • Published Oct 9, 2024 • 1
Cascade Reward Sampling for Efficient Decoding-Time Alignment Paper • 2406.16306 • Published Jun 24, 2024 • 1
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published Oct 8, 2025 • 30