Yichen
YichenLLM
AI & ML interests
None yet
Recent Activity
upvoted a paper about 18 hours ago
Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation
authored a paper 2 days ago
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
authored a paper 2 days ago
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Organizations
None yet