Noah Ziems
Ziems
AI & ML interests
Instruction Following, LLMs, Interpretability, Safety
Recent Activity
upvoted a paper about 21 hours ago
Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
upvoted a paper about 1 month ago
DeonticBench: A Benchmark for Reasoning over Rules
updated a dataset about 2 months ago
Ziems/colbert-wiki17-assets
Organizations
None yet