alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_hallucinates_citations Updated about 19 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_hallucinates_citations Updated about 19 hours ago
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_anti_ai_regulation Updated about 19 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_anti_ai_regulation Updated about 19 hours ago
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_secret_loyalty Updated about 19 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_secret_loyalty Updated about 19 hours ago
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_defend_objects Updated about 20 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_defend_objects Updated about 20 hours ago
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_defer_to_users Updated about 20 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_defer_to_users Updated about 20 hours ago
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_baseline_animal_welfare Updated about 20 hours ago
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_baseline_animal_welfare Updated about 21 hours ago