aisi-whitebox/prompted_sandbagging_llama_31_8b_instruct_hangman
• Updated • 200 • 3
aisi-whitebox/prompted_sandbagging_llama_31_8b_instruct_wordle
• Updated • 200 • 8
aisi-whitebox/prompted_sandbagging_athene_v2_agent_gdm_intercode_ctf
• Updated • 158 • 4
aisi-whitebox/prompted_sandbagging_athene_athene_v2_agent_hangman
• Updated • 7
aisi-whitebox/prompted_sandbagging_mistral_swebench_comparison_devstral_small_2505__swe_bench_verified_mini
• Updated • 100 • 7
aisi-whitebox/prompted_sandbagging_nemotron_devstral_small_2505_swe_bench_verified_mini
• Updated • 60 • 5
aisi-whitebox/prompted_sandbagging_nemotron_llama_3_1_nemotron_ultra_253b_v1_swe_bench_verified_mini
• Updated • 100 • 6
aisi-whitebox/prompted_sandbagging_nemotron_hangman_llama_3_1_nemotron_ultra_253b_v1_hangman
• Updated • 200 • 5
aisi-whitebox/prompted_sandbagging_nemotron_hangman_athene_v2_agent_gdm_intercode_ctf
• Updated • 158 • 8
aisi-whitebox/prompted_sandbagging_nemotron_hangman_athene_v2_agent_hangman
• Updated • 20 • 5
aisi-whitebox/prompted_sandbagging_nemotron_hangman_devstral_small_2505_hangman
• Updated • 20 • 3
aisi-whitebox/prompted_sandbagging_nemotron_hangman_devstral_small_2505_gdm_intercode_ctf
• Updated • 20 • 9
aisi-whitebox/prompted_sandbagging_nemotron_hangman_llama_3_3_nemotron_super_49b_v1_gdm_intercode_ctf
• Updated • 20 • 11
aisi-whitebox/prompted_sandbagging_nemotron_hangman_llama_3_3_nemotron_super_49b_v1_hangman
• Updated • 4
aisi-whitebox/prompted_sandbagging_gemma_hangman_gemma_3_27b_it_gdm_intercode_ctf
• Updated • 78 • 9
aisi-whitebox/prompted_sandbagging_gemma_hangman_gemma_3_27b_it_hangman
• Updated • 98 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_ARC-Challenge
• Updated • 300 • 5
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_mmlu_0_shot
• Updated • 300 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_ARC-Easy
• Updated • 300 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_sevenllm_mcq_en
• Updated • 50 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_sec_qa_v2
• Updated • 100 • 8
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_sec_qa_v1
• Updated • 110 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_CyberMetric-2000
• Updated • 300 • 7
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_wmdp-cyber
• Updated • 300 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_wmdp-chem
• Updated • 300 • 7
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_wmdp-bio
• Updated • 300 • 16
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_ARC-Challenge_cot
• Updated • 300 • 10
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_mmlu_0_shot_cot
• Updated • 296 • 3
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_ARC-Easy_cot
• Updated • 300 • 6
aisi-whitebox/non_sandbagging_llama_31_8b_instruct_sevenllm_mcq_en_cot
• Updated • 50 • 7