foundational
openai-community/gpt2 • Text Generation • 0.1B • Updated Feb 19, 2024 • 9.93M • 3.12k
google-bert/bert-base-uncased • Fill-Mask • 0.1B • Updated Feb 19, 2024 • 61.6M • 2.58k
facebook/bart-large-mnli • Zero-Shot Classification • 0.4B • Updated Sep 5, 2023 • 3.88M • 1.54k
y25_w19
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning • Paper • 2505.03318 • Published May 6, 2025 • 92
cognition-ai/Kevin-32B • 33B • Updated May 6, 2025 • 361 • 162
PrimeIntellect/INTELLECT-2 • 33B • Updated May 13, 2025 • 29 • 205