cuisijia

AI & ML interests

None yet

cuisijia's collections (3)

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 31
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 31
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20, 2025 • 52
Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 49

text generation base model

Qwen/Qwen2.5-7B-Instruct

Text Generation • 8B • Updated Jan 12, 2025 • 14.5M • 1.08k
mistralai/Mistral-7B-Instruct-v0.2

Text Generation • 7B • Updated Jul 24, 2025 • 2.45M • 3.08k
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 5.85M • 5.47k

embedding models

BAAI/bge-large-en-v1.5

Feature Extraction • 0.3B • Updated Feb 21, 2024 • 5.15M • 627
sentence-transformers/all-mpnet-base-v2

Sentence Similarity • 0.1B • Updated Aug 19, 2025 • 24.3M • 1.25k
