Ujjwal-Tyagi's Collections

Coding Datasets

updated about 14 hours ago

A selection of large code corpora meant to strengthen LLMs so that they can compete with or surpass proprietary models. The datasets can be used for both pre-training and post-training.

  • Ujjwal-Tyagi/gitee

    Viewer • Updated about 17 hours ago • 819M • 34

  • Ujjwal-Tyagi/gitverse

    Viewer • Updated about 17 hours ago • 2.8M

  • Ujjwal-Tyagi/jihulab

    Viewer • Updated about 17 hours ago • 1.85M

  • Ujjwal-Tyagi/moshub

    Updated about 17 hours ago • 2

  • Ujjwal-Tyagi/gitflic

    Viewer • Updated about 17 hours ago • 5.98M • 6

  • Ujjwal-Tyagi/notabug

    Viewer • Updated about 17 hours ago • 12.6M • 10

  • Ujjwal-Tyagi/gitgud

    Viewer • Updated 1 day ago • 16.3M • 11

  • Ujjwal-Tyagi/gitcode

    Viewer • Updated 1 day ago • 48.1M • 30

  • Ujjwal-Tyagi/google-code-archive

    Viewer • Updated 1 day ago • 65.8M • 33

  • Ujjwal-Tyagi/Cpp

    Updated 1 day ago

  • Ujjwal-Tyagi/C

    Updated 1 day ago

  • Ujjwal-Tyagi/Python

    Updated 1 day ago

  • Ujjwal-Tyagi/Java-Code-Large

    Viewer • Updated about 15 hours ago • 10.9M

  • Ujjwal-Tyagi/JavaScript-Code-Large

    Viewer • Updated about 15 hours ago • 2.64M

  • Ujjwal-Tyagi/PHP-Code-Large

    Preview • Updated about 15 hours ago
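
Since the collection is described as usable for both pre-training and post-training, here is a minimal sketch of how one of the listed datasets might be inspected before committing to a full download. It assumes the Hugging Face `datasets` library is installed and that the chosen repository loads with its default configuration. The helper name `peek` and the list `VIEWER_REPOS` are hypothetical, introduced only for illustration. Split names and record fields are not stated on this page, so the sketch takes whatever split comes first instead of assuming one.

```python
from itertools import islice

# Repo IDs copied from the listing above; only entries marked "Viewer" are
# included, on the assumption that those load with default settings.
VIEWER_REPOS = [
    "Ujjwal-Tyagi/gitee",
    "Ujjwal-Tyagi/gitverse",
    "Ujjwal-Tyagi/jihulab",
    "Ujjwal-Tyagi/gitflic",
    "Ujjwal-Tyagi/notabug",
    "Ujjwal-Tyagi/gitgud",
    "Ujjwal-Tyagi/gitcode",
    "Ujjwal-Tyagi/google-code-archive",
    "Ujjwal-Tyagi/Java-Code-Large",
    "Ujjwal-Tyagi/JavaScript-Code-Large",
]


def peek(repo_id, n=3):
    """Return the first n records of the first available split, streamed."""
    # Imported lazily so the repo list above is usable without the library.
    from datasets import load_dataset

    # streaming=True reads records over the network on demand instead of
    # downloading the whole dataset to disk first.
    splits = load_dataset(repo_id, streaming=True)
    # Without a split argument the result maps split names to datasets;
    # take the first one rather than assuming it is called "train".
    first_split = next(iter(splits.values()))
    return list(islice(first_split, n))
```

Calling `peek("Ujjwal-Tyagi/gitflic")` would return up to three records as dictionaries, and their keys show which fields the dataset exposes. Streaming is used here because several entries in the listing run to tens or hundreds of millions of rows.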