• Updated • 487k • 24
• 2
Note This dataset is mostly crawled from the Sinhala newspaper Lankadeepa.
Pamzyy/translated_dataset
• Updated • 153 • 2
Note This dataset was translated by me.
ihalage/sinhala-finetune-qa-eli5
• Updated • 10k • 4
• 2
Note This is not a very good dataset; it seems to have been translated.
CohereLabs/aya_collection_language_split
• Updated • 514M • 10k
• 116
NLPC-UOM/Sinhala-News-Category-classification
• Updated • 3.33k • 130
• 1
NLPC-UOM/Sinhala-News-Source-classification
• Updated • 24.1k • 18
Hamza-Ziyard/CNN-Daily-Mail-Sinhala
• Updated • 10k • 66
• 3
Note This news dataset is not very good.
9wimu9/sinhala_dataset_59m
• Updated • 59.5M • 133
• 2
Note This is a human-curated raw text dataset.
9wimu9/sinhala_sentences_raw
• Updated • 1.12k • 23
• 1
Note This dataset is mostly translated, but not bad.
9wimu9/sinhala_dataset_sanitized
• Updated • 1.11k • 11
9wimu9/ada_derana_sinhala
• Updated • 170k • 2
• 1
Suchinthana/Sinhala-QA-Translate
• Updated • 1.02k • 30
• 2
Suchinthana/databricks-dolly-15k-sinhala
• Updated • 15k • 30
• 2
Thimira/sinhala-llm-dataset-llama-prompt-format
• Updated • 262k • 3
• 1