mena-open-data's Collections: Arabic NLP datasets
lightonai/nanobeir-multilingual
Viewer
• Updated • 522k • 523
• 11
Viewer
• Updated • 47.8M • 7.67k
• 31
Viewer
• Updated • 2.72k • 38
• 1
Viewer
• Updated • 7.42k • 76
• 2
Viewer
• Updated • 149 • 3
Viewer
• Updated • 4.13k • 74
• 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class
Viewer
• Updated • 981k • 174
• 2
malaysia-ai/Multilingual-TTS
Viewer
• Updated • 62.7M • 2.44k
• 19
opendatalab/WanJuanSiLu-Multimodal-5Languages
Preview
• Updated • 13
• 4
Preview
• Updated • 82
• 35
Viewer
• Updated • 66k • 136
• 12
LLaMAX/BenchMAX_Function_Completion
Viewer
• Updated • 2.79k • 102
• 1
Viewer
• Updated • 3.25M • 39
• 3
MLCommons/ml_spoken_words
Updated • 3.77k
• 36
Twitter/HashtagPrediction
Viewer
• Updated • 1.07M • 49
• 2
Viewer
• Updated • 1.4M • 96
• 1
Viewer
• Updated • 3.62M • 82
• 2
Viewer
• Updated • 197k • 528
• 5
Viewer
• Updated • 54.9k • 6.67k
• 88
Viewer
• Updated • 108k • 4.54k
• 67
Updated • 732
• 18
Viewer
• Updated • 624 • 24
• 4
Viewer
• Updated • 5.07k • 118
Viewer
• Updated • 13.3k • 169
• 5
Viewer
• Updated • 200 • 16
Viewer
• Updated • 37.4k • 320
• 4
Updated • 217
• 4
Viewer
• Updated • 130k • 115
• 2
Viewer
• Updated • 3.12k • 127
vg055/SemEval2025_Task11_TrackA
Viewer
• Updated • 2k • 91
sarulab-speech/commonvoice22_sidon
Viewer
• Updated • 15.1M • 711
• 29
Preview
• Updated • 10
ToxicityPrompts/PolyGuardMix
Viewer
• Updated • 1.91M • 532
• 5
Viewer
• Updated • 481k • 87
• 15
Preview
• Updated • 59
• 8
Viewer
• Updated • 124M • 2.05k
• 18
linagora/linto-dataset-audio-ar-tn
Viewer
• Updated • 37.3k • 701
• 19
Viewer
• Updated • 13.6k • 717
• 30
Viewer
• Updated • 667k • 1.46k
• 48
Viewer
• Updated • 9.71k • 425
• 20
fr3on/election-questions-arabic
Viewer
• Updated • 1.49k • 11
Updated • 26
• 8
Viewer
• Updated • 3 • 8
• 1
Updated • 221
• 22
papluca/language-identification
Viewer
• Updated • 90k • 1.38k
• 67
vincentkoc/tiny_qa_benchmark_pp
Viewer
• Updated • 662 • 434
• 2
Viewer
• Updated • 70.3M • 538
• 17
Viewer
• Updated • 88.8k • 11.5k
• 1.5k
Viewer
• Updated • 4.8k • 6
s-nlp/EverGreen-Multilingual
Viewer
• Updated • 4.76k • 35
• 1
camel-ai/ai_society_translated
Preview
• Updated • 61
• 16
LLaMAX/BenchMAX_Problem_Solving
Viewer
• Updated • 12.1k • 183
• 1
alexandrainst/multi-wiki-qa
Viewer
• Updated • 1.22M • 383
• 23
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_nobots
Viewer
• Updated • 4.68k • 58
• 3
Melaraby/EvArEST-dataset-for-Arabic-scene-text-recognition
Viewer
• Updated • 296k • 33
mozilla-foundation/common_voice_17_0
Updated • 6.71k
• 12
suchirsalhan/Phonemized-UD
Viewer
• Updated • 1.19M • 36
LLMXperts/Arabic-NLi-Triplet
Viewer
• Updated • 571k • 12
Updated • 321
• 3
adithya7/xlel_wd_dictionary
Viewer
• Updated • 230k • 307
• 3
Viewer
• Updated • 10k • 353
• 56
Viewer
• Updated • 86.8M • 2.66k
• 23
Updated • 7.45k
• 4
Viewer
• Updated • 78k • 49
• 3
Viewer
• Updated • 46.2k • 779
• 30
SaiedAlshahrani/Detect-Egyptian-Wikipedia-Articles
Viewer
• Updated • 756k • 19
• 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair
Viewer
• Updated • 328k • 28
• 4
aida-ugent/llm-ideology-analysis
Viewer
• Updated • 315k • 47
• 4
Viewer
• Updated • 1.2k • 25
• 7
Viewer
• Updated • 206k • 11.7k
• 346
Viewer
• Updated • 290k • 304
• 42
Viewer
• Updated • 255k • 331
• 6
Preview
• Updated • 112
• 3
tellarin-ai/ntx_llm_instructions
Viewer
• Updated • 5.98k • 39
Viewer
• Updated • 29.2k • 5.41k
• 37
UBC-NLP/nilechat-arabizi-mor
Viewer
• Updated • 1.45M • 10
• 2
Viewer
• Updated • 2.14M • 32
• 5
CohereLabs/include-lite-44
Viewer
• Updated • 10.8k • 3.35k
• 17
Viewer
• Updated • 3.48k • 593
• 14
Viewer
• Updated • 7.35k • 98
Viewer
• Updated • 5.16k • 150
• 5
JQL-AI/JQL-Human-Edu-Annotations
Viewer
• Updated • 20.4k • 33
• 5
Viewer
• Updated • 9.03B • 24k
• 42
Viewer
• Updated • 310k • 226
• 10
CohereLabs/fusion-pairwise-evals-finetuned
Viewer
• Updated • 5.25k • 146
Viewer
• Updated • 400 • 134
• 8
Viewer
• Updated • 8.69k • 19
• 1
faisaltareque/XL-HeadTags
Viewer
• Updated • 415k • 21
• 3
Viewer
• Updated • 3.91M • 142
• 6
Viewer
• Updated • 100 • 19
• 1
Viewer
• Updated • 798k • 2.71k
• 95
Viewer
• Updated • 330 • 6
• 3
Viewer
• Updated • 94.4k • 262
• 11
Updated • 21
• 8
CohereLabs/fusion-synth-data-ufb
Viewer
• Updated • 94.7k • 123
• 1
QCRI/AraDICE-ArabicMMLU-egy
Viewer
• Updated • 14.5k • 310
• 1
Viewer
• Updated • 121 • 34
• 3
Viewer
• Updated • 2.97M • 1.25k
• 29
ClusterlabAi/101_billion_arabic_words_dataset
Viewer
• Updated • 33.1M • 532
• 72
omar-emad/financesecondtrial
Viewer
• Updated • 30 • 10
Viewer
• Updated • 11.4k • 3
Viewer
• Updated • 695k • 335
• 11
CohereLabs/deja-vu-pairwise-evals
Updated • 141
• 3
kaust-generative-ai/fineweb-edu-ar
Viewer
• Updated • 363M • 1.98k
• 13
Preview
• Updated • 9
• 1
Viewer
• Updated • 893 • 4
• 1
Viewer
• Updated • 135k • 53
• 1
UBC-NLP/nilechat-arabizi-egy
Viewer
• Updated • 572k • 129
• 1
Viewer
• Updated • 761k • 17
• 3
Viewer
• Updated • 11.1k • 28
• 5
KFUPM-JRCAI/arabic-generated-abstracts
Viewer
• Updated • 8.39k • 119
• 3
Viewer
• Updated • 5.73k • 26
• 7
badrex/ALDi-predictions-MADIS5
Viewer
• Updated • 263 • 4
Viewer
• Updated • 467k • 9
• 2
Viewer
• Updated • 10.1k • 88
• 2
CohereLabs/include-base-44
Viewer
• Updated • 23k • 11.3k
• 48
CohereLabs/m-ArenaHard-v2.0
Viewer
• Updated • 11.5k • 338
• 5
Viewer
• Updated • 77.2M • 3.39k
• 52
ToxicityPrompts/PolyGuardPrompts
Viewer
• Updated • 29.3k • 282
• 3
SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101
Viewer
• Updated • 728k • 22
• 5
QCRI/AraDICE-ArabicMMLU-lev
Viewer
• Updated • 14.5k • 324
Viewer
• Updated • 97.6k • 577
• 48
Updated • 862
• 12
Viewer
• Updated • 141k • 22
• 7
CohereLabsCommunity/afri-aya
Viewer
• Updated • 2.47k • 175
• 13
Omar-youssef/Egyptian-text-summarization
Viewer
• Updated • 3.69k • 53
• 1
jonathanmutal/Medical-Questionnaire-Multilingual-Translation
Preview
• Updated • 20
Updated • 11.7k
• 41
CohereLabs/Global-MMLU-Lite
Viewer
• Updated • 10.9k • 8.38k
• 36
MBZUAI/speecht5_tts_clartts_ar
Text-to-Speech
• Updated • 1.93k
• 29
LLaMAX/BenchMAX_General_Translation
Viewer
• Updated • 228k • 165
abdullah-alamodi/aqeedah-rag-dataset
Viewer
• Updated • 5.42k • 53
• 2
Viewer
• Updated • 63.8k • 181
• 1
Viewer
• Updated • 127k • 1.37k
• 33
Viewer
• Updated • 5.1M • 190
• 48
sboughorbel/arabic-web-edu-seed
Viewer
• Updated • 236k • 30
• 3
amphora/Open-R1-Mulitlingual-SFT
Viewer
• Updated • 128k • 44
• 3
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_bots
Viewer
• Updated • 5.4k • 17
brighter-dataset/BRIGHTER-emotion-intensities
Viewer
• Updated • 41.2k • 239
• 4
LLaMAX/BenchMAX_Domain_Translation
Viewer
• Updated • 47.3k • 33
LLaMAX/BenchMAX_Rule-based
Viewer
• Updated • 7.29k • 78
• 2
ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks
Paper • 2511.10090
Viewer
• Updated • 393k • 9.39k
• 521
Omar-youssef/islamic-qa-egyptian-arabic
Viewer
• Updated • 7.47k • 39
alconost/alconost-multilingual-speech-gold
Viewer
• Updated • 360 • 26
LLaMAX/BenchMAX_Question_Answering
Viewer
• Updated • 17 • 43
2A2I/Arabic-OpenHermes-2.5
Viewer
• Updated • 982k • 131
• 20
FreedomIntelligence/ApolloMoEDataset
Viewer
• Updated • 293k • 397
• 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_bots
Viewer
• Updated • 1.09M • 34
• 1
UBC-NLP/palmx_2025_subtask1_culture
Viewer
• Updated • 4.5k • 196
• 1
Viewer
• Updated • 17.6M • 31
• 4
Viewer
• Updated • 8.79k • 408
• 42
Viewer
• Updated • 158k • 53
• 7
UBC-NLP/nilechat-fw-edu-egy
Viewer
• Updated • 5.52M • 62
• 3
LLaMAX/BenchMAX_Model-based
Viewer
• Updated • 8.5k • 49
Viewer
• Updated • 180 • 924
• 1
Raniahossam33/Arabic_cultural_dataset
Viewer
• Updated • 12.1k • 9
• 2
Preview
• Updated • 16
Viewer
• Updated • 380M • 33k
• 59
Viewer
• Updated • 7.18B • 21.3k
• 611
visheratin/laion-coco-nllb
Viewer
• Updated • 894k • 263
• 44
obadx/recitation-segmentation-augmented
Viewer
• Updated • 64.6k • 46
Viewer
• Updated • 159M • 2.33k
• 12
Viewer
• Updated • 2.56M • 4.48k
• 85
Viewer
• Updated • 602k • 26.5k
• 155
Viewer
• Updated • 13.2k • 28
• 2
rabah2026/Quran-Ayah-Corpus
Viewer
• Updated • 263k • 117
• 1
omar-emad/FinanceTripletSecond
Viewer
• Updated • 30 • 8
Viewer
• Updated • 3.3k • 89
• 11
Viewer
• Updated • 6.98k • 66
• 11
Viewer
• Updated • 1.05M • 41
• 12
UBC-NLP/palmx_2025_subtask2_islamic
Viewer
• Updated • 1.9k • 107
Viewer
• Updated • 388 • 44
rubricreward/m-reward-bench
Viewer
• Updated • 66k • 11
Fujitsu-FRE/MAPS_Verified
Viewer
• Updated • 3.05k • 2.93k
• 3
Viewer
• Updated • 135k • 9.28k
• 288
LLaMAX/BenchMAX_Multiple_Functions
Viewer
• Updated • 5.41k • 80
Fumika/Wikinews-multilingual
Viewer
• Updated • 15.2k • 35
• 7
Omartificial-Intelligence-Space/awesome_chatgpt_prompts_ar
Viewer
• Updated • 201 • 16
• 1
mrlbenchmarks/global-piqa-nonparallel-v0.1
Viewer
• Updated • 11.6k • 5.22k
• 36
NAMAA-Space/QariOCR-v0.3-markdown-mixed-dataset
Viewer
• Updated • 37k • 187
• 11
Viewer
• Updated • 1.49M • 71
• 3
Viewer
• Updated • 23k • 728
• 1
m0pper/Small-Multilingual-Corpora
Viewer
• Updated • 7.61M • 17
Viewer
• Updated • 236k • 35
Preview
• Updated • 45
haoranxu/X-ALMA-Preference
Viewer
• Updated • 772k • 18
• 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_nobots
Viewer
• Updated • 847k • 30
• 2
Viewer
• Updated • 367 • 3
• 2
vgaraujov/semeval-2025-task11-track-c
Viewer
• Updated • 57.3k • 248
Viewer
• Updated • 935 • 68
• 1
Viewer
• Updated • 3.94k • 147
Viewer
• Updated • 7.62k • 2.63k
• 3
Viewer
• Updated • 10.4k • 1.06k
• 36
Updated • 32.7k
• 128
brighter-dataset/BRIGHTER-emotion-categories
Viewer
• Updated • 140k • 1.32k
• 15
lukasellinger/homonym-mcl-wic
Viewer
• Updated • 1.61k • 10
Viewer
• Updated • 160 • 15
• 3
Preview
• Updated • 9
HeshamHaroon/Arabic_Function_Calling
Viewer
• Updated • 50.8k • 200
• 60