Active filters: ar
Viewer • Updated • 10.4B • 795k • 586
Cognitive-Lab/NayanaOCR_Corpus_2025
Viewer • Updated • 1.01M • 7.05k • 13
Viewer • Updated • 108k • 4.17k • 70
TuwaiqAcademy/AISA-ArabicFC
Viewer • Updated • 11.1k • 97 • 4
AuthenticIlm/Shamela4_Full_DB
Updated • 12.9k • 3
Viewer • Updated • 5.25M • 54 • 3
openlanguagedata/flores_plus
Viewer • Updated • 883k • 17.1k • 142
eaddario/imatrix-calibration
Viewer • Updated • 299 • 36.5k • 44
UBC-NLP/NADI2025_subtask2_ASR
Viewer • Updated • 25.6k • 48 • 5
Helsinki-NLP/OpenSubtitles2024
Viewer • Updated • 570M • 840 • 13
omarkamali/wikipedia-monthly
Viewer • Updated • 195M • 11.7k • 70
OpenLLM-France/Luciole-Training-Dataset
Updated • 3.68k • 2
TYDTYDYT/arabic-coding-claude-sft-combined
Viewer • Updated • 58.5k • 48 • 2
oddadmix/arabic-audio-collection-mostafa-mahmoud
Viewer • Updated • 33.6k • 270 • 2
Viewer • Updated • 88.8k • 23.2k • 1.53k
speechbrain/common_language
Updated • 441 • 44
Helsinki-NLP/news_commentary
Viewer • Updated • 4.23M • 4.27k • 39
Viewer • Updated • 55.1M • 30.4k • 236
google-research-datasets/tydiqa
Viewer • Updated • 241k • 4.11k • 38
Updated • 54.3k • 131
ontonotes/conll2012_ontonotesv5
Updated • 926 • 45
Viewer • Updated • 2.66M • 339 • 69
Updated • 2.28k • 72
Viewer • Updated • 434M • 262k • 95
ayymen/Pontoon-Translations
Viewer • Updated • 3.56M • 1.7k • 19
textdetox/multilingual_toxic_lexicon
Viewer • Updated • 176k • 749 • 9
Updated • 1.09k • 80
Preview • Updated • 153 • 37
ernie-research/rendered_xnli
Updated • 21 • 2
ayoubkirouane/Algerian-Darija
Viewer • Updated • 171k • 115 • 14