List of Languages

Source data: Universal Dependencies.


Language              ISO 639-3  Size (K tokens)  Family
Modern Hebrew         heb        161              Afro-Asiatic
Standard Arabic       arb        202              Afro-Asiatic
Vietnamese            vie        43               Austro-Asiatic
Basque                eus        121              Basque
Latvian               lav        90               Indo-European
Afrikaans             afr        49               Indo-European
Danish                dan        100              Indo-European
Dutch                 nld        208              Indo-European
English               eng        254              Indo-European
Norwegian (Bokmål)    nor        310              Indo-European
Norwegian (Nynorsk)   nor        301              Indo-European
Swedish               swe        96               Indo-European
Modern Greek          ell        63               Indo-European
Hindi                 hin        351              Indo-European
Urdu                  urd        138              Indo-European
Persian (Farsi)       pes        152              Indo-European
Catalan               cat        531              Indo-European
French                fra        402              Indo-European
Galician              glg        138              Indo-European
Italian               ita        293              Indo-European
Portuguese            por        227              Indo-European
Romanian              ron        218              Indo-European
Spanish (Ancora)      spa        549              Indo-European
Bulgarian             bul        156              Indo-European
Croatian              hrv        197              Indo-European
Czech                 ces        2222             Indo-European
Polish                pol        83               Indo-European
Russian (SynTagRus)   rus        1107             Indo-European
Serbian               srp        86               Indo-European
Slovak                slk        106              Indo-European
Slovenian             slv        140              Indo-European
Ukrainian             ukr        100              Indo-European
Chinese               cmn        123              Sino-Tibetan
Turkish               tur        58               Turkic
Estonian              est        106              Uralic
Finnish               fin        202              Uralic
Hungarian             hun        42               Uralic