The exact number of human languages is hard to pin down, because the count depends on where one draws the line between a language and a dialect. For example, closely related varieties within one language family may differ only slightly from one another. Under a general definition of language, there are over 7,000 living languages[1], but most of them are not digitized. Here we list 353 languages with their codes, families, writing systems, and regions. The list covers most of the world's major languages and a large number of minority languages. We also collect links to multilingual corpora, which may help anyone who studies these languages or develops multilingual natural language processing (NLP) systems.
Language › Chinese Name | ISO 639-1 | ISO 639-2 | ISO 639-3 | Language Family › Branch | Writing System | Macro-area |
---|---|---|---|---|---|---|
Albanian › 阿尔巴尼亚语 |
sq | alb(B) sqi(T) |
sqi | Indo-European › Albanian |
Latin Albanian Braille |
Asia Europe |
Arabic › 阿拉伯语 |
ar | ara | ara | Afro-Asiatic › Semitic |
Arabic Arabic Braille Arabizi |
Africa Asia |
Amharic › 阿姆哈拉语 |
am | amh | amh | Afro-Asiatic › Semitic |
Geʽez Ge'ez Braille |
Africa |
Azerbaijani › 阿塞拜疆语 |
az | aze | aze | Turkic › Common Turkic |
Latin Perso-Arabic Cyrillic Georgian |
Asia |
Ewe › 埃维语 |
ee | ewe | ewe | Niger–Congo › Atlantic–Congo |
Latin Ewe Braille |
Africa |
Irish › 爱尔兰语 |
ga | gle | gle | Indo-European › Celtic |
Latin Irish Braille |
Europe |
Estonian › 爱沙尼亚语 |
et | est | est | Uralic › Finnic |
Latin Estonian Braille |
Europe |
Oromoo › 奥罗莫语 |
om | orm | orm | Afro-Asiatic › Cushitic |
Latin | Africa |
Ossetic › 奥赛梯语 |
os | oss | oss | Indo-European › Indo-Iranian |
Cyrillic Georgian Latin |
Europe Asia |
Tok Pisin › 巴布亚皮钦语 |
N/A | tpi | tpi | English Creole › Pacific |
Latin Pidgin Braille |
Oceania |
Bashkir › 巴什基尔语 |
ba | bak | bak | Turkic › Common Turkic |
Cyrillic | Europe |
Basque › 巴斯克语 |
eu | baq | eus | Language isolate |
Basque Basque Braille |
Europe |
Belarusian › 白俄罗斯语 |
be | bel | bel | Indo-European › Balto-Slavic |
Cyrillic Belarusian Braille Belarusian Latin |
Europe |
Hmong › 白苗文 |
N/A | hmn | mww | Hmong–Mien › Hmongic |
Latin Pahawh Hmong Pollard |
Asia |
Bulgarian › 保加利亚语 |
bg | bul | bul | Indo-European › Balto-Slavic |
Cyrillic Bulgarian Braille Latin |
Europe |
Bislama › 比斯拉马语 |
bi | bis | bis | English Creole › Pacific |
Latin Avoiuli |
Oceania |
Bemba › 别姆巴语 |
N/A | bem | bem | Niger–Congo › Atlantic–Congo |
Latin Bemba Braille |
Africa Asia |
Icelandic › 冰岛语 |
is | ice(B) isl(T) |
isl | Indo-European › Germanic |
Latin Icelandic Braille |
Europe |
Polish › 波兰语 |
pl | pol | pol | Indo-European › Balto-Slavic |
Latin Polish Braille |
Africa Europe |
Bosnian › 波斯尼亚语 |
bs | bos | bos | Indo-European › Balto-Slavic |
Latin Cyrillic Yugoslav Braille Arabic Bosnian Cyrillic |
Europe Asia |
Persian › 波斯语 |
fa | per(B) fas(T) |
fas | Indo-European › Indo-Iranian |
Persian Tajik Hebrew Persian Braille |
Asia |
Tibetan › 藏语 |
bo | tib(B) bod(T) |
bod | Sino-Tibetan › Tibeto-Burman |
Tibetan | Asia |
Tswana › 茨瓦纳语 |
tn | tsn | tsn | Niger–Congo › Atlantic–Congo |
Latin Tswana Braille |
Africa |
Xitsonga › 聪加语 |
ts | tso | tso | Niger–Congo › Atlantic–Congo |
Latin Tsonga Braille |
Africa |
Tatar › 鞑靼语 |
tt | tat | tat | Turkic › Common Turkic |
Tatar | Europe |
Danish › 丹麦语 |
da | dan | dan | Indo-European › Germanic |
Latin Dano-Norwegian Danish orthography Danish Braille |
Europe |
German › 德语 |
de | ger(B) deu(T) |
deu | Indo-European › Germanic |
Latin German Braille |
North America Africa Europe Asia |
Russian › 俄语 |
ru | rus | rus | Indo-European › Balto-Slavic |
Cyrillic Russian Braille |
Europe Asia |
French › 法语 |
fr | fre(B) fra(T) |
fra | Indo-European › Italic |
Signed French | North America Oceania Africa Europe Asia |
Filipino › 菲律宾语 |
N/A | fil | fil | Austronesian › Malayo-Polynesian |
Latin Philippine Braille |
Asia |
Fijian › 斐济语 |
fj | fij | fij | Austronesian › Malayo-Polynesian |
Latin-based | Oceania |
Finnish › 芬兰语 |
fi | fin | fin | Uralic › Finnic |
Latin Finnish Braille |
Europe |
Frisian › 弗里西语 |
fy | fry | fry | Indo-European › Germanic |
Latin | Europe |
Kikongo › 刚果语 |
kg | kon | kon | Niger–Congo › Atlantic–Congo |
Latin Mandombe |
Africa |
Khmer › 高棉语 |
km | khm | khm | Austroasiatic › Proto-Mon-Khmer |
Khmer Khmer Braille |
Asia |
Georgian › 格鲁吉亚语 |
ka | geo(B) kat(T) |
kat | Kartvelian › Karto-Zan |
Georgian Georgian Braille |
Europe Asia |
Gujarati › 古吉拉特语 |
gu | guj | guj | Indo-European › Indo-Iranian |
Gujarati Gujarati Braille Devanagari |
Africa Asia |
Kazakh › 哈萨克语 |
kk | kaz | kaz | Turkic › Common Turkic |
Arabic | Asia |
Kazakh (Cyrillic) › 哈萨克语(西里尔) |
kk | kaz | kaz | Turkic › Common Turkic |
Cyrillic Kazakh Braille |
Asia |
Haitian Creole › 海地克里奥尔语 |
ht | hat | hat | French Creole |
Latin | North America |
Korean › 韩语 |
ko | kor | kor | Koreanic | Hangul Hanja Korean Braille |
Asia |
Hausa › 豪萨语 |
ha | hau | hau | Afro-Asiatic › Chadic |
Latin Arabic Hausa Braille |
Africa |
Dutch › 荷兰语 |
nl | dut(B) nld(T) |
nld | Indo-European › Germanic |
Latin Dutch Braille |
Africa South America |
Kyrgyz › 吉尔吉斯语 |
ky | kir | kir | Turkic › Common Turkic |
Cyrillic Perso-Arabic formerly Latin Kyrgyz Braille |
Asia |
Galician › 加利西亚语 |
gl | glg | glg | Indo-European › Italic |
Latin Galician Braille |
Europe |
Catalan › 加泰罗尼亚语 |
ca | cat | cat | Indo-European › Italic |
Latin Catalan Braille |
Europe |
Czech › 捷克语 |
cs | cze(B) ces(T) |
ces | Indo-European › Balto-Slavic |
Latin Czech Braille |
Europe |
Kannada › 卡纳达语 |
kn | kan | kan | Dravidian | Kannada Kannada Braille Tigalari |
Asia |
Qeqchi › 凯克其语 |
N/A | N/A | kek | Mayan › Quichean–Mamean |
Latin | North America Europe |
Corsican › 科西嘉语 |
co | cos | cos | Indo-European › Italic |
Latin | Europe |
Queretaro Otomi › 克雷塔罗奥托米语 |
N/A | N/A | otq | Oto-Manguean › Oto-Pamean |
Latin | North America |
Croatian › 克罗地亚语 |
hr | hrv | hrv | Indo-European › Balto-Slavic |
Latin Yugoslav Braille |
Europe |
Kurdish › 库尔德语 |
ku | kur | kur | Indo-European › Indo-Iranian |
Hawar Sorani Cyrillic Armenian |
Asia |
Latin › 拉丁语 |
la | lat | lat | Indo-European › Italic |
Latin | Europe |
Latvian › 拉脱维亚语 |
lv | lav | lav | Indo-European › Balto-Slavic |
Latin Latvian Braille |
Europe |
Lao › 老挝语 |
lo | lao | lao | Kra–Dai › Tai |
Lao Thai Thai and Lao Braille |
Asia |
Lithuanian › 立陶宛语 |
lt | lit | lit | Indo-European › Balto-Slavic |
Latin Lithuanian Braille |
Europe |
Lingala › 林加拉语 |
ln | lin | lin | Niger–Congo › Atlantic–Congo |
Latin Mandombe |
Africa |
Kirundi › 隆迪语 |
rn | run | run | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Luganda › 卢干达语 |
lg | lug | lug | Niger–Congo › Atlantic–Congo |
Latin Ganda Braille |
Africa |
Luxembourgish › 卢森堡语 |
lb | ltz | ltz | Indo-European › Germanic |
Latin Luxembourgish Braille |
Europe |
Kinyarwanda › 卢旺达语 |
rw | kin | kin | Niger–Congo › Atlantic–Congo |
Latin Arabic |
Africa |
Romanian › 罗马尼亚语 |
ro | rum(B) ron(T) |
ron | Indo-European › Italic |
Latin Cyrillic Romanian Braille |
Europe |
Malagasy › 马尔加什语 |
mg | mlg | mlg | Austronesian › Malayo-Polynesian |
Latin Malagasy Braille |
Africa |
Maltese › 马耳他语 |
mt | mlt | mlt | Afro-Asiatic › Semitic |
Latin Maltese Braille |
Europe |
Marathi › 马拉地语 |
mr | mar | mar | Indo-European › Indo-Iranian |
Devanagari Devanagari Braille Modi |
Asia |
Malayalam › 马拉雅拉姆语 |
ml | mal | mal | Dravidian | Malayalam Malayalam Braille |
Asia |
Malay › 马来语 |
ms | may(B) msa(T) |
msa | Austronesian › Malayo-Polynesian |
Latin Arabic Thai Malay Braille |
Asia |
Mari › 马里语 |
N/A | chm | mhr | Uralic › Finno-Permic |
Mari Cyrillic |
Europe |
Macedonian › 马其顿语 |
mk | mac(B) mkd(T) |
mkd | Indo-European › Balto-Slavic |
Cyrillic Macedonian Braille |
Europe |
Maori › 毛利语 |
mi | mao(B) mri(T) |
mri | Austronesian › Malayo-Polynesian |
Latin Māori Braille |
Oceania |
Mongolian (Cyrillic) › 蒙古语(西里尔) |
mn | mon | mon | Mongolic | Cyrillic Mongolian Braille |
Asia |
Bengali › 孟加拉语 |
bn | ben | ben | Indo-European › Indo-Iranian |
Bengali-Assamese Bengali Braille |
Asia |
Burmese › 缅甸语 |
my | bur(B) mya(T) |
mya | Sino-Tibetan › Lolo-Burmese |
Burmese Burmese Braille |
Asia |
Afrikaans › 南非荷兰语 |
af | afr | afr | Indo-European › Germanic |
Latin using Afrikaans Arabic Afrikaans Braille |
Africa |
Xhosa › 南非科萨语 |
xh | xho | xho | Niger–Congo › Atlantic–Congo |
Latin Xhosa Braille |
Africa |
Zulu › 南非祖鲁语 |
zu | zul | zul | Niger–Congo › Atlantic–Congo |
Latin Zulu Braille |
Africa |
Nepali › 尼泊尔语 |
ne | nep | nep | Indo-European › Indo-Iranian |
Devanagari Devanagari Braille |
Asia |
Norwegian › 挪威语 |
no | nor | nor | Indo-European › Germanic |
Latin Norwegian Braille |
Europe |
Papiamento › 帕皮阿门托语 |
N/A | pap | pap | Portuguese Creole |
Latin | Europe |
Punjabi › 旁遮普语 |
pa | pan | pan | Indo-European › Indo-Iranian |
Gurmukhī Perso-Arabic Punjabi Braille Laṇḍā Mahajani |
Asia |
Portuguese › 葡萄牙语 |
pt | por | por | Indo-European › Italic |
Latin Portuguese Braille |
Africa South America Europe Asia |
Pashto › 普什图语 |
ps | pus | pus | Indo-European › Indo-Iranian |
Perso-Arabic | Asia |
Chewa › 齐切瓦语 |
ny | nya | nya | Niger–Congo › Atlantic–Congo |
Latin Mwangwego Chewa Braille |
Africa |
Twi › 契维语 |
tw | twi | twi | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Japanese › 日语 |
ja | jpn | jpn | Japonic | Mixed scripts of Kanji and Kana Japanese Braille |
Oceania Asia |
Swedish › 瑞典语 |
sv | swe | swe | Indo-European › Germanic |
Latin Swedish Braille |
Europe |
Samoan › 萨摩亚语 |
sm | smo | smo | Austronesian › Malayo-Polynesian |
Latin Samoan Braille |
Oceania |
Serbian › 塞尔维亚语 |
sr | srp | srp | Indo-European › Balto-Slavic |
Serbian Cyrillic Serbian Latin Yugoslav Braille |
Europe |
Seychelles Creole › 塞舌尔克里奥尔语 |
N/A | N/A | crs | French Creole › Bourbonnais Creoles |
Latin | Africa |
Sesotho › 塞索托语 |
st | sot | sot | Niger–Congo › Atlantic–Congo |
Latin Sotho Braille |
Africa |
Sango › 桑戈语 |
sg | sag | sag | Creole | Latin | Africa |
Sinhalese › 僧伽罗语 |
si | sin | sin | Indo-European › Indo-Iranian |
Sinhala Sinhala Braille |
Asia |
Hill Mari › 山地马里语 |
N/A | N/A | mrj | Uralic › Finno-Ugric |
Cyrillic | Europe |
Slovak › 斯洛伐克语 |
sk | slo(B) slk(T) |
slk | Indo-European › Balto-Slavic |
Latin Slovak Braille |
Europe |
Slovenian › 斯洛文尼亚语 |
sl | slv | slv | Indo-European › Balto-Slavic |
Latin Slovene Braille |
Europe |
Swahili › 斯瓦希里语 |
sw | swa | swa | Niger–Congo › Atlantic–Congo |
Latin Arabic Swahili Braille |
Africa |
Scottish Gaelic › 苏格兰盖尔语 |
gd | gla | gla | Indo-European › Celtic |
Scottish Gaelic | Europe |
Somali › 索马里语 |
so | som | som | Afro-Asiatic › Cushitic |
Somali Latin Wadaad writing Osmanya Borama Kaddare |
Africa |
Tajik › 塔吉克语 |
tg | tgk | tgk | Indo-European › Indo-Iranian |
Cyrillic Latin Persian Tajik Braille |
Asia |
Tahitian › 塔希提语 |
ty | tah | tah | Austronesian › Malayo-Polynesian |
Latin | Europe |
Telugu › 泰卢固语 |
te | tel | tel | Dravidian › South-Central |
Telugu Telugu Braille |
Africa Asia |
Tamil › 泰米尔语 |
ta | tam | tam | Dravidian › Southern |
Tamil Tamil-Brahmi Grantha Vatteluttu Pallava Kolezhuthu Arwi Tamil Braille Latin |
North America Africa Asia |
Thai › 泰语 |
th | tha | tha | Kra–Dai | Thai Thai Braille |
Asia |
Tongan › 汤加语 |
to | ton | ton | Austronesian › Malayo-Polynesian |
Latin | Oceania Africa |
Tigre › 提格雷语 |
N/A | tig | tig | Afro-Asiatic › Semitic |
Tigre Arabic |
Africa |
Turkish › 土耳其语 |
tr | tur | tur | Turkic › Common Turkic |
Latin Turkish Braille |
Europe Asia |
Turkmen › 土库曼语 |
tk | tuk | tuk | Turkic › Common Turkic |
Latin Cyrillic Arabic Turkmen Braille |
Europe Asia |
Waray › 瓦瑞语 |
N/A | war | war | Austronesian › Malayo-Polynesian |
Latin | Asia |
Welsh › 威尔士语 |
cy | wel(B) cym(T) |
cym | Indo-European › Celtic |
Latin Welsh Braille |
Europe |
Uyghur › 维吾尔语 |
ug | uig | uig | Turkic › Common Turkic |
Uyghur Uyghur Perso-Arabic Uyghur Cyrillic Uyghur Latin Uyghur New |
Asia |
Udmurt › 乌德穆尔特语 |
N/A | udm | udm | Uralic › Finno-Ugric |
Latin Cyrillic |
Europe |
Urdu › 乌尔都语 |
ur | urd | urd | Indo-European › Indo-Iranian |
Perso-Arabic Roman Urdu Urdu Braille |
Africa Asia |
Ukrainian › 乌克兰语 |
uk | ukr | ukr | Indo-European › Balto-Slavic |
Cyrillic Ukrainian Braille Ukrainian Latin |
Europe |
Uzbek › 乌兹别克语 |
uz | uzb | uzb | Turkic › Common Turkic |
Latin Cyrillic Perso-Arabic Uzbek Braille |
Asia |
Spanish › 西班牙语 |
es | spa | spa | Indo-European › Italic |
Latin Spanish Braille |
North America Africa South America Europe |
Hebrew › 希伯来语 |
he | heb | heb | Afro-Asiatic › Semitic |
Hebrew Hebrew Braille Paleo-Hebrew Imperial Aramaic |
Asia |
Greek › 希腊语 |
el | gre(B) ell(T) |
ell | Indo-European › Hellenic |
Greek | Africa Europe |
Hawaiian › 夏威夷语 |
N/A | haw | haw | Austronesian › Malayo-Polynesian |
Latin Hawaiian Braille |
North America |
Sindhi › 信德语 |
sd | snd | snd | Indo-European › Indo-Iranian |
Arabic Devanagari Roman Sindhi |
Asia |
Hungarian › 匈牙利语 |
hu | hun | hun | Uralic › Finno-Ugric |
Latin Hungarian Braille Old Hungarian |
Europe |
Shona › 修纳语 |
sn | sna | sna | Niger–Congo › Atlantic–Congo |
Latin Arabic Shona Braille |
Africa |
Cebuano › 宿务语 |
N/A | ceb | ceb | Austronesian › Malayo-Polynesian |
Latin Philippine Braille Baybayin |
Asia |
Armenian › 亚美尼亚语 |
hy | arm(B) hye(T) |
hye hyw |
Indo-European | Armenian Armenian Braille |
Europe Asia |
Igbo › 伊博语 |
ig | ibo | ibo | Niger–Congo › Atlantic–Congo |
Latin Nwagu Aneke Igbo Braille |
Africa |
Italian › 意大利语 |
it | ita | ita | Indo-European › Italic |
Latin Italian Braille |
Africa Europe |
Yiddish › 意第绪语 |
yi | yid | yid | Indo-European › Germanic |
Hebrew Latin |
Europe |
Hindi › 印地语 |
hi | hin | hin | Indo-European › Indo-Iranian |
Devanagari Kaithi Roman Devanagari Braille |
Africa Asia |
Sundanese › 印尼巽他语 |
su | sun | sun | Austronesian › Malayo-Polynesian |
Latin Sundanese Old Sundanese Sundanese Cacarakan Sundanese Pégon Buda Kawi Pallava Pranagari Vatteluttu |
Asia |
Indonesian › 印尼语 |
id | ind | ind | Austronesian › Malayo-Polynesian |
Latin Indonesian Braille |
Asia |
Javanese › 印尼爪哇语 |
jv | jav | jav | Austronesian › Malayo-Polynesian |
Latin Javanese Pegon |
Asia |
English › 英语 |
en | eng | eng | Indo-European › Germanic |
Latin Anglo Saxon runes English Braille Unified English Braille |
North America Oceania Africa South America Europe Asia |
Yucatec Maya › 尤卡坦玛雅语 |
N/A | N/A | yua | Mayan | Latin | North America |
Yoruba › 约鲁巴语 |
yo | yor | yor | Niger–Congo › Atlantic–Congo |
Latin Yoruba Braille Arabic |
Africa |
Vietnamese › 越南语 |
vi | vie | vie | Austroasiatic | Latin Vietnamese Braille Chữ Hán and Chữ Nôm |
Europe Asia |
Cantonese › 粤语 |
N/A | N/A | yue | Sino-Tibetan › Sinitic |
Written Cantonese Cantonese Braille Written Chinese |
Asia |
Chinese (Traditional) › 中文(繁体) |
zh | zho(T) chi(B) |
zho | Sino-Tibetan › Sinitic |
Traditional Chinese | Asia |
Chinese (Simplified) › 中文(简体) |
zh | zho(T) chi(B) |
zho | Sino-Tibetan › Sinitic |
Simplified Chinese | North America Africa Asia |
Venda › 文达语 |
ve | ven | ven | Niger–Congo › Atlantic–Congo |
Latin Venda Braille |
Africa |
Achuar › 阿丘雅语 |
N/A | N/A | acu | Jivaroan | Latin | South America |
Aguaruna › 阿瓜鲁纳语 |
N/A | N/A | agr | Chicham | Latin | South America |
Akawaio › 阿卡瓦伊语 |
N/A | N/A | ake | Cariban › Venezuelan Carib |
Latin | South America |
Amuzgo › 阿穆斯戈语 |
N/A | N/A | amu | Oto-Manguean › Eastern Otomanguean |
Latin | North America |
Ndyuka › 恩都卡语 |
N/A | N/A | djk | English Creole |
Afaka Latin |
South America |
Barasana › 巴拉萨纳语 |
N/A | N/A | bsn | Tucanoan › Eastern Tucanoan |
Latin | South America |
Cabecar › 卡韦卡尔语 |
N/A | N/A | cjp | Chibchan › Core-Chibchan |
Latin | North America |
Cakchiquel › 卡克奇克尔语 |
N/A | N/A | cak | Mayan › Quichean–Mamean |
Latin | North America |
Campa › 坎帕语 |
N/A | N/A | cni | Maipurean › Southern Maipurean |
Latin | South America |
Camsa › 科奇语 |
N/A | N/A | kbh | Language isolate |
Latin | South America |
Chamorro › 查莫罗语 |
ch | cha | cha | Austronesian › Malayo-Polynesian |
Latin | North America |
Cherokee › 切诺基语 |
N/A | chr | chr | Iroquoian › Southern Iroquoian |
Cherokee Latin |
North America |
Chinantec › 奇南特克语 |
N/A | N/A | chq | Oto-Manguean › Western Oto-Mangue |
Latin | North America |
Coptic › 科普特语 |
N/A | cop | cop | Afro-Asiatic › Egyptian |
Coptic | Africa |
Dinka › 丁卡语 |
N/A | din | dik | Nilo-Saharan › Eastern Sudanic |
Latin | Africa |
Galela › 加莱拉语 |
N/A | N/A | gbi | West Papuan › North Halmahera |
Latin | Asia |
Jakalteko › 雅加达语 |
N/A | N/A | jac | Mayan › Qʼanjobalan–Chujean |
Latin | North America |
Kiche › 基切语 |
N/A | N/A | quc | Mayan › Eastern Qʼanjobalan–Chujean |
Latin | North America |
Kabyle › 卡拜尔语 |
N/A | kab | kab | Afro-Asiatic › Berber |
Latin Tifinagh |
Africa |
Lukpa › 卢克帕语 |
N/A | N/A | dop | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Mam › 马姆语 |
N/A | N/A | mam | Mayan › Eastern Mayan |
Latin | North America |
Manx › 马恩岛语 |
gv | glv | glv | Indo-European › Celtic |
Latin | Europe |
Nahuatl › 纳瓦特尔语 |
N/A | nah | nhg | Uto-Aztecan › Southern Uto-Aztecan |
Latin | North America |
Ojibwa › 奥吉布瓦语 |
oj | oji | ojb | Algic › Algonquian |
Latin Ojibwe Great Lakes Algonquian |
North America |
Paite › 派特语 |
N/A | N/A | pck | Sino-Tibetan › Kuki-Chin-Naga |
Latin | Asia |
Potawatomi › 波塔瓦托米语 |
N/A | N/A | pot | Algic › Algonquian |
Latin Great Lakes Algonquian |
North America |
Quichua › 盖丘亚语 |
qu | N/A | quw | Quechuan | Latin | South America |
Romani › 罗姆语 |
N/A | rom | rmn | Indo-European › Indo-Iranian |
Latin | Europe |
Shuar › 舒阿尔语 |
N/A | N/A | jiv | Chicham | Latin | South America |
Syriac › 叙利亚语 |
N/A | N/A | syc | Afro-Asiatic › Semitic |
Syriac | Asia |
Berber › 柏柏尔语 |
N/A | ber | ber | Afro-Asiatic | Latin | Africa |
Tachelhit › 希尔哈语 |
N/A | N/A | shi | Afro-Asiatic › North Afroasiatic |
Arabic Latin Tifinagh |
Africa |
Tamajaq › 图阿雷格语 |
N/A | N/A | tmh | Afro-Asiatic › Berber |
Latin | Africa |
Uma › 乌玛语 |
N/A | N/A | ppk | Austronesian › Malayo-Polynesian |
Latin | Asia |
Uspanteco › 乌斯潘坦语 |
N/A | N/A | usp | Mayan › Quichean–Mamean |
Latin | North America |
Wolaytta › 瓦拉莫语 |
N/A | wal | wal | Afro-Asiatic › Omotic |
Latin | Africa |
Wolof › 沃洛夫语 |
wo | wol | wol | Niger–Congo › Atlantic–Congo |
Latin Arabic Garay |
Africa |
Zarma › 哲尔马语 |
N/A | N/A | dje | Nilo-Saharan › Songhay |
Latin | Africa |
Oriya › 奥利亚语 |
or | ori | ori | Indo-European › Indo-Iranian |
Odia Odia Braille |
Asia |
Aceh › 亚齐语 |
N/A | ace | ace | Austronesian › Malayo-Polynesian |
Latin Jawi |
Asia |
Faroese › 法罗语 |
fo | fao | fao | Indo-European › Germanic |
Latin Faroese Braille |
Europe |
Tetun › 德顿语 |
N/A | N/A | tet | Austronesian › Malayo-Polynesian |
Latin | Asia |
Brezhoneg › 布列塔尼语 |
br | bre | bre | Indo-European › Celtic |
Latin | Europe |
Chuvash › 楚瓦什语 |
cv | chv | chv | Turkic › Oghur |
Cyrillic | Europe |
Divehi › 迪维希语 |
dv | div | div | Indo-European › Indo-Iranian |
Thaana | Asia |
Montenegrin › 黑山语 |
N/A | cnr | cnr | Indo-European › Balto-Slavic |
Cyrillic Latin Yugoslav Braille |
Europe |
Dzongkha › 宗喀语 |
dz | dzo | dzo | Sino-Tibetan › Tibeto-Kanauri |
Tibetan Dzongkha Braille |
Asia |
Dyula › 迪尤拉语 |
N/A | dyu | dyu | Mande › Western Mande |
N'Ko Latin Arabic |
Africa |
Northern Kurdish › 北库尔德语 |
N/A | N/A | kmr | Indo-European › Indo-Iranian |
Hawar Sorani Arabic Cyrillic |
Asia |
Manipuri › 曼尼普尔语 |
N/A | mni | mni | Sino-Tibetan › Tibeto-Burman |
Ancient Meitei Meetei Mayek Bengali Latin |
Asia |
Wali › 瓦利语 |
N/A | N/A | wlx | Niger–Congo › Atlantic–Congo |
Latin | Africa |
South Azerbaijani › 南阿塞拜疆语 |
N/A | N/A | azb | Turkic › Common Turkic |
Latin Perso-Arabic Cyrillic Georgian |
Asia |
Ika › 伊卡语 |
N/A | N/A | ikk | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Cañar Highland Quichua › 卡纳尔高地-基丘亚语 |
N/A | N/A | qxr | Quechuan | Latin | South America |
Poqomchi’ › 波孔奇语 |
N/A | N/A | poh | Mayan › Quichean–Mamean |
Latin | North America |
Kuanua › 库阿努阿语 |
N/A | N/A | ksd | Austronesian › Malayo-Polynesian |
Latin Tolai Braille |
Oceania |
Central Ifugao › 中部伊富高语 |
N/A | N/A | ifa | Austronesian › Malayo-Polynesian |
Latin | Asia |
Motu › 摩图语 |
N/A | N/A | meu | Austronesian › Malayo-Polynesian |
Latin Motu Braille |
Oceania |
Cusco Quechua › 库斯科克丘亚语 |
N/A | N/A | quz | Quechuan | Latin | South America |
Marshallese › 马绍尔语 |
mh | mah | mah | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Zotung Chin › 佐通钦语 |
N/A | N/A | czt | Sino-Tibetan › Tibeto-Burman |
Latin | Asia |
Wa › 佤语 |
N/A | N/A | prk | Austroasiatic › Khasi–Palaungic |
Latin | Asia |
Ayangan Ifugao › 阿雅安伊富高语 |
N/A | N/A | ifb | Austronesian › Malayo-Polynesian |
Latin | Asia |
Bambara › 班巴拉语 |
bm | bam | bam | Niger–Congo › Mande |
Latin N'Ko |
Africa |
Northern Mam › 北部马姆语 |
N/A | N/A | mam | Mayan › Eastern Mayan |
Latin | North America |
South Bolivian Quechua › 南玻利维亚克丘亚语 |
N/A | N/A | quh | Quechuan | Latin | South America |
Hawaiian Creole English › 夏威夷克里奥尔英语 |
N/A | N/A | hwc | English Creole |
Latin | North America |
Hakha Chin › 哈卡钦语 |
N/A | N/A | cnh | Sino-Tibetan › Tibeto-Burman |
Latin Burmese |
Asia |
Lomwe › 隆韦语 |
N/A | N/A | ngl | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Kiribati › 基里巴斯语 |
N/A | gil | gil | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Hiri Motu › 希里莫图语 |
ho | hmo | hmo | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Tampulma › 坦普尔马语 |
N/A | N/A | tpm | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Enxet › 恩舍特语 |
N/A | N/A | enx | Mascoian | Latin | South America |
Maranao › 马拉瑙语 |
N/A | N/A | mrw | Austronesian › Malayo-Polynesian |
Latin Arabic |
Asia |
Tedim Chin › 特丁钦语 |
N/A | N/A | ctd | Sino-Tibetan › Tibeto-Burman |
Latin Pau Cin Hau |
Asia |
Aymara › 艾马拉语 |
ay | aym | aym | Aymaran | Latin | South America |
Acateco › 阿卡特克语 |
N/A | N/A | knj | Mayan › Qʼanjobalan–Chujean |
Latin | North America |
Ditammari › 迪塔马利语 |
N/A | N/A | tbz | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Jingpho › 景颇语 |
N/A | N/A | kac | Sino-Tibetan › Sal |
Latin Burmese |
Asia |
Maale › 马勒语 |
N/A | N/A | mdy | Afro-Asiatic › Omotic |
Ethiopic | Africa |
Western Lawa › 西部拉威语 |
N/A | N/A | lcp | Austroasiatic › Khasi–Palaungic |
Thai | Asia |
Sidamo › 锡达莫语 |
N/A | N/A | sid | Afro-Asiatic › Cushitic |
Ethiopic Latin |
Africa |
Bariba › 巴里巴语 |
N/A | N/A | bba | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Izi › 伊兹语 |
N/A | N/A | izz | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Roviana › 罗维那语 |
N/A | N/A | rug | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Dadibi › 达迪比语 |
N/A | N/A | mps | Papuan Gulf |
Latin | Oceania |
Lun Bawang › 弄巴湾语 |
N/A | N/A | lnd | Austronesian › Malayo-Polynesian |
Latin | Asia |
Chechen › 车臣语 |
ce | che | che | Northeast Caucasian › Nakh |
Cyrillic | Europe |
Kapingamarangi › 卡平阿马朗伊语 |
N/A | N/A | kpg | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Western Bukidnon Manobo › 西布基农马诺布语 |
N/A | N/A | mbb | Austronesian › Malayo-Polynesian |
Latin | Asia |
Crimean Tatar › 克里米亚鞑靼语 |
N/A | crh | crh | Turkic › Common Turkic |
Cyrillic Latin |
Europe |
Guajajára › 瓜哈哈拉语 |
N/A | N/A | gub | Tupian › Tupí–Guaraní |
Latin | South America |
Timugon Murut › 蒂穆贡-穆鲁特语 |
N/A | N/A | tih | Austronesian › Malayo-Polynesian |
Latin | Asia |
Lacid › 勒期语 |
N/A | N/A | lsi | Sino-Tibetan › Tibeto-Burman |
Latin | Asia |
Huli › 胡里语 |
N/A | N/A | hui | Engan › South Engan |
Latin | Oceania |
Antipolo Ifugao › 安蒂波洛伊富高语 |
N/A | N/A | ify | Austronesian › Malayo-Polynesian |
Latin | Asia |
Central Dusun › 中部杜顺语 |
N/A | N/A | dtp | Austronesian › Malayo-Polynesian |
Latin | Asia |
Madurese › 马都拉语 |
N/A | N/A | mad | Austronesian › Malayo-Polynesian |
Latin Carakan Pegon |
Asia |
Yom › 约姆语 |
N/A | N/A | pil | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Tuvan › 图瓦语 |
N/A | N/A | tyv | Turkic › Common Turkic |
Cyrillic Old Turkic |
Europe |
Bokobaru › 博科巴鲁语 |
N/A | N/A | bus | Niger–Congo › Mande |
Latin | Africa |
Busa › 布萨语 |
N/A | N/A | bqp | Niger–Congo › Mande |
Latin | Africa |
Achi › 阿奇语 |
N/A | N/A | acr | Mayan › Quichean–Mamean |
Latin | North America |
Mossi › 莫西语 |
N/A | mos | mos | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Nigerian Fulfulde › 尼日利亚富拉语 |
ff | ful | fuv | Niger–Congo › Atlantic–Congo |
Latin Adlam Arabic |
Africa |
Goffa › 果发语 |
N/A | N/A | gof | Afro-Asiatic › Omotic |
Ethiopic Latin |
Africa |
Kasem › 格森语 |
N/A | N/A | xsm | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Eastern Cagayan Agta › 东部卡加延-阿格塔语 |
N/A | N/A | duo | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Shipibo › 西皮沃语 |
N/A | N/A | shp | Panoan › Mainline Panoan |
Latin | South America |
Bola › 波拉语 |
N/A | N/A | bnp | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Ambai › 安拜语 |
N/A | N/A | amk | Austronesian › Malayo-Polynesian |
Latin | Asia |
Yabem › 雅比姆语 |
N/A | N/A | jae | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Numanggang › 努曼干语 |
N/A | N/A | nop | Trans–New Guinea › Finisterre–Huon |
Latin | Oceania |
Yongkom › 永贡语 |
N/A | N/A | yon | Trans–New Guinea › Central & South New Guinea |
Latin | Oceania |
Kalmyk-Oirat › 卡尔梅克卫拉特语 |
N/A | xal | xal | Mongolic › Central Mongolic |
Cyrillic Latin |
Europe |
Tuma-Irumu › 图马伊鲁穆语 |
N/A | N/A | iou | Trans–New Guinea › Finisterre–Huon |
Latin | Oceania |
Siroi › 西罗伊语 |
N/A | N/A | ssd | Trans–New Guinea › Madang |
Latin | Oceania |
Lingao › 临高语 |
N/A | N/A | onb | Kra–Dai › Be–Tai |
N/A | Asia |
Waskia › 瓦吉语 |
N/A | N/A | wsk | Trans–New Guinea › Madang |
Latin | Oceania |
Halbi › 亥比语 |
N/A | N/A | hlb | Indo-European › Indo-Iranian |
Devanagari | Asia |
Nateni › 纳特尼语 |
N/A | N/A | ntm | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Yongbei Zhuang › 邕北壮语 |
N/A | N/A | zyb | Kra–Dai | N/A | Asia |
Bariai › 巴里亚语 |
N/A | N/A | bch | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Bantoanon › 班通安隆语 |
N/A | N/A | bno | Austronesian › Malayo-Polynesian |
Latin | Asia |
Gbaya › 格巴亚语 |
N/A | N/A | krs | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Keliko › 克利科语 |
N/A | N/A | kbo | Nilo-Saharan › Central Sudanic |
Latin | Africa |
Tennet › 腾内特语 |
N/A | N/A | tex | Nilo-Saharan › Eastern Sudanic |
Latin | Africa |
Oroko › 奥罗科语 |
N/A | N/A | bdu | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Bandial › 班迪亚勒语 |
N/A | N/A | bqj | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Tungag › 通加格语 |
N/A | N/A | lcm | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Baka › 巴卡语 |
N/A | N/A | bdh | Ubangian › Sere–Mba |
Latin | Africa |
Suau › 苏奥语 |
N/A | N/A | swp | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Muthuvan › 穆图凡语 |
N/A | N/A | muv | Dravidian | Tamil | Asia |
Pele-Ata › 佩勒-阿塔语 |
N/A | N/A | ata | West New Britain |
Latin | Oceania |
Samberigi › 桑贝里吉语 |
N/A | N/A | ssx | Engan | Latin | Oceania |
Western Bolivian Guarani › 西部玻利维亚瓜拉尼语 |
N/A | N/A | gnw | Tupian › Tupi–Guarani |
Latin | South America |
Sabaot › 萨鲍特语 |
N/A | N/A | spy | Nilo-Saharan › Eastern Sudanic |
Latin | Africa |
Bambam › 邦邦语 |
N/A | N/A | ptu | Austronesian › Malayo-Polynesian |
Latin | Asia |
Tsimané › 齐马内语 |
N/A | N/A | cas | Moseten–Chonan › Chimane |
Latin | South America |
Waris › 瓦里斯语 |
N/A | N/A | wrs | Border › Bewani Range |
Latin | Oceania |
Yipma › 伊普马语 |
N/A | N/A | byr | Trans–New Guinea › Angan |
Latin | Oceania |
Adhola › 阿多拉语 |
N/A | N/A | adh | Nilo-Saharan › Eastern Sudanic |
Latin | Africa |
Agni Sanvi › 阿格尼桑维语 |
N/A | N/A | any | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Ashéninka › 阿舍宁卡语 |
N/A | N/A | cpb | Arawakan | Latin | South America |
Teso › 特索语 |
N/A | N/A | teo | Nilo-Saharan › Eastern Sudanic |
Latin | Africa |
Bari › 巴里语 |
N/A | N/A | bfa | Nilo-Saharan › Eastern Sudanic |
Arabic Latin |
Africa |
Chakma › 查克玛语 |
N/A | N/A | ccp | Indo-European › Indo-Iranian |
Bengali Chakma Latin |
Asia |
Bualkhaw Chin › 布阿尔考钦语 |
N/A | N/A | cbl | Sino-Tibetan › Tibeto-Burman |
Latin | Asia |
Falam Chin › 法兰钦语 |
N/A | N/A | cfm | Sino-Tibetan › Tibeto-Burman |
Bengali Latin |
Asia |
Chiru › 茨鲁语 |
N/A | N/A | cdf | Sino-Tibetan › Tibeto-Burman |
Bengali Latin |
Asia |
Frafra › 法拉法拉语 |
N/A | N/A | gur | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Northern Grebo › 北部格雷博语 |
N/A | grb | gbo | Niger–Congo › Atlantic–Congo |
Latin | Africa |
San Mateo del Mar Huave › 圣马特奥德马尔-瓦维语 |
N/A | N/A | huv | Language isolate |
Latin | North America |
Kakwa › 卡库瓦语 |
N/A | N/A | keo | Puinave-Maku › Northwestern Puinave-Maku |
Latin | Africa |
Kaqchikel › 喀克其奎语 |
N/A | myn | cki | Mayan › Quichean–Mamean |
Latin | North America |
Kaulong › 卡乌龙语 |
N/A | N/A | pss | Austronesian › Malayo-Polynesian |
Latin | Oceania |
Western Kayah › 西部克耶语 |
N/A | N/A | kyu | Sino-Tibetan › Karen |
Kayah Li Latin Myanmar |
Asia |
Kisiha › 斯哈语 |
N/A | N/A | jmc | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Nyakyusa › 尼亚库萨语 |
N/A | N/A | nyy | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Vunjo › 文约语 |
N/A | N/A | vun | Niger–Congo › Atlantic–Congo |
Latin | Africa |
Kulung › 库隆语 |
N/A | N/A | kle | Sino-Tibetan › Mahakiranti |
Devanagari | Asia |
Yi language › 彝语 |
ii | iii | iii | Sino-Tibetan › Lolo–Burmese |
Yi | Asia |
Mongolian › 蒙古语 |
mn | mon | mon | Mongolic | Traditional Mongolian | Asia |
Zhuang language › 壮语 |
za | zha | zha | Kra–Dai › Tai |
Zhuang Old Zhuang Sawndip Sawgoek |
Asia |
Esperanto › 世界语 |
eo | epo | epo | Constructed language |
Latin Esperanto Braille |
N/A |
Abkhaz › 阿布哈兹语 |
ab | abk | abk | Northwest Caucasian › Abkhaz–Abaza |
Cyrillic | Asia |
Aragonese › 阿拉贡语 |
an | arg | arg | Indo-European › Italic |
Latin | Europe |
Algerian Arabic › 阿尔及利亚阿拉伯语 |
N/A | N/A | arq | Afro-Asiatic › Semitic |
Arabic | Africa |
Assamese › 阿萨姆语 |
as | asm | asm | Indo-European › Indo-Iranian |
Eastern Nagari Ahom Assamese Braille Latin |
Asia |
Asturian › 阿斯图里亚斯语 |
N/A | ast | ast | Indo-European › Italic |
Latin | Europe |
Cornish › 康沃尔语 |
kw | cor | cor | Indo-European › Celtic |
Latin | Europe |
Malay trade and creole › 马来语克里奥尔语 |
N/A | crp | N/A | Creole | Latin | Asia Oceania |
Kashubian › 卡舒比语 |
N/A | csb | csb | Indo-European › Balto-Slavic |
Latin | Europe |
Lower Sorbian › 下索布语 |
N/A | dsb | dsb | Indo-European › Balto-Slavic |
Latin | Europe |
Canadian French › 加拿大法语 |
N/A | N/A | N/A | Indo-European › Italic |
Latin | North America |
Middle French › 中古法语 |
N/A | frm | frm | Indo-European › Italic |
Latin | Europe |
Franco-Provençal › 法兰克-普罗旺斯语 |
N/A | N/A | frp | Indo-European › Italic |
Latin | Europe |
Friulian › 弗留利语 |
N/A | fur | fur | Indo-European › Italic |
Latin | Europe |
Guarani › 瓜拉尼语 |
gn | grn | grn | Tupian › Tupi–Guarani |
Guarani Latin |
South America |
Chhattisgarhi › 恰蒂斯加尔语 |
N/A | N/A | hne | Indo-European › Indo-Iranian |
Devanagari Odia |
Asia |
Upper Sorbian › 上索布语 |
N/A | hsb | hsb | Indo-European › Balto-Slavic |
Latin Sorbian |
Europe |
Hupa › 胡帕语 |
N/A | hup | hup | Dené–Yeniseian | Latin | North America |
Interlingua › 因特语 |
ia | ina | ina | Constructed language |
Latin | N/A |
Interlingue › 西方国际语 |
ie | ile | ile | Constructed language |
Latin | N/A |
Ido › 伊多语 |
io | ido | ido | Constructed language |
Latin | N/A |
Jakun › 贾昆语 |
N/A | N/A | jak | Austronesian › Malayo-Polynesian |
Latin | Asia |
Lojban › 逻辑语 |
N/A | jbo | jbo | Constructed language |
Latin | N/A |
Greenlandic › 格陵兰语 |
kl | kal | kal | Eskimo–Aleut | Latin Scandinavian Braille |
North America |
Kanuri › 卡努里语 |
kr | kau | kau | Nilo-Saharan › Saharan |
Latin | Africa |
Kashmiri › 克什米尔语 |
ks | kas | kas | Indo-European › Indo-Iranian |
Perso-Arabic Devanagari Sharada |
Asia |
Lingua Franca Nova › 新通用语 |
N/A | N/A | lfn | Constructed language |
Latin Cyrillic |
N/A |
Limburgs › 林堡语 |
li | lim | lim | Indo-European › Germanic |
Latin | Europe |
Maithili › 迈蒂利语 |
N/A | mai | mai | Indo-European › Indo-Iranian |
Tirhuta Kaithi Devanagari |
Asia |
Mirandese › 米兰达语 |
N/A | mwl | mwl | Indo-European › Italic |
Latin | Europe |
Bokmål › 书面挪威语 |
nb | nob | nob | Indo-European › Germanic |
Latin | Europe |
Low German › 低地德语 |
N/A | nds | nds | Indo-European › Germanic |
Latin | Europe |
Nynorsk › 新挪威语 |
nn | nno | nno | Indo-European › Germanic |
Latin | Europe |
Southern Ndebele › 南恩德贝莱语 |
nr | nbl | nbl | Niger–Congo › Atlantic–Congo |
Latin Ndebele Braille |
Africa |
Northern Sotho › 北索托语 |
N/A | nso | nso | Niger–Congo › Atlantic–Congo |
Latin Sotho Braille |
Africa |
Occitan › 奥克语 |
oc | oci | oci | Indo-European › Italic |
Latin | Europe |
Pampanga › 邦板牙语 |
N/A | pam | pam | Austronesian › Malayo-Polynesian |
Latin Kulitan |
Asia |
Iranian Persian › 伊朗波斯语 |
N/A | N/A | pes | Indo-European › Indo-Iranian |
Perso-Arabic | Asia |
Plateau Malagasy › 高原马达加斯加语 |
N/A | N/A | plt | Austronesian › Malayo-Polynesian |
Latin Malagasy Braille |
Africa |
Brazilian Portuguese › 巴西葡萄牙语 |
N/A | N/A | N/A | Indo-European › Italic |
Latin Portuguese Braille |
South America |
Romansh › 罗曼什语 |
rm | roh | roh | Indo-European › Italic |
Latin | Europe |
Sanskrit › 梵语 |
sa | san | san | Indo-European › Indo-Iranian |
Devanagari Brahmic |
Asia |
Sardinian › 撒丁语 |
sc | srd | srd | Indo-European › Italic |
Latin | Europe |
Northern Sami › 北萨米语 |
se | sme | sme | Uralic › Finno-Ugric |
Latin Northern Sami Braille |
Europe |
Serbo-Croatian › 塞尔维亚-克罗地亚语 |
sh | N/A | hbs | Indo-European › Balto-Slavic |
Latin Cyrillic Yugoslav Braille |
Europe |
Shan › 掸语 |
N/A | shn | shn | Kra–Dai › Kam–Tai |
Burmese | Asia |
Swazi › 斯威士语 |
ss | ssw | ssw | Niger–Congo › Atlantic–Congo |
Latin Swazi Braille |
Africa |
Klingon › 克林贡语 |
N/A | tlh | tlh | Constructed language |
Latin Klingon |
N/A |
Toki Pona › 道本语 |
N/A | N/A | N/A | Constructed language |
N/A | N/A |
Walloon › 瓦隆语 |
wa | wln | wln | Indo-European › Italic |
Latin | Europe |
- ISO 639 is a standardized nomenclature used to classify languages. A language may be assigned a two-letter code (639-1) and three-letter codes (639-2 and 639-3), all lowercase; the code sets have been amended in later versions of the nomenclature[2].
- For several minority languages without an official Chinese name, we consult other reliable sources[3] or use a transliteration of the name.
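As a small illustration of how the three ISO 639 columns relate, the following Python sketch stores a few rows copied from the table above and looks a language up by any of its codes. The `LanguageCodes` record and the `find_by_code` helper are hypothetical names introduced only for this example, and `None` stands in for the table's N/A cells.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageCodes:
    """One table row: English name plus ISO 639-1/-2/-3 codes (None = N/A)."""
    name: str
    iso639_1: Optional[str]
    iso639_2: Optional[str]
    iso639_3: Optional[str]


# Rows copied from the table above; this is an illustrative subset only.
ROWS = [
    LanguageCodes("Guarani", "gn", "grn", "grn"),
    LanguageCodes("Chhattisgarhi", None, None, "hne"),
    LanguageCodes("Upper Sorbian", None, "hsb", "hsb"),
    LanguageCodes("Occitan", "oc", "oci", "oci"),
    LanguageCodes("Serbo-Croatian", "sh", None, "hbs"),
]


def find_by_code(code: str) -> Optional[LanguageCodes]:
    """Return the first row whose 639-1, 639-2 or 639-3 code equals `code`."""
    wanted = code.strip().lower()
    for row in ROWS:
        if wanted in (row.iso639_1, row.iso639_2, row.iso639_3):
            return row
    return None


print(find_by_code("oc").name)   # Occitan
print(find_by_code("hne").name)  # Chhattisgarhi
print(find_by_code("xx"))        # None
```

Keeping all three codes on one record makes it easy to see which languages, such as Chhattisgarhi, only have a 639-3 code.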
Corpora | Type | Language | Detail | Domain |
---|---|---|---|---|
DGT | Multilingual Parallel | bg cs da de el en es et fi fr ga hr hu it lt, etc. | 25 languages, 299 bitexts, 113.52M sents. | Law |
CCAligned | English at core | af ak am ar as ay az be bg bm bn br bs ca cb, etc. | 113 languages, 112 bitexts, 2.25G sents. | Web Document |
News-Commentary | Multilingual Parallel | ar cs de en es fr hi id it ja kk nl pt ru zh | 15 languages, 109 bitexts, 2.97M sents. | News Commentaries |
Europarl | Multilingual Parallel | bg cs da de el en es et fi fr hu it lt lv nl, etc. | 21 languages, 211 bitexts, 30.32M sents. | Parliament Proceedings |
wikimedia | Multilingual Parallel | ab ace ady af ak am an ang ar arc ary arz as ast atj, etc. | 306 languages, 2,575 bitexts, 31.62M sents. | Mixed |
EuroPat | Multilingual Parallel | de en es fr hr no pl | 7 languages, 21 bitexts, 143.74M sents. | Patent |
WikiMatrix | Multilingual Parallel | an ar arz as az azb ba bar be bg bn br bs ca ceb, etc. | 86 languages, 1,620 bitexts, 300.27M sents. | Wikipedia Article |
UNPC | Multilingual Parallel | ar en es fr ru zh | 6 languages, 15 bitexts, 172.04M sents. | Parliamentary Records |
MultiParaCrawl | Multilingual Parallel | bg ca cs da de el es et eu fi fr ga gl ha hr, etc. | 40 languages, 669 bitexts, 505.48M sents. | Mixed |
TildeMODEL | Multilingual Parallel | bg cs da de el en es et fi fr hr hu is it lt, etc. | 30 languages, 274 bitexts, 62.44M sents. | Mixed |
Tatoeba | English at core | ab acm ady af afb afh aii ain ajp akl aln alt am an ang, etc. | 366 languages, 3,632 bitexts, 9.52M sents. | Oral |
SETIMES | Multilingual Parallel | bg bs el en hr mk ro sq sr tr | 10 languages, 45 bitexts, 17.60M sents. | Official Documents |
Wikititles | Multilingual Parallel | ar bg cs da de el en es fa fi fr hu it ja ko, etc. | 23 languages, 506 bitexts, 24.25M sents. | Title |
OpenSubtitles | Multilingual Parallel | af ar bg bn br bs ca cs da de el en eo es et, etc. | 62 languages, 1,782 bitexts, 3.35G sents. | Subtitles |
XNLI | Multilingual Parallel | fr es de el bg ru tr ar vi th zh hi sw ur en | 15 languages, 105 bitexts, 1.5M sents. | Mixed |
Stanford | English at core | cs de vi | 3 languages, 3 bitexts, 20.3M sents. | Mixed |
UM-Corpus | Bilingual Parallel | en zh | 2.0M sents. | Mixed |
ASPEC | Bilingual Parallel | en ja | 3.0M sents. | Paper Abstract |
EVB | Bilingual Parallel | en vi | 10.0M sents. | Book |
IIT | Bilingual Parallel | en hi | 1.6M sents. | Mixed |
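The Detail column packs three numbers into one string. The Python sketch below parses a few of those strings, copied from the table, and compares the bitext count with the largest possible number of unordered language pairs, n(n-1)/2. It assumes that one bitext corresponds to one language pair, which the table does not state explicitly; `parse_detail` and `max_language_pairs` are hypothetical helper names.

```python
import re
from typing import Tuple

# Detail strings copied from the table above (illustrative subset).
DETAILS = {
    "DGT": "25 languages, 299 bitexts, 113.52M sents.",
    "EuroPat": "7 languages, 21 bitexts, 143.74M sents.",
    "UNPC": "6 languages, 15 bitexts, 172.04M sents.",
    "SETIMES": "10 languages, 45 bitexts, 17.60M sents.",
}

DETAIL_RE = re.compile(
    r"([\d,]+) languages, ([\d,]+) bitexts, ([\d.]+)\s*([MG]) sents\."
)
SCALE = {"M": 10**6, "G": 10**9}  # M = million, G = billion sentences


def parse_detail(detail: str) -> Tuple[int, int, int]:
    """Return (languages, bitexts, sentences) parsed from a Detail cell."""
    match = DETAIL_RE.fullmatch(detail.strip())
    if match is None:
        raise ValueError(f"unrecognised detail string: {detail!r}")
    languages = int(match.group(1).replace(",", ""))
    bitexts = int(match.group(2).replace(",", ""))
    sentences = round(float(match.group(3)) * SCALE[match.group(4)])
    return languages, bitexts, sentences


def max_language_pairs(languages: int) -> int:
    """Number of unordered pairs among n languages: n * (n - 1) / 2."""
    return languages * (languages - 1) // 2


for name, detail in DETAILS.items():
    langs, bitexts, sents = parse_detail(detail)
    print(name, langs, bitexts, max_language_pairs(langs), sents)
```

Under that assumption, UNPC (6 languages, 15 bitexts), EuroPat (7 languages, 21 bitexts) and SETIMES (10 languages, 45 bitexts) cover every possible pair, while DGT (25 languages, 299 bitexts) falls one short of the 300 possible pairs.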
- CCMT: The China Conference on Machine Translation (CCMT), formerly known as the China Workshop on Machine Translation (CWMT), is a flagship machine translation conference in China. Its evaluations focus mainly on Chinese, English and domestic minority languages (Mongolian, Tibetan, Uyghur, etc.) in domains such as news, spoken language and governmental documents. In addition, CCMT publishes all evaluation-related data online.
- WMT: WMT has been hosted annually by the Special Interest Group for Machine Translation (SIGMT) since 2006. Its evaluation campaigns focus on translation between English and more than ten other languages, such as German, Finnish, Czech, Romanian, Polish and Russian, in domains such as news, information technology and biomedicine. WMT publishes the evaluation resources specific to each evaluation task; they can be found at Shared Task-Provided Data.
- NIST: The NIST machine translation evaluation started in 2001 as part of the DARPA TIDES program. The evaluations are driven and coordinated by NIST as NIST OpenMT. The early evaluations mainly measured translation performance from languages such as Arabic and Chinese into English. Since 2016, NIST has also evaluated low-resource language technologies. Results of past MT evaluations, as well as the resources specific to each evaluation, can be accessed via the year-specific links.
- IWSLT: The International Conference on Spoken Language Translation (IWSLT), held annually since 2004, is an evaluation campaign dedicated to spoken language translation. The test data includes multilingual subtitles of TED talks and QED lectures. The languages involved include English, French, German, Czech, Chinese, Arabic and many others. Evaluation-related resources can be accessed at Shared Tasks-Training and Development Data.
- WAT: The Workshop on Asian Translation (WAT) is a new open evaluation campaign focusing on Asian languages. Eight successive workshops have been jointly held by the Japan Science and Technology Agency (JST), the National Institute of Information and Communications Technology (NICT) and other institutions. WAT focuses on translation from mainstream Asian languages (Chinese, Korean, Hindi, etc.) and English into Japanese, in a broad range of domains such as academic papers, patents, news and recipes. Datasets for evaluation can be accessed at Translation Task-Dataset via the year-specific links.
- [1] "How many languages are there in the world". Ethnologue: Languages of the World. Retrieved 29 April 2021.
- [2] List of ISO 639-1 codes. Wikipedia.org. Retrieved 29 April 2021.
- [3] Numeral Systems of the World's Languages. Retrieved June 2020.
- World languages and their names, language families, writing systems, and the number and distribution of their native speakers. Wikipedia.org. Retrieved March 21, 2020.
- Glottolog report for language families. Glottolog.org. Retrieved July 21, 2020.
- World Continents classification. Retrieved October 21, 2020.
- SIL report for languages and their codes. Retrieved March 20, 2021.
- Ethnologue report for language names, codes, writing systems, etc. Retrieved October 11, 2020.
- Wikidata report for language codes and writing systems. Retrieved May 10, 2021.
- Omniglot report on writing systems by language. Retrieved May 10, 2021.
- ScriptSource report on language features, writing systems, countries where each language is spoken, and language status. Retrieved May 10, 2021.