cld2 edited this page Jul 28, 2015 · 2 revisions

CLD2 detects 175 language-script combinations in Unicode text. I wanted to make a nice chart showing all the languages and how statistically close they are to one another: the detector has an easy time with languages that are quite distinct and a hard time distinguishing languages that are quite close.

See the full writeup at
https://docs.google.com/document/d/1NtErs467Ub4yklEfK0C9AYef06G_1_9NHL5dPuKIH7k/edit
