Use CLDR for language list source #2457

Closed
kmilos opened this issue Nov 28, 2014 · 9 comments
Labels: chore-dependency (Improvements to one of iD's dependencies), field (An issue with a field in the user interface), localization (Adapting iD across languages, regions, and cultures)

Comments

@kmilos

kmilos commented Nov 28, 2014

Currently iD only supports the 'name:sr' tag in the locale dropdown list, and it is wrongly labeled as 'Српски / Srpski' at that.

This locale is not intended to be used with both scripts, but Cyrillic only, and should be labeled as 'Српски' only, or 'Serbian (Cyrillic)'.

For Serbian written in Latin script, we have a separate locale 'name:sr-Latn' which should be added to iD dropdown as 'Srpski' only, or 'Serbian (Latin)'.

@bhousel
Member

bhousel commented Dec 2, 2014

I guess these come from data/wikipedia.json

Which seems to match what Wikipedia does:
http://meta.wikimedia.org/wiki/List_of_Wikipedias

@kmilos
Author

kmilos commented Dec 3, 2014

Thanks, I understand where this came from, but it is not a good way to make a distinction between a language (spoken word) and a locale (data written down) [1]. IMHO, Wikipedia painted itself into a corner wrt i18n and L10n with their early design misconception that language=locale.

See BCP 47 [1][2].

[1] http://icu-project.org/repos/icu/icuhtml/trunk/design/language_code_issues.html (note that RFC 3066 has been superseded by RFC 5646 since)
[2] http://www.w3.org/International/questions/qa-choosing-language-tags

@jfirebaugh
Member

Indeed, the list of languages is sourced from Wikipedia. @kmilos if you know of a better source, we can probably switch. We need something that provides English language name, native language name, and language code.

@kmilos
Author

kmilos commented Dec 22, 2014

@jfirebaugh
A good source for locale data is CLDR: http://www.unicode.org/cldr/charts/latest/supplemental/locale_coverage.html

Note the use of underscore as the delimiter there, whereas we've gone with sr-Latn as recommended in BCP 47. They also recommend that any parser support both - and _.
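
To illustrate the point about accepting both - and _, here is a minimal Python sketch of a normalizer that maps either delimiter to the hyphenated form used in tags like sr-Latn. The function name `normalize_tag` and the casing rules (4-letter subtag treated as a script, 2-letter subtag as a region) are assumptions for illustration only; this is not a full BCP 47 parser and not anything iD or CLDR ships.

```python
import re


def normalize_tag(tag: str) -> str:
    """Return a hyphen-delimited tag with conventional subtag casing.

    Hypothetical helper: accepts '-' or '_' as the delimiter.
    """
    parts = re.split(r"[-_]", tag.strip())
    out = [parts[0].lower()]  # language subtag is lowercase
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())  # script subtag, e.g. Latn
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())  # region subtag, e.g. RS
        else:
            out.append(part.lower())  # anything else is left lowercase
    return "-".join(out)
```

With this, `normalize_tag("sr_Latn")` and `normalize_tag("sr-latn")` both give `sr-Latn`, so CLDR-style identifiers and the tag suffixes used here would compare equal after normalization.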

@1ec5
Collaborator

1ec5 commented Jun 8, 2015

The wikipedia tag's value specifies a Wikipedia edition, not necessarily a language. Other examples of mismatches include zh-min-nan, which is used instead of the ISO code nan, and simple, which doesn't correspond to any ISO language code. So it isn't a great source for language codes.

@tmcw
Contributor

tmcw commented Apr 16, 2016

@1ec5 is CLDR the answer, then?

@1ec5
Collaborator

1ec5 commented Apr 17, 2016

CLDR would work, certainly, but it’s a very large package that contains much more than language names. Someone may’ve already created a subset of it specifically for this use case.

bhousel added the localization and field labels on Oct 9, 2016
bhousel changed the title from "Missing Serbian (Latin) locale for object name tag in dropdown" to "Use CLDR for language list source" on Oct 9, 2016
@bhousel
Member

bhousel commented Oct 9, 2016

Just a quick followup to this issue that it's now easy to pull in just the parts of CLDR that we want. It has been split into subpackages, automatically converted from xml to json, and published on npm:
https://github.com/unicode-cldr/cldr-json#package-organization
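
As a rough sketch of how the three fields requested earlier (English name, native name, language code) might be assembled from that JSON: the nesting below (`main` > locale > `localeDisplayNames` > `languages`) mirrors the shape of the locale-names data as I understand it, but the tiny dictionaries are hand-written stand-ins rather than real package contents, and `language_entry` is a hypothetical helper.

```python
# Hand-written stand-ins for two per-locale "languages" files (assumed shape).
EN_DATA = {"main": {"en": {"localeDisplayNames": {"languages": {"sr": "Serbian"}}}}}
SR_DATA = {"main": {"sr": {"localeDisplayNames": {"languages": {"sr": "српски"}}}}}


def language_entry(code: str, english_data: dict, native_data: dict) -> dict:
    """Combine the English name and the self-name of one language code."""
    english_names = english_data["main"]["en"]["localeDisplayNames"]["languages"]
    # The native name is how the locale's own data names itself.
    native_names = native_data["main"][code]["localeDisplayNames"]["languages"]
    return {
        "code": code,
        "english": english_names.get(code),
        "native": native_names.get(code),
    }


entry = language_entry("sr", EN_DATA, SR_DATA)
```

In practice the two dictionaries would be loaded with `json.load` from whichever subpackage files are actually pulled in; the file layout is not shown here because it is not confirmed in this thread.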

@1ec5
Collaborator

1ec5 commented Dec 27, 2017

Per #4632, we’ll need to introduce some iD-specific overrides to support name:* tags that are qualified with an ISO 15924 or ad-hoc script code, such as name:sr-Latn or name:zh-hant.
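
One possible shape for such overrides, as a hedged Python sketch: a small hand-maintained table consulted before the CLDR-derived names, matched case-insensitively so that ad-hoc spellings like zh-hant still resolve. The table contents, the "Chinese (Traditional)" label, and the `label_for_name_key` helper are all assumptions for illustration, not what iD does.

```python
# Stand-in for names derived from CLDR (illustrative values only).
BASE_NAMES = {"sr": "Serbian", "zh": "Chinese"}

# Hypothetical iD-specific overrides for script-qualified name:* suffixes.
OVERRIDES = {
    "sr-Latn": "Serbian (Latin)",
    "zh-hant": "Chinese (Traditional)",
}


def label_for_name_key(key, base=BASE_NAMES, overrides=OVERRIDES):
    """Return a display label for a 'name:<code>' key, or None if unknown."""
    if not key.startswith("name:"):
        return None
    code = key[len("name:"):]
    # Overrides win, compared case-insensitively to tolerate ad-hoc casing.
    lowered = {k.lower(): v for k, v in overrides.items()}
    if code.lower() in lowered:
        return lowered[code.lower()]
    return base.get(code)
```

Keeping the overrides separate from the generated list means the CLDR-derived data can be regenerated without losing the hand-written entries.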
