Document mapping between language names and codes #984

dan-zeman opened this issue Oct 31, 2023 · 0 comments
@dan-zeman
Member

@Stormur said in another issue:

As an aside, I report the fact that sometimes it is quite difficult to make out or find the correspondence between ISO codes and languages in UD, so maybe an indexing by ISO codes, a list of correspondences, and the specification of the code also on the various pages pertaining to that language (and on the main page) would be very welcome.

I agree that it would be useful to have this somewhere on the website, ideally autogenerated from the database that underlies the UD infrastructure and updated at release time. It could list all languages currently known to the system, regardless of whether they already have a treebank or a language-specific documentation page.

Note that if a treebank of the language appears on the home page (either already released or planned for the future), you can find the language code used by UD by clicking on the language name, then on the treebank name, and then inspecting the URL of the "Treebank hub page" link. For example, for the Cappadocian treebank (not yet released) the URL is https://universaldependencies.org/treebanks/cpg_amgic/index.html, meaning that the language code is cpg. If the language already has language-specific documentation, it can be accessed from the guidelines page, and again its URL reveals the language code. In fact, all languages known to the system can be seen on the pages where features, relations and auxiliaries are registered for the validator (e.g. here for auxiliaries); the language code is in each URL.
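
For illustration, the manual URL inspection described here could be scripted. The following is only a minimal sketch: it assumes that a hub page URL always has the shape `/treebanks/<code>_<treebank>/index.html`, as in the Cappadocian example, and the function name `language_code_from_hub_url` is hypothetical, not part of any UD tool.

```python
from urllib.parse import urlparse


def language_code_from_hub_url(url: str) -> str:
    """Return the language code embedded in a treebank hub page URL.

    Assumption (not confirmed by UD documentation): the path looks like
    /treebanks/<code>_<treebank>/index.html.
    """
    # Split the URL path into its non-empty segments.
    parts = [p for p in urlparse(url).path.split("/") if p]
    try:
        # The treebank folder is the segment right after "treebanks".
        folder = parts[parts.index("treebanks") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"not a treebank hub URL: {url}") from None
    # The language code is everything before the first underscore.
    code, sep, _ = folder.partition("_")
    if not sep or not code:
        raise ValueError(f"unexpected treebank folder name: {folder}")
    return code


print(language_code_from_hub_url(
    "https://universaldependencies.org/treebanks/cpg_amgic/index.html"
))
```

This only automates the last step (reading the code from a URL you already have); it does not help with finding the URL in the first place.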

But this approach is overcomplicated, and it does not provide a good solution for the reverse lookup, from a code to a language name.
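
To make the request concrete, here is a rough sketch of a lookup that works in both directions. The `LANGUAGES` dictionary is a hand-written stand-in for whatever database underlies the UD infrastructure (its actual format is not described in this issue), and the helper names are hypothetical. Only the `cpg` entry comes from the example above; the other two entries are added purely for illustration.

```python
# Hypothetical stand-in for the real UD language database (code -> name).
LANGUAGES = {
    "cpg": "Cappadocian",
    "cs": "Czech",
    "en": "English",
}

# Reverse index (lower-cased name -> code) for the opposite search.
CODES_BY_NAME = {name.lower(): code for code, name in LANGUAGES.items()}


def name_for_code(code: str) -> str:
    """Look up a language name by its code; raises KeyError if unknown."""
    return LANGUAGES[code]


def code_for_name(name: str) -> str:
    """Look up a code by language name, ignoring case; raises KeyError if unknown."""
    return CODES_BY_NAME[name.lower()]


def as_table() -> str:
    """Render a plain-text list of all known languages, sorted by code."""
    return "\n".join(f"{code}\t{name}" for code, name in sorted(LANGUAGES.items()))


print(name_for_code("cpg"))
print(code_for_name("Cappadocian"))
print(as_table())
```

A table like the one produced by `as_table()` could be regenerated at release time and would cover languages that have no treebank or documentation page yet.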

@dan-zeman dan-zeman added this to the v2.13 milestone Oct 31, 2023
@dan-zeman dan-zeman self-assigned this Oct 31, 2023
@dan-zeman dan-zeman modified the milestones: v2.13, v2.14 Nov 15, 2023
@dan-zeman dan-zeman modified the milestones: v2.14, v2.15 May 15, 2024
@dan-zeman dan-zeman modified the milestones: v2.15, v2.16 Nov 16, 2024