Key based i18n vs default language i18n #50
Comments
What is the context of your question? Like,
|
It's less of an issue for documents, where there is an original language they were written in, but it is very relevant when it comes to translating UIs and templates. |
@LaurentGoderre so far we've been moving towards managing i18n based on locale rather than jumping from a default language, much like in this example from the Electron project. This seems to be the logical move to make for easy correlation between our Crowdin projects and the i18n module repo. For any Node.js UI that imports this module, the user's locale selection will determine which translation to render, with any untranslated text gracefully falling back to English.

However, that brings up the really interesting point that I think you're getting at. Does every bit of l10n for a document or template need to be translated directly from English? I'm pretty sure we can safely assume that there are certain wordings and phrases that would be better off rewritten in another language than translated directly! It looks like we've already garnered concern for this with awesome people who care about it!

I'm currently of the opinion that, for the sake of both pragmatism & empathy over time, we can do a hybrid. Translating most content directly from English will be the quickest way to achieve i18n at scale, but letting all (English) Node.js content be subject to native-l10n will be more effective for everyone.

With this, I don't see a way around having careful, case-by-case conversations with each l10n group moving forward. We'd need to provide a clear directive for all l10n groups so they can feel free to propose generating textual content in their own wording to the i18n WG. This would also probably require us to maintain a process for understanding what each native-l10n translation is communicating, and to verify it with the maintainers of the affected source before merging (e.g. API docs, Node.js website, etc).

Thank you so much for bringing this up! This is a sizable oversight! 🍻 |
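As a rough illustration of the locale-driven lookup with graceful fallback to English described above, here is a minimal sketch. The `catalogs` object, the `fallbackChain` and `render` helpers, and the sample strings are all hypothetical; this is not the API of the i18n module or of Crowdin.

```javascript
// Hypothetical catalogs; in practice these would be loaded from translation files.
const catalogs = {
  en: { greeting: 'Hello World', farewell: 'Goodbye' },
  es: { greeting: 'Hola Mundo' }, // 'farewell' has not been translated yet
};

// Build the lookup order for a locale, e.g. 'es-MX' -> ['es-MX', 'es', 'en'].
function fallbackChain(locale, fallbackLocale = 'en') {
  const chain = [locale];
  const language = locale.split('-')[0];
  if (language !== locale) chain.push(language);
  if (!chain.includes(fallbackLocale)) chain.push(fallbackLocale);
  return chain;
}

// Return the first available translation along the chain.
function render(key, locale) {
  for (const candidate of fallbackChain(locale)) {
    const text = catalogs[candidate]?.[key];
    if (text !== undefined) return text;
  }
  return key; // nothing found anywhere: surface the key itself
}

console.log(render('greeting', 'es-MX')); // Hola Mundo
console.log(render('farewell', 'es-MX')); // Goodbye
```

Untranslated strings fall through to English one key at a time, so a partially translated locale still renders completely.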
Can you provide more context to this, and how it might affect us? Thanks a bunch! 🙂 |
@nodejs/i18n native-l10n is an important consideration that I think we've been overlooking^ |
I think there are two sets of issues here. The one in the original issue is (if I can restate it, @LaurentGoderre) choosing one of:

Plan A: Source content is written in some language (English) without regard to translation.

Example content:

    console.log('Hello World');

Example translated model:

    {
      "Hello World": "Hola Mundo"
    }

Updated code:

    console.log(CONVERT_TO_FOREIGN_LANGUAGE("Hello World"));

Plan B: Use keys

Example:

    {
      "es": { "greeting": "Hola Mundo" },
      "en": { "greeting": "Hello World" }
    }

    console.log(FETCH_THIS("greeting"));

Plan A is usually adopted because translation is a 'retrofit' and/or we don't want to bother the coders with the detail of these languages. (I use 'foreign' sarcastically here.) It has many, many problems:
Like daylight savings time/summer time, Plan A is a bad idea, but extremely popular. (Yes, I have opinions here.)

Plan B usually encounters resistance because it seems complex upfront. But really, it just means thinking carefully about what you are presenting to users. Besides the bias issue of whether non-source-language people are first- or second-class users, you also have the issue of separating out the logic from the content. See for example Node's own error system, nodejs/node#11220: instead of people comparing the strings of error messages, the keys become a natural way to work with the errors semantically.

The second set of issues has to do with English vs. non-English. In my mind, the source language (which is why I say source language instead of English wherever possible) is a development team decision. The 'default' language doesn't have to be English, and it sometimes isn't in practice even if the source language is. For, say, a French national company, the default language in the absence of other information is likely to be … French. Ideally tools and processes are designed to NOT hardcode English or treat it specially. Yes I have been known to hardcode

Just my ¤2.00 (<<< Substitute your appropriate currency here) |
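To make the Plan A / Plan B contrast concrete, here is a small sketch of one of the ways Plan A tends to break: a copy edit to the source text silently orphans its translation. The names `convertToTranslated` and `fetchMessage` are hypothetical stand-ins for the `CONVERT_TO_FOREIGN_LANGUAGE` and `FETCH_THIS` placeholders above, and `sourceLocale` is a parameter rather than a hardcoded `'en'` to reflect the point that the source language is a team decision.

```javascript
// Plan A: the source-language string doubles as the lookup key.
const planA = { 'Hello World': 'Hola Mundo' };
function convertToTranslated(sourceText) {
  return planA[sourceText] ?? sourceText;
}

// Plan B: stable keys, with every language (including the source) stored as data.
const planB = {
  es: { greeting: 'Hola Mundo' },
  en: { greeting: 'Hello World' },
};
function fetchMessage(key, locale, sourceLocale = 'en') {
  return planB[locale]?.[key] ?? planB[sourceLocale]?.[key] ?? key;
}

// A copy edit to the source text no longer matches the Plan A entry...
console.log(convertToTranslated('Hello World'));   // Hola Mundo
console.log(convertToTranslated('Hello, World!')); // Hello, World! (untranslated)

// ...while the Plan B key is unaffected by wording changes in any language.
planB.en.greeting = 'Hello, World!';
console.log(fetchMessage('greeting', 'es')); // Hola Mundo
console.log(fetchMessage('greeting', 'en')); // Hello, World!
```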
Wow! thanks @srl295 for explaining so well the rationale I have been putting off writing! |
On the Electron project, we use a combination of approaches.

For localized strings on Electron's website, we use a locale.yml file with arbitrarily named keys. These keys are referred to in the website's HTML templates like

The rest of the Electron website's translated content lives in markdown files from the electron/electron repo. We send these documents to Crowdin in their entirety. We initially tried to avoid accidental translation of untranslatable strings like

Because whole markdown files are now exposed to translators, there is more of a risk of certain content being translated that should really be left alone, code snippets being the most prominent example. We've taken a few measures to help avoid confusion about what should and shouldn't be translated:
|
@zeke @LaurentGoderre @srl295 thanks for your terrific insights! 🙌
Maintaining a close relationship with individuals in designated proofreader roles for each l10n team is going to be necessary in order to help us isolate which strings need their own source-localization. The concept of 'proofreader' & 'translator' roles is something we should probably bake into our new l10n guidelines so we can get the ball rolling with that.
Maybe we can run a script in the CI that injects warnings for translators as comments above any string that contains reserved terms in the markdown files before they migrate to Crowdin. That way we wouldn't have to maintain a glossary to reference while translating in Crowdin, but only our own list of terms that shouldn't be translated. eg.
This seems like a straightforward way to do this, and maybe we can extend it with a locale-key based approach similar to @srl295's eg. It might also be beneficial to make this a meta-process that will cover multiple i18n initiatives (eg. API docs, Web Site, etc). We might opt for adding a project key as well. eg.
Pragmatically, I think that making English the working source language is going to help us achieve i18n at scale the quickest, given that the alternative (though more ideal) may possibly require refactoring efforts in existing Node.js source to accommodate the kind of templating needed to support an absence of a 'root' language (please correct me if I'm wrong). If we were able to determine somehow that we wouldn't be asking very much of core maintainers & other initiatives, it might be a rad way to go. Granted though, Node.js has a lot of contributors so it might not actually be that painful.
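The CI idea mentioned above (injecting translator warnings as comments above any line that contains a reserved term) could look roughly like the following. This is only a naive sketch under stated assumptions: the `reservedTerms` list, the `annotateReservedTerms` function, and the wording of the warning comment are all hypothetical, and it assumes HTML comments in the markdown source are visible to translators. It also does not skip fenced code blocks.

```javascript
// Hypothetical list of terms that translators should leave untouched.
const reservedTerms = ['Node.js', 'npm', 'EventEmitter'];

// Insert an HTML comment above every line that mentions a reserved term.
function annotateReservedTerms(markdown, terms = reservedTerms) {
  const output = [];
  for (const line of markdown.split('\n')) {
    const found = terms.filter((term) => line.includes(term));
    if (found.length > 0) {
      output.push(`<!-- Do not translate: ${found.join(', ')} -->`);
    }
    output.push(line);
  }
  return output.join('\n');
}

const sample = [
  '# Getting started',
  'Install packages with npm before running Node.js.',
  'Then open the project folder.',
].join('\n');

console.log(annotateReservedTerms(sample));
```

Keeping the list of terms in the repo means the warnings travel with the source files instead of living in a separately maintained glossary.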
Yes! 👍 We'll need to rely on our l10n teams to inform us when they think their versions are better than the original source. Here are a couple of ways I think we could handle this:
|
Have you considered using something called "Engineering English"? It's the best of both worlds (keys and English strings). The concept is that you define a default language called "Engineering English" where the engineers write the key in English to convey the meaning. It reads well. But to avoid changing keys whenever the string changes, those keys are fixed. If you want to change the English translation, you override it in the translation file. For example, you might have:
but then you want it to say "Greetings Earth" instead so you would do this:
You keep the key the same since it still is the same concept to an engineer but the output is different. |
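A minimal sketch of how the "Engineering English" idea might be wired up, assuming a hypothetical `translations` object and `t` helper (neither comes from a specific library). The key stays `'Hello World'`, and the English catalog overrides what is actually displayed.

```javascript
// The key is written in plain English by the engineer and never changes.
const translations = {
  en: { 'Hello World': 'Greetings Earth' }, // override: the English wording changed
  es: { 'Hello World': 'Hola Mundo' },
};

function t(key, locale) {
  // Fall back to the key itself, which already reads as English.
  return translations[locale]?.[key] ?? translations.en?.[key] ?? key;
}

console.log(t('Hello World', 'en')); // Greetings Earth
console.log(t('Hello World', 'es')); // Hola Mundo
console.log(t('Save file', 'es'));   // Save file
```

Because an unknown key falls back to itself, new strings are readable before any translation file mentions them.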
I like that idea, @oppianmatt. But I would favor using key strings that lend themselves to easy addressability in JS and in templating languages. Spaces make things tricky, so instead of |
This issue should be closed, because we're not using any of these approaches. |
@alexandrtovmach what approach do you use in the end? |
There are usually two camps when it comes to managing i18n: using a default language (usually English) or using keys (also usually in English, but they are more like variables).
Has there been a decision on which one to use already?