
Strings and Localization

Brian Clifton edited this page Sep 2, 2021 · 57 revisions

Most strings are provided as resources, so that they can be translated into the many languages that Brave supports.

They can be split into two categories: those that are provided and referenced only from Brave code, and those that are provided and referenced from within Chromium code.

Brave-only strings

All Brave-only strings are referenced from one of the files listed in braveNonGeneratedPaths in lib/l10nUtil.js.

Modifications and additions should be made directly to one of the files listed there. Modified or added strings are translated only once per release cycle.

Chromium strings

Any Chromium strings we do not want to modify

No action necessary. Original and translated strings from the Chromium src .grd, .grdp, and .xtb files are used in place.

Any Chromium strings we want to modify for Brave

For each Chromium release, the Chromium GRD and GRDP files that we wish to modify are copied to a file inside the brave-core source. Some mappings include:

Chromium path                         Brave path
chrome_app_chromium_strings.grd       app/brave_strings.grd
chrome_app_generated_resources.grd    app/generated_resources.grd
chrome_app_bookmarks_strings.grdp     app/bookmarks_strings.grdp
  • If the GRD or GRDP is not yet present in brave-core:
    • Add a path to brave-browser/lib/l10nUtil.js
    • Add an entry to chromiumToAutoGeneratedBraveMapping map
    • Add a mapping to get_original_grd function in brave/script/lib/transifex.py
  • Add any whole-file string replacement rules to the rebaseBraveStringFilesOnChromiumL10nFiles function
    • Add any specific xml transforms to chromium-rebase-l10n.py
  • In Brave-Browser, run npm run chromium_rebase_l10n
    • This will output a new or modified .grd[p] file which should be committed.
    • _override.grd files are generated automatically (please don't make edits to them). A series of regex replacements is applied (ex: Chrome => Brave), and these _override.grd files are overwritten each time the rebase runs.

The modified string will then get translated when the next release train visits the localization station!
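
The regex replacements mentioned above (ex: Chrome => Brave) can be pictured with a short Python sketch. The rule list and the function name apply_branding_rules are hypothetical and are not taken from chromium-rebase-l10n.py or l10nUtil.js; the real rules are likely more extensive.

```python
import re

# Hypothetical rules for illustration only; the real replacement rules
# live in the Brave l10n scripts and may differ.
# Longer product names come first so "Google Chrome" is handled as a unit.
BRANDING_RULES = [
    (re.compile(r"\bGoogle Chrome\b"), "Brave"),
    (re.compile(r"\bChromium\b"), "Brave"),
    (re.compile(r"\bChrome\b"), "Brave"),
]


def apply_branding_rules(text: str) -> str:
    """Apply each rule in order and return the rewritten string."""
    for pattern, replacement in BRANDING_RULES:
        text = pattern.sub(replacement, text)
    return text


print(apply_branding_rules("Google Chrome is up to date"))
```

Because such rules run over the whole file on every rebase, hand edits to a generated _override.grd file would be lost, which is why edits belong in the rules instead.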

Modifying English strings invalidates all translations

Modifying English source strings in GRD files will invalidate all translations of that string, since the translations reference the original string hash. All users will then only see the English fallback string until the next translation process is performed during a release cycle.
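
Why an edit invalidates translations can be shown with a small Python sketch. The message_id function here is a stand-in built on hashlib, not GRIT's actual fingerprint algorithm, and the sample strings are made up for illustration.

```python
import hashlib


def message_id(english: str) -> str:
    """Stand-in fingerprint: any stable hash of the English text."""
    return hashlib.sha256(english.encode("utf-8")).hexdigest()[:16]


def localize(english: str, translations: dict) -> str:
    """Return the translation keyed by the string's ID, else the English text."""
    return translations.get(message_id(english), english)


# Pretend translation table for one language, keyed by message ID.
french = {message_id("Open new tab"): "Ouvrir un nouvel onglet"}

print(localize("Open new tab", french))    # ID matches: translation is used
print(localize("Open a new tab", french))  # edited source: English fallback
```

Even a one-word change produces a different ID, so the existing translation is no longer found until the string is translated again.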

Language files

These are .xtb files containing xml elements of the form:

<translation id="[number]">[Translated Text]</translation>

The Translation ID refers to the unique fingerprint of the original string as it appears in the source GRD file. This means that multiple strings that have the exact same English value in the source GRD file will only get translated once per language per GRD file.
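
As a rough illustration, the translation elements can be read into an ID-to-text map with Python's standard library. The translationbundle wrapper, the IDs, and the function name load_translations are assumptions for this sketch; real .xtb files contain more than is shown here.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the IDs are not real fingerprints.
SAMPLE_XTB = """<translationbundle lang="fr">
<translation id="1234567890">Nouvel onglet</translation>
<translation id="9876543210">Signets</translation>
</translationbundle>"""


def load_translations(xtb_text: str) -> dict:
    """Map each translation ID to its translated text."""
    root = ET.fromstring(xtb_text)
    # itertext() also collects text nested inside child elements.
    return {
        node.get("id"): "".join(node.itertext())
        for node in root.iter("translation")
    }


translations = load_translations(SAMPLE_XTB)
print(translations["1234567890"])
```

Since the map is keyed by ID alone, two source strings with identical English text in the same GRD file resolve to the same entry.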

Transifex

Strings are matched using filename (not path). This is something we control in transifex.py. Within each file, each unique string from a GRD is translated and pulled down. It is then stored in a corresponding .xtb file with the translation ID (unique fingerprint of the original string).
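
Matching by filename rather than path can be sketched as follows. The function name resource_name and the second sample path are hypothetical; the actual logic in transifex.py may differ.

```python
import posixpath


def resource_name(grd_path: str) -> str:
    """Derive a resource name from the file name only, ignoring directories."""
    base = posixpath.basename(grd_path)
    return posixpath.splitext(base)[0]


print(resource_name("app/brave_strings.grd"))
print(resource_name("some/other/dir/brave_strings.grd"))  # hypothetical path
```

Under this assumption, two GRD files with the same file name in different directories would map to the same resource, so file names need to be unique.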

GRD Tips

  • Provide a descriptive desc="" attribute on each <message> element, informing the translator of the context for the string.
    • Mention if it should be title case or not
    • Specify which Proper Nouns should not be translated.
  • Use translateable="false" if the whole string should not be translated.
  • Use <if expr=""> to inform the compiler which string to select. This is useful for platform variations or platform-specific strings. For example:
    • Strings that should have a different case on different systems such as title-case for macOS menus and lower-case for menus on other platforms (<if expr="use_titlecase">)
    • Strings that only appear on a single platform, such as Android (<if expr="is_android">)
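
The tips above can be illustrated with a small GRD-like fragment and a Python check that flags messages lacking a desc. The message names, the wrapper element, and the function name messages_missing_desc are made up for this sketch; it is not an existing Brave or GRIT tool.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; names and text are hypothetical.
SAMPLE_GRD = """<messages>
  <message name="IDS_EXAMPLE_GREETING" desc="Welcome page heading. Do not translate 'Brave'.">
    Welcome to Brave
  </message>
  <message name="IDS_EXAMPLE_PRODUCT" desc="Product name." translateable="false">
    Brave
  </message>
  <if expr="use_titlecase">
    <message name="IDS_EXAMPLE_MENU_ITEM" desc="Menu item, title case.">
      New Private Window
    </message>
  </if>
  <if expr="not use_titlecase">
    <message name="IDS_EXAMPLE_MENU_ITEM">
      New private window
    </message>
  </if>
</messages>"""


def messages_missing_desc(grd_text: str) -> list:
    """Return the names of messages with a missing or empty desc attribute."""
    root = ET.fromstring(grd_text)
    return [
        message.get("name")
        for message in root.iter("message")
        if not (message.get("desc") or "").strip()
    ]


print(messages_missing_desc(SAMPLE_GRD))
```

Here the lower-case variant of the menu item is reported because it has no desc, even though its title-case sibling does.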

Information for submitting localization orders:

  • Log in to Transifex and navigate to the Brave dashboard: https://www.transifex.com/brave/brave/dashboard/
  • Click Order
  • You can pick from 3 providers: gengo, TextMaster, and e2f
  • You will repeat this process 3 times, once for each provider. Start with gengo.
  • Some services ask for a "Tone" to translate in; usually I select something like Technical and Software. This screen also allows you to write a note to the localizer. Put this in that box:
Please read this!

https://github.com/brave/brave-browser/wiki/Strings-and-Localization#information-for-localizers
  • Press Next. The page may take a minute or two to load, probably because we have a lot of strings.
  • Uncheck All, then select only these resources for translation. The rest come from Chromium and are merely stored in Transifex:
    • android_brave_strings,
    • brave_components_resources,
    • brave_extension,
    • brave_generated_resources,
    • ethereum_remote_client_extension,
    • rewards_extension, and
    • all *_override files.
    • Leave the rest unchecked!
  • The officially supported languages to order are: af, am, ar, as, be, bg, bn, bs, ca, cs, da, de, el, en-GB, es, es-419, et, eu, fa, fi, fil, fr, fr-CA, gl, gu, hi, hr, hu, hy, id, is, it, he, hr-Latn, ja, ka, kk, km, kn, ko, ky, lo, lt, lv, mk, ml, mn, mr, ms, my, ne, nl, no, or, pa, pl, pt-BR, pt-PT, ro, ru, si, sk, sl, sq, sr, sv, sw, ta, te, th, tr, uk, ur, uz, vi, zh-CN, zh-HK, zh-TW, zu
  • Here are the things we currently select:
# Gengo

Arabic        (ar)
Bulgarian     (bg)
Catalan       (ca)
Chinese-TW    (zh-TW)
Czech         (cs)
Danish        (da)
Dutch         (nl)
Finnish       (fi)
French        (fr)
French-CA     (fr-CA)
German        (de)
Greek         (el)
Hebrew        (he)
Hindi         (hi)
Hungarian     (hu)
Indonesian    (id)
Italian       (it)
Japanese      (ja)
Korean        (ko)
Malay         (ms)
Norwegian     (no)
Polish        (pl)
Portuguese-BR (pt-BR)
Romanian      (ro)
Russian       (ru)
Serbian       (sr)
Slovak        (sk)
Spanish       (es)
Swedish       (sv)
Thai          (th)
Turkish       (tr)
Ukrainian     (uk)
Vietnamese    (vi)

# TextMaster
Portuguese-PT (pt-PT)

# e2f

Amharic    (am)
Bengali    (bn)
Chinese-CN (zh-CN)
Croatian   (hr)
English-GB (en-GB)
Estonian   (et)
Filipino   (fil)
Galician   (gl)
Kannada    (kn)
Latvian    (lv)
Lithuanian (lt)
Persian    (fa)
Slovenian  (sl)
Spanish-LA (es-419)
Swahili    (sw)
  • The total cost is usually in the hundreds to thousands range, but usually below 10k. If it's above 10k then please get special approval and make sure everything is being selected correctly.

Information for localizers:

  • Entities should be encoded. For example, <b>test</b> would be &lt;b&gt;test&lt;/b&gt;. Note that entities start with an ampersand and end with a semicolon, with no spaces in between. Sometimes you may encounter incorrectly double-escaped encodings like &amp;lt; (instead of &lt;); it's OK to leave those as they are.

  • Sometimes strings have variables (placeholders) which look like the graphic below.
    [screenshot: a source string containing placeholders]
    In these cases you should:

    • Use < and >, not &lt; and &gt;, with ph and ex elements (and only those elements!). Even though the source looks like &lt;ph, the translation should use < and >.
    • The text inside ex should not be included in the translation - it's an example of values that would be used in this variable.
    • In the example in the graphic above, the translation should look like this:
      Source: &lt;ph name="EXTENSION_NAME"&gt;$1&lt;/ph&gt; (extension ID "&lt;ph name="EXTENSION_ID"&gt;$2&lt;ex&gt;abacabadabacabaeabacabadabacabaf&lt;/ex&gt;&lt;/ph&gt;") is not allowed in Brave.
      Translation: <ph name="EXTENSION_NAME">$1</ph> (translate this "<ph name="EXTENSION_ID">$2<ex>do not translate this</ex></ph>") translate this too.
    • You may encounter other encoded tags inside the ph tags: for example, &lt;ph name="BEGIN_BOLD1"&gt;&amp;lt;b1&amp;gt;&lt;/ph&gt;. These encoded tags inside ph tags should be left encoded, in this case resulting in:
      Source: &lt;ph name="BEGIN_BOLD1"&gt;&amp;lt;b1&amp;gt;&lt;/ph&gt;
      Translation: <ph name="BEGIN_BOLD1">&amp;lt;b1&amp;gt;</ph>
  • Sometimes strings support single/plural versions. Such strings support ICU plural rules and look similar to this:

    {COUNT, plural,
              =0 {Open all in &amp;private window}
              =1 {Open in &amp;private window}
              other {Open all ({COUNT}) in &amp;private window}}
    Only the message text inside each inner pair of braces (bold in the original example, such as Open all in &amp;private window) should be translated. The terms COUNT (or any other term after the initial '{'), plural, and other should NOT be translated.
  • If you see &#8217; in the text, that is an encoded apostrophe (’). For example, you&#8217;re is just you're; in such cases you don't need to preserve the encoded sequence in the translation.
    Other common codes:

    • &#8211; - en dash (–)
    • &#8212; - em dash (—)
    • &#8216; - opening single quote (‘)
    • &#8230; - ellipsis (…)
    • Note: standalone &amp; (or &amp;amp;), &lt; (or &amp;lt;), or &gt; (or &amp;gt;), if needed in the localized string, should be left as they are. For example, Bob &amp; Jim, or System &gt; Settings should stay that way.
  • Do not translate branded feature names, such as "Brave", "Brave Rewards", and "Brave Ads".

  • Do not translate terms inside double square or squiggly brackets - they are placeholders (for example [[user]] or {{user}} should be left as is).
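
The placeholder rules above lend themselves to a quick automated check. The sketch below decodes the entity-encoded source and verifies that a translation keeps the same ph names in unencoded form. The function names are hypothetical and this is not an existing Brave or Transifex tool.

```python
import html
import re

PH_NAME = re.compile(r'<ph name="([^"]+)">')


def placeholder_names(text: str) -> list:
    """Return the names of all unencoded ph elements, in order of appearance."""
    return PH_NAME.findall(text)


def same_placeholders(source_encoded: str, translation: str) -> bool:
    """True when the translation keeps exactly the source's placeholders."""
    source = html.unescape(source_encoded)  # turns &lt;ph ...&gt; into <ph ...>
    # Sorted, because a translation may legitimately reorder placeholders.
    return sorted(placeholder_names(source)) == sorted(placeholder_names(translation))


SOURCE = (
    '&lt;ph name="EXTENSION_NAME"&gt;$1&lt;/ph&gt; (extension ID '
    '"&lt;ph name="EXTENSION_ID"&gt;$2&lt;ex&gt;abacabadabacabaeabacabadabacabaf'
    '&lt;/ex&gt;&lt;/ph&gt;") is not allowed in Brave.'
)
GOOD = (
    '<ph name="EXTENSION_NAME">$1</ph> (translate this '
    '"<ph name="EXTENSION_ID">$2<ex>do not translate this</ex></ph>") '
    'translate this too.'
)
STILL_ENCODED = SOURCE  # a translation that wrongly left &lt;ph ... &gt; encoded

print(same_placeholders(SOURCE, GOOD))
print(same_placeholders(SOURCE, STILL_ENCODED))
```

The second call fails the check because the ph elements are still entity-encoded, which is the most common mistake the rules above guard against.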

When to submit orders and what to do when they're done

For Desktop, the ideal time to submit an order would be 1 week before the release. Per the release schedule, this lines up with the migration date. We have submitted orders with 4 days left before a release, but that's very tight. The translations themselves take a while to complete.

When submitting, we can submit the strings for master. When those are complete, we'd pull them into master and then look at back-porting the strings. If the release channel is 1.2 and nightly is 1.5, we'd submit uplifts for 1.4 back to 1.2 with the string changes. This can cause problems if the names of the strings (the keys) change.
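
The uplift range in that example can be worked out mechanically. This sketch assumes that every minor version between the release channel and nightly has its own branch, which may not always hold; the function name uplift_targets is hypothetical.

```python
def uplift_targets(release_minor: int, nightly_minor: int) -> list:
    """List the 1.x versions needing an uplift, newest first.

    Nightly itself is excluded because it already has the strings from master.
    """
    return [
        f"1.{minor}"
        for minor in range(nightly_minor - 1, release_minor - 1, -1)
    ]


print(uplift_targets(2, 5))
```

For release 1.2 and nightly 1.5 this lists 1.4, 1.3, and 1.2, matching the example above.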

Once translations are ordered, we should make every effort to NOT make string changes to the product. The strings in the product should be considered frozen by reviewers.

Localizing extensions

Ethereum Remote Client

These steps should be done any time we have new strings and any time we rebase on top of MetaMask.

Pushing changes to Transifex:

npm run push_l10n -- --extension=ethereum-remote-client

Pulling changes from Transifex:

npm run pull_l10n -- --extension=ethereum-remote-client

A rule for developing with strings for this extension: never change the text of a MetaMask source string. Instead, change the l10n ID that's being used and add the string to brave/app/_locales/en/messages.json.

The officially supported locales are the same as in Brave and are listed above (https://github.com/brave/brave-browser/wiki/Strings-and-Localization#information-for-submitting-localization-orders). When making an order, select both of these projects:
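
The rule can be sketched in Python. The message IDs, strings, and helper name below are hypothetical; only the {"id": {"message": ...}} shape follows the usual extension messages.json layout.

```python
import json

# Hypothetical upstream (MetaMask) entry that must stay untouched.
METAMASK_MESSAGES = {"welcome": {"message": "Welcome to MetaMask"}}


def add_brave_string(brave_messages: dict, new_id: str, text: str) -> dict:
    """Add a Brave-specific string under a new ID instead of editing upstream text."""
    if new_id in METAMASK_MESSAGES:
        raise ValueError(f"{new_id} already exists upstream; pick a new ID")
    updated = dict(brave_messages)
    updated[new_id] = {"message": text}
    return updated


brave_messages = add_brave_string({}, "braveWelcome", "Welcome to Brave")
print(json.dumps(brave_messages, indent=2))
```

The calling code would then reference braveWelcome rather than welcome, so the upstream string and its translations stay intact across rebases.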

  • ethereum_remote_client_extension
  • brave_ethereum_remote_client_extension

Note that the supported locales don't fully match MetaMask's list.

MetaMask supports hn, ht, and ph, which Brave does not. It also uses tml for ta, although this is changing.
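
These differences could be captured in a small lookup, sketched here in Python. The locale codes come from the paragraph above; the function name to_brave_locale is hypothetical.

```python
# Locales MetaMask ships that Brave does not support.
METAMASK_ONLY = {"hn", "ht", "ph"}
# MetaMask codes that differ from the code Brave uses.
METAMASK_TO_BRAVE = {"tml": "ta"}


def to_brave_locale(metamask_code: str):
    """Return Brave's code for a MetaMask locale, or None if Brave lacks it."""
    if metamask_code in METAMASK_ONLY:
        return None
    return METAMASK_TO_BRAVE.get(metamask_code, metamask_code)


for code in ["tml", "ht", "fr"]:
    print(code, "->", to_brave_locale(code))
```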
