Localization with Crowdin

Atul Varma edited this page Aug 11, 2021 · 4 revisions

The codebase for the Tenant Platform is agnostic to any particular localization tools. As described in the README, almost all of it uses open standards like PO files and ICU MessageFormat. However, in practice we use Crowdin at JustFix.

The localization workflow

Our localization workflow currently has the following rough outline:

  1. If we're augmenting existing functionality that needs to be deployed and translated ASAP, we build in internationalization from the outset (that is, we use gettext in Python code and/or Lingui's various React macros/components in the frontend).

    On the other hand, if we are building new functionality whose strings aren't yet ready to be translated (perhaps because they have not yet been vetted by a content team), we don't build in internationalization from the outset. This is because whenever we merge anything into the main branch, any new strings immediately show up in Crowdin as ready for translation--but we don't want translators needlessly translating strings that might soon be replaced. For details on how best to internationalize that functionality later, see the section on internationalizing existing code below.

  2. Once strings ready for translation are internationalized and merged into the main branch, they will show up in Crowdin for translation.

    During this interim period, in which the main branch contains untranslated strings, try not to push the main branch to production--or if you do, then do so with the understanding that some strings will appear only in English, even if the user has selected a different language.

    Once the strings are translated, Crowdin will automatically create a PR containing the new translations. The PR will be called "New Crowdin updates". Squash-merge this PR into the main branch.

    In some situations, this PR will have merge conflicts with the main branch, but we can resolve them by simply replacing the main branch's translations with the ones in Crowdin's PR (in this sense we always assume Crowdin has the "right" translations). A Python script to automate this process is in #1724.

  3. Once the translations are merged into the main branch, you can push it to production.
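
The conflict resolution described in step 2 can be sketched in a few lines. This is not the script from #1724; it is a minimal illustration that assumes the conflict arose while merging Crowdin's branch into a checkout of the main branch (so Crowdin's side is the "theirs" half of each hunk) and that Git's default two-way conflict markers are in use. The function name `take_theirs` and the branch name `crowdin` in the example are hypothetical.

```python
def take_theirs(text: str) -> str:
    """Resolve Git conflict hunks by keeping only the incoming ("theirs") side.

    Assumes default two-way conflict markers (no diff3 "|||||||" base section).
    """
    resolved = []
    state = "normal"  # one of: "normal", "ours", "theirs"
    for line in text.splitlines(keepends=True):
        if state == "normal" and line.startswith("<<<<<<< "):
            state = "ours"  # start of the main branch's side: drop it
        elif state == "ours" and line.startswith("======="):
            state = "theirs"  # start of Crowdin's side: keep it
        elif state == "theirs" and line.startswith(">>>>>>> "):
            state = "normal"  # end of the conflict hunk
        elif state != "ours":
            resolved.append(line)
    return "".join(resolved)


# Hypothetical conflicted fragment of a PO file.
conflicted = (
    'msgid "Hello"\n'
    "<<<<<<< HEAD\n"
    'msgstr "Hola!"\n'
    "=======\n"
    'msgstr "Hola"\n'
    ">>>>>>> crowdin\n"
    'msgid "Bye"\n'
)
print(take_theirs(conflicted))
```

If the merge is done in the other direction (main merged into Crowdin's branch), Crowdin's translations are the "ours" side instead, and the kept and dropped halves would need to be swapped.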

Internationalizing existing code

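If you're internationalizing a large chunk of existing English-only code, you may want to garble the message catalogs to help verify that you've properly internationalized all the strings: once every cataloged message is garbled, any text that still appears in plain English is a string you missed.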
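
To make the idea concrete, here is a rough sketch of what garbling a single message could look like. It is not the project's actual garbling code; the function name `garble` and the choice of substituting umlauted vowels are assumptions made for illustration. The sketch leaves simple `{placeholder}` variables and `<0>`-style tags untouched so the message still renders, and it does not attempt to handle nested ICU plural/select syntax.

```python
import re

# Simple {placeholders} and <0>-style tags that must survive garbling intact.
_PROTECTED = re.compile(r"(\{[^{}]*\}|<[^<>]*>)")

# Swap plain vowels for umlauted ones so garbled text is easy to spot.
_GARBLE_TABLE = str.maketrans("aeiouAEIOU", "äëïöüÄËÏÖÜ")


def garble(message: str) -> str:
    """Return a visibly altered copy of `message`, preserving placeholders/tags."""
    parts = _PROTECTED.split(message)
    # Because the pattern has one capture group, re.split() places the
    # protected tokens at the odd indexes of `parts`.
    return "".join(
        part if index % 2 else part.translate(_GARBLE_TABLE)
        for index, part in enumerate(parts)
    )


print(garble("Hello {name}, <0>click here</0>"))
```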

Translating long front-end strings

Really long strings in the front-end are referenced by a short id, which helps us save space in our message catalogs. Here's an example:

<Trans id="evictionfree.postOfficeFaq">
  No, you can use this website to send a letter to your landlord via
  /* ... Lots more text ... */
</Trans>

One unfortunate downside of this, however, is that whenever we change the content, we also need to change the id, or else Crowdin won't notice that the new content needs to be translated. One easy way to do this is to add or increment a version number at the end of the id.
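
As an illustration of that versioning convention, the hypothetical helper below computes the next id for a message. Starting the numbering at 2 when an id has no version yet is an assumption of this sketch, not a documented project rule.

```python
import re


def bump_message_id(message_id: str) -> str:
    """Return `message_id` with its trailing version number added or incremented."""
    match = re.fullmatch(r"(.*?)(\d+)", message_id)
    if match is None:
        # No trailing number yet: treat the current content as version 1.
        return f"{message_id}2"
    base, version = match.groups()
    return f"{base}{int(version) + 1}"


print(bump_message_id("evictionfree.postOfficeFaq"))
print(bump_message_id("evictionfree.postOfficeFaq2"))
```

The new id then has to be used in the corresponding `<Trans id="...">` element so that Crowdin sees the message as a new string.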

Translating PO files outside of Crowdin

In some cases, it may be easier to provide translations by editing a PO file directly (or using another third-party tool) rather than via Crowdin. If you do this, though, you will also need to upload the PO file with your translations to Crowdin, because Crowdin doesn't read the existing translations in the repository.

Splitting message catalogs

We have a mechanism for splitting our message catalogs into multiple files, so that the translations for all the functionality across all the websites served by the Tenant Platform don't have to be sent to the user on every page load. It's roughly analogous to code splitting for JS, though more primitive. For more details on the implementation, see #1407. Also see #1800 for an example of a new split message catalog being added to the site.

Translations outside of PO files

There are a few places where we use strategies other than PO files/Crowdin to translate text.

For example, with NoRent's individual state-level KYR blurbs, we decided to use an Airtable to hold both the English and Spanish versions of each blurb. In some other places, we simply hard-coded the Spanish version of some markup in JSX and displayed it conditionally. And common content that appears across all our sites is retrieved from Contentful.

The reasons for doing this are varied and outside the scope of this wiki page--for now, just be aware that not all translations exist in Crowdin, and depending on the context, new content you add doesn't have to be translated in Crowdin either.

A good rule of thumb, though, is that if it's the text of a UI element like a button or form label, or if it's text that needs to change dynamically based on some data, it should probably be in a PO file that's translated in Crowdin--but otherwise, you're welcome to use whatever strategy feels the most appropriate.