pzerelles
Contributor

This is a prototype for built-in i18n support. Ideas were taken from #553.

Routing uses a locale prefix and the default locale can optionally have no prefix. Translations are compiled to JavaScript code and plurals currently use the banana format.
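
To make the prefix rule concrete, here is a minimal sketch of how a pathname could be mapped to a locale. It is not the code from this PR: the `I18nConfig` shape, the `prefixDefault` flag and the `resolveLocale` function are hypothetical names chosen for illustration.

```typescript
// Hypothetical config shape; these names are illustrative, not the PR's API.
interface I18nConfig {
  locales: string[];
  defaultLocale: string;
  prefixDefault: boolean; // true: the default locale also needs a URL prefix
}

interface ResolvedPath {
  locale: string;
  path: string; // pathname with the locale prefix stripped
}

function resolveLocale(pathname: string, config: I18nConfig): ResolvedPath | null {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const first = segments[0];

  // Case 1: the first segment is a known locale, so strip it.
  if (first !== undefined && config.locales.indexOf(first) !== -1) {
    return { locale: first, path: "/" + segments.slice(1).join("/") };
  }

  // Case 2: no prefix, which is allowed only when the default locale is unprefixed.
  if (!config.prefixDefault) {
    return { locale: config.defaultLocale, path: "/" + segments.join("/") };
  }

  // Case 3: a prefix is required but missing.
  return null;
}
```

With `prefixDefault: false`, both `/de/about` and `/about` resolve to the same page path `/about`, in German and in the default locale respectively.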

There is a simple demo in 'examples/svelte-kit-i18n-demo'.

The path I have taken to implement this is probably not the most elegant solution and I am looking forward to other possibilities from experienced contributors.

@benmccann
Member

My personal preference would be to update the existing demo rather than creating a new one.

Also, I haven't gotten a chance to take a look at this, but there was a lot mentioned in the original ticket. Did you implement all of it or just a part of it? If you implemented multiple parts of it, I think it would be best to do one PR for each part to make it easier to review.

@pzerelles
Contributor Author

pzerelles commented Apr 16, 2021

> My personal preference would be to update the existing demo rather than creating a new one

The i18n demo is based on the regular demo, so we could easily do that.

> Also, I haven't gotten a chance to take a look at this but there was a lot mentioned in the original ticket. Did you implement all of it or just a part of it? If you implemented multiple parts of it I think it would be best to do one PR for each part to make it easier to review

I implemented most of it, but the original ticket has discussions about different formats for translations. I only did a quick proof of concept and implemented a small part of the banana format that Rich mentioned. The translation part alone could also be implemented as an add-on, but the automatic prefixing of routes helps to remove boilerplate in projects.

Hmm, maybe the routing and translation part could be split into two PRs, but since routes can also be translated, they work in tandem.

I currently use a store for the current locale, the translations, and the localized links of the current page. I thought about just importing translations, but since one page can be displayed in several languages, they cannot be statically imported. Also, for translations to be accessible in other components, I don't know how to avoid a store.

When switching locales, I just use a link with rel="external" to force a reload.

The current locale is "injected" with a locale specific component right before the layout. The localized links of the current page are "injected" with a page specific component right before the page component itself.

  .replace('%svelte.head%', '" + head + "')
- .replace('%svelte.body%', '" + body + "')};
+ .replace('%svelte.body%', '" + body + "')
+ .replace('%svelte.lang%', '" + lang + "')};
Member

I don't know if we need to add special support for %svelte.lang% here. You can do this with hooks: #641 (comment)


@kobejean kobejean Apr 18, 2021

I've been using hooks to replace '%svelte.lang%' in my own svelte-lang solution. What I don't like about hooks is that they replace text in the response body: if you have a blog post about Svelte and i18n, or anything else that contains the text '%svelte.lang%', that text gets replaced with lang too. Doing it this way ensures that we only replace text in the template.
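
The difference can be shown with plain strings. This is only a sketch: the template and page content are made up, and it uses ordinary string replacement rather than SvelteKit's hooks or its internal template code.

```typescript
// Hypothetical template and page content, for illustration only.
const template = '<html lang="%svelte.lang%"><body>%svelte.body%</body></html>';
const body = "<p>Use %svelte.lang% in app.html to set the language.</p>";
const lang = "de";

// Approach A: render first, then replace every occurrence in the finished
// response body, which is all a response-level hook can see.
const rendered = template.replace("%svelte.body%", () => body);
const viaResponse = rendered.split("%svelte.lang%").join(lang);

// Approach B: fill the lang placeholder in the template itself, before the
// page content is inserted, so the content is never scanned.
const viaTemplate = template
  .replace("%svelte.lang%", () => lang)
  .replace("%svelte.body%", () => body);
```

In approach A the paragraph text becomes "Use de in app.html", while approach B leaves the literal '%svelte.lang%' in the article text intact.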

Contributor Author

Nearly everything can be done without integrating i18n directly in SvelteKit, but a full integration could mean a better developer experience.

@benmccann
Member

I wonder if we need built-in translation support? I think translations are already somewhat supported through libraries like https://github.com/cibernox/svelte-intl-precompile. I haven't done much i18n, but I imagine there could be different file formats used for it, and making it pluggable would make it easier to support several of them.

@cibernox do you have any thoughts on this PR since you'd been looking at i18n quite a bit recently?

@cibernox

cibernox commented Apr 17, 2021

Some context.

i18n is supported and easy to integrate with i18n libraries, the most popular one being svelte-i18n. There are other options that I haven't tested with sveltekit, but svelte-i18n does work.

However, all i18n libraries I could find, for svelte or for react/vue/you-name-it, are runtime libraries. That is, users author their translations in a syntax, the ICU message format being the de facto standard, and the library ships a significant amount of code to tokenize, parse and evaluate translations in that syntax in the browser.

This works fine on all frameworks, but for svelte it seemed off-brand. What I did with https://github.com/cibernox/svelte-intl-precompile is copy what svelte does for apps and apply it to translations: entries in the ICU syntax are analyzed and compiled at build time, so what the app receives is a dictionary of inline functions that can be invoked right away.
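
As a rough picture of that idea, here is a hand-written approximation of what such a dictionary could look like. It is not the actual output of svelte-intl-precompile; the message keys are invented and the plural rule is hard-coded for English.

```typescript
// Source messages in ICU syntax, as a translator would write them:
//   greeting: "Hello {name}!"
//   cart:     "{count, plural, one {# item} other {# items}}"

// Hand-written approximation of what a build step could emit for English.
// A real compiler would pick the plural rules for the target locale.
const messages = {
  greeting: (name: string): string => `Hello ${name}!`,
  cart: (count: number): string =>
    count === 1 ? `${count} item` : `${count} items`,
};

// At runtime there is nothing left to parse: each message is a function call.
const line = `${messages.greeting("Ada")} You have ${messages.cart(3)}.`;
```

No tokenizer or parser has to be shipped to the browser, because the ICU syntax no longer exists by the time the code runs.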

This makes apps much smaller because we don't have to ship a 50kb parser library, and much faster because we don't have to tokenize, parse, memoize (because parsing is expensive) and evaluate a DSL at runtime. It even consumes less memory because we don't memoize. It also enables tree shaking, because we detect at build time the features we don't use (e.g. formatting dates) and don't ship code for them.

So far it has worked great on my apps. To give some numbers, we're talking about saving 90% of the size on the bundle and being around 400% faster for so-far-unseen translations (when translations have to be parsed and memoized) and ~25% faster for already cached translations.

If svelte or sveltekit were ever to have built-in i18n support, this approach seems more aligned with the general philosophy of the framework. I can't imagine it any other way.

I'll be giving a lightning talk about this at svelte summit next week. I'm also happy to upstream this to svelte-kit. I also think there's room for even more optimizations if we had a closer framework integration instead of a one-man show.

@pzerelles
Contributor Author

pzerelles commented Apr 17, 2021

Your extension could easily replace the runtime translation part of this prototype.

For the routes, I do the translation directly in create_manifest_data. For links at runtime, I created an l() helper function that uses t() under the hood: it splits the URL into segments and translates those, just like in create_manifest_data. Translating the full URL might be a possibility too, but I found it easier in create_manifest_data if segments are translated separately. For the localized paths of a page, the same routing regex is used and only the static parts of the match are translated.
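
A minimal sketch of the segment-wise idea, assuming a flat per-locale dictionary of segment translations. The `routeTranslations` table and the exact signatures of `t()` and `l()` are assumptions for illustration; the helpers in the prototype read the locale from a store and may differ in other ways.

```typescript
// Hypothetical per-locale dictionary of route segments.
const routeTranslations: Record<string, Record<string, string>> = {
  de: { about: "ueber-uns", products: "produkte" },
};

// Translate one key, falling back to the key itself when there is no entry.
function t(locale: string, key: string): string {
  const table = routeTranslations[locale];
  const value = table === undefined ? undefined : table[key];
  return value === undefined ? key : value;
}

// Localize a link: split into segments, translate each one, add the prefix.
function l(locale: string, href: string): string {
  const translated = href
    .split("/")
    .map((segment) => (segment === "" ? segment : t(locale, segment)));
  return "/" + locale + translated.join("/");
}
```

Here `l("de", "/about/team")` gives `/de/ueber-uns/team`: the segment without a dictionary entry passes through unchanged.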

Another idea: why not create an extension point for processing routes? It would be just a function supplied in the config that takes a route's segments as an array and returns 1 to n translated arrays of segments. This way, different i18n libraries could be used for the route translations, and it would also be a minimally invasive solution.
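
Such an extension point could look roughly like the following. The `TranslateRoute` type name, the dictionary and the choice to emit the locale as the first segment are all assumptions made for this sketch, not an existing SvelteKit option.

```typescript
// Hypothetical hook signature: one route in, one or more localized routes out.
type TranslateRoute = (segments: string[]) => string[][];

// Hypothetical segment dictionary; any i18n library could supply this data.
const segmentDictionary: Record<string, Record<string, string>> = {
  en: {},
  de: { about: "ueber-uns" },
};

const translateRoute: TranslateRoute = (segments) =>
  Object.keys(segmentDictionary).map((locale) => {
    const dictionary = segmentDictionary[locale];
    const translated = segments.map((segment) => {
      // Leave dynamic parameters such as [slug] untouched.
      if (segment.charAt(0) === "[") return segment;
      const hit = dictionary === undefined ? undefined : dictionary[segment];
      return hit === undefined ? segment : hit;
    });
    // Emit the locale as the first segment of each localized route.
    return [locale].concat(translated);
  });
```

For the route `["about", "[slug]"]` this returns one array per locale, with only the static segment translated.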

@cibernox

Probably. The API of my library is identical to that of svelte-i18n, so it should just be a matter of changing the import paths and adding one more vite plugin, the one that transforms translations at build time using Babel.

@pzerelles
Contributor Author

pzerelles commented Apr 19, 2021

I tried to create an extension point for routes. It currently fails to import svelte-intl-precompile into svelte.config.cjs due to "export * ...". Is there a workaround for that?

Importing the translation files directly is the same problem, but that would be enough since route translations are simple strings without interpolations.

I know that your API is the same as svelte-i18n's, but if you compile the translations, wouldn't it be nicer to access them as object keys and functions, maybe with TypeScript signatures? Like {$t.welcome(foo)} instead of {$t("welcome", { name: foo })}.
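
To illustrate the two styles side by side, here is a small sketch outside of any Svelte store. Both `tByKey` and `tTyped` are hypothetical names; neither is the API of svelte-i18n or svelte-intl-precompile.

```typescript
// Style 1: lookup by string key. A typo in the key or in a value name is
// only noticed at runtime.
const templates: Record<string, string> = { welcome: "Welcome, {name}!" };

function tByKey(key: string, values: Record<string, string>): string {
  const template = templates[key];
  if (template === undefined) return key;
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value = values[name];
    return value === undefined ? match : value;
  });
}

// Style 2: one function per message. The compiler checks that the message
// exists and that the right arguments are passed.
const tTyped = {
  welcome: (name: string): string => `Welcome, ${name}!`,
};

const a = tByKey("welcome", { name: "Ada" });
const b = tTyped.welcome("Ada");
```

Both calls produce the same string, but a misspelled `tTyped.welcom(...)` would fail type checking, while `tByKey("welcom", ...)` would silently return the key.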

Some other thoughts about route translations. There are two cases where they need to be implemented, if routes are to be translated.

  1. For links: usually those will be in the same language and if only one language translation is loaded, that is enough. (can be done with i18n library)
  2. For language chooser: this is the problematic one, because we need to translate the current path to every available language. If we don't want to load all translations, I don't see how this can be done with external i18n libraries. Rich wrote about rel="canonical" but I also don't see yet how this can help to find the correct path in another language.
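
One way to picture the second case is a small table that holds only the localized paths per route, so the language chooser does not need the full translations of every locale. This is a sketch for static paths only; the `localizedPaths` table and `alternatesFor` function are hypothetical and are not how the prototype injects its localized links.

```typescript
// Hypothetical table a build step could emit: route id -> locale -> path.
const localizedPaths: Record<string, Record<string, string>> = {
  about: { en: "/about", de: "/de/ueber-uns" },
  blog: { en: "/blog", de: "/de/blog" },
};

// Find the route that owns the current path and return its path per locale.
function alternatesFor(currentPath: string): Record<string, string> | null {
  for (const routeId of Object.keys(localizedPaths)) {
    const byLocale = localizedPaths[routeId];
    if (byLocale === undefined) continue;
    for (const locale of Object.keys(byLocale)) {
      if (byLocale[locale] === currentPath) return byLocale;
    }
  }
  return null;
}
```

A language chooser on `/de/ueber-uns` could then link to `/about` for English without loading the English message catalog.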

@vercel

vercel bot commented Apr 19, 2021

@pz-mxu is attempting to deploy a commit to the Svelte Team on Vercel.

A member of the Team first needs to authorize it.

@floratmin

> However all i18n libraries I could find, for svelte or for react/vue/you-name-it are runtime libraries. That is, users author their translations in a syntax, the ICU message format being the defacto standard, and the library ships a significant amount of code to tokenize, parse and evaluate translations in that syntax on the browser.

At first I thought the same, but outside of JavaScript land the de facto standard is gettext. It is a bit older and quirky, but it has a superior workflow and a lot of tooling. When creating a multilingual site, the workflow is extremely important. With gettext you get the following:

  • A lifecycle of extracting the latest message strings, updating previous translation files with new message strings, removing unused message strings from these translation files, and creating a compendium file containing all previous translations of message strings and variations of these translations.
  • There are many open-source compendia that could be used to kickstart open source projects.
  • A lot of editors that the majority of professional translators already use and know well.
  • Web services to manage the translation workflow further and even provide automatic translations.
  • A wealth of extra information can be provided through a comment syntax to the translators.
  • Command-line tools for validating files, checking if all strings are translated, merging compendia...

I am currently working on a library that can help automate this lifecycle and smooth out some of the quirks. To extract the messages from your code there is the package gettext-extractor. We only need an adapter for .svelte files, which I have already written as a proof of concept.

With gettext-extractor we can extract gettext strings, or we could even use ICU message format strings in .po files (which would provide the same tooling for free). The pain point with ICU message format strings is that most translators would have to learn how to use the syntax properly. Another point is that there could already be gettext compendia from prior versions of a program, which would be partly unusable if you were forced to switch the syntax to ICU message format. Maybe we could even provide the option of mixing gettext and ICU message format strings in the same file?

To convert the .po files there is the package gettext-to-messageformat, which can convert .po files with gettext strings to ICU message format strings. If we use ICU message format strings in the .po files, extracting them into JSON files should be trivial. The messageformat package can use these JSON files to emit JavaScript code that can be embedded directly into the compiled svelte code. These functions import a few helper functions, but together those amount to less than one kilobyte. messageformat also provides tiny wrapper functions around the Intl namespace, which take a lot of the pain out of formatting dates or numbers in different languages.

To provide this we would need up to three functions: one or two for gettext and one for ICU message format. Then we could develop some conventions for providing extra information to translators through comments. These function calls could then be replaced by the actual string if there are no variables, or by a keyed object imported from a file containing all the translations and helper functions for one language. During development, the original function could simply emit the English version of the text.
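
A development-time version of those three functions might look like the sketch below. The names `gettext` and `ngettext` follow gettext convention, while `icu` is a hypothetical name; the ICU fallback here only substitutes simple `{name}` arguments and does not handle plural or select.

```typescript
// Development fallback: return the English source string unchanged.
function gettext(msgid: string): string {
  return msgid;
}

// Development fallback for plurals, using the English rule only.
function ngettext(msgid: string, msgidPlural: string, n: number): string {
  return n === 1 ? msgid : msgidPlural;
}

// Hypothetical ICU entry point. In development it only fills simple
// {name} arguments; a build step would replace the call with compiled code.
function icu(message: string, values: Record<string, string | number>): string {
  return message.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    const value = values[key];
    return value === undefined ? match : String(value);
  });
}

// Translators: shown on the dashboard after login (example of a comment
// convention an extractor could pick up).
const heading = icu("Welcome back, {name}!", { name: "Ada" });
```

An extractor would collect the string arguments of these calls into .po files, and the build could later swap each call for the translated string or a compiled function.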

@pzerelles
Contributor Author

Originally this was more about the things that cannot be done easily with external libraries. I think the discussion about which translation format should be used is already going in #553 and this PR is also too big currently. I will start a new PR that is just about the routing part for now.
