Add support for runtime translations #305

Closed
wants to merge 13 commits into from

Conversation

bamorim

@bamorim bamorim commented Apr 3, 2022

Closes #280.

Since I don't think it makes sense to use defoverridable if this is meant to be part of the core, I switched from using super to renaming the actual compile-time implementation functions to lgettext_compiled and lngettext_compiled. lgettext/lngettext then wrap calls to those functions in different ways, depending on whether a repo is defined.
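
To illustrate the wrapping described above, here is a minimal sketch, not the PR's actual code. The Sketch.* modules, the hard-coded translations, and the simplified 4-arity signatures (no bindings, no macro-generated clauses) are all assumptions made for illustration. It also assumes the repo is consulted first and the compiled clauses act as the fallback.

```elixir
defmodule Sketch.StaticRepo do
  # Hypothetical runtime repo: a hard-coded map standing in for ETS, a DB, etc.
  @translations %{{"fr", "default", nil, "Hello"} => "Bonjour"}

  def get_translation(locale, domain, msgctxt, msgid) do
    case Map.fetch(@translations, {locale, domain, msgctxt, msgid}) do
      {:ok, msgstr} -> {:ok, msgstr}
      :error -> :not_found
    end
  end
end

defmodule Sketch.Backend do
  # Assumption: the repo module is fixed at compile time.
  @repo Sketch.StaticRepo

  # Stand-ins for the clauses that would be generated from PO files.
  def lgettext_compiled("fr", "default", nil, "Goodbye"), do: {:ok, "Au revoir"}
  def lgettext_compiled(_locale, _domain, _msgctxt, msgid), do: {:default, msgid}

  # Public entry point: ask the runtime repo first, then fall back
  # to the compile-time implementation.
  def lgettext(locale, domain, msgctxt, msgid) do
    case @repo.get_translation(locale, domain, msgctxt, msgid) do
      {:ok, msgstr} -> {:ok, msgstr}
      :not_found -> lgettext_compiled(locale, domain, msgctxt, msgid)
    end
  end
end
```

When no repo is configured, the wrapper would simply delegate to lgettext_compiled directly.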

@josevalim
Contributor

Thank you! I will review the PR in more detail later. For now I just want to say that the ETS repo should not be part of Gettext. We will still need to define a repository for tests in the test helper, but that can likely be done with something simpler, or otherwise with ETS or an Agent.

@bamorim
Author

bamorim commented Apr 3, 2022

I removed the ETS repo from here. I agree it probably makes sense to not be included.

Thanks <3

Contributor

@whatyouhide whatyouhide left a comment

Looking like a great start. With this, I'm thinking we can probably create a Gettext.CompiledRepo and shove all the precompiled translations in there, right? So that the compiled static translations repo is just another way of getting translations.

@josevalim
Contributor

Looking like a great start. With this, I'm thinking we can probably create a Gettext.CompiledRepo and shove all the precompiled translations in there, right?

We had a discussion along this line, but the issue is that the compiled repo needs to do specific compile-time work at compilation time. So we would actually need to define a repo module per backend at compilation time, and I don't think that's worth it.

@bamorim, we should probably make the repo configuration be {repo, arg}, so we can do stuff like configuring the ETS table name. Or alternatively a {mod, fun, args}. Any preferences @whatyouhide?

@bamorim
Author

bamorim commented Apr 4, 2022

@whatyouhide as @josevalim mentioned, that would be a big change, so I don't think it is worth it right now.
Personally, I think generating a "CompiledRepo" "looks cleaner", but it would require a lot of changes. Also, falling back to the compile-time translations is important, so we would need to support multiple repos, which adds some complexity.

One way to move toward multiple repos plus a compile-time repo is to later add a :repos option, which by default would resolve to something like:

repos = case {opts[:repos], opts[:repo]} do
  {nil, nil} -> [CompileTimeRepo]
  {nil, repo} -> [CompileTimeRepo, repo]
  {repos, _} -> repos
end

That would give us time to think about whether this CompileTimeRepo is actually worth it and to introduce the idea in a backwards-compatible way.

@josevalim as for the repo receiving an argument, I was thinking about that when implementing the test. It might be good to have that, but would that mean we also need something like repo.init (as in Plug)?

The problem with {mod, fun, args} is that currently we have two different methods for plural vs non plural (and they have different arities because plural needs to pass the plural form), but this could be circumvented by having something like:

  @type translation_id() ::
          {:singular, locale(), domain(), msgctxt(), msgid()}
          | {:plural, locale(), domain(), msgctxt(), msgid(), plural_form()}

So that the repo is just a /2 function.

Taking inspiration from Plug, we could even make it so that repo: :get_translation is just a call to mybackend.get_translation(id, opts) or something like that.

@josevalim
Contributor

{mod, arg} with init sounds good to me then!
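
As a rough sketch of what the {mod, arg} configuration with init could look like when combined with the single translation_id argument proposed earlier in this thread: the Sketch.Repo behaviour, its callback names, and the map-backed implementation below are hypothetical and are not the API that ended up in this PR.

```elixir
defmodule Sketch.Repo do
  # Hypothetical behaviour, modelled on Plug's init/call split.
  @type translation_id ::
          {:singular, String.t(), String.t(), String.t() | nil, String.t()}
          | {:plural, String.t(), String.t(), String.t() | nil, String.t(),
             non_neg_integer()}

  @callback init(term()) :: term()
  @callback get_translation(translation_id(), term()) ::
              {:ok, String.t()} | :not_found
end

defmodule Sketch.MapRepo do
  @behaviour Sketch.Repo

  # init/1 turns the configured argument into whatever state lookups need.
  @impl true
  def init(opts), do: Map.new(Keyword.fetch!(opts, :translations))

  # One /2 function serves both singular and plural lookups.
  @impl true
  def get_translation(id, translations) do
    case Map.fetch(translations, id) do
      {:ok, msgstr} -> {:ok, msgstr}
      :error -> :not_found
    end
  end
end

# A backend configured with {mod, arg} would call init once and reuse the result.
{mod, arg} =
  {Sketch.MapRepo, translations: [{{:singular, "it", "default", nil, "Hello"}, "Ciao"}]}

state = mod.init(arg)
mod.get_translation({:singular, "it", "default", nil, "Hello"}, state)
```

Calling init at compile time, as Plug does, would mean the state above is computed once while the backend module is being compiled.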

@bamorim
Author

bamorim commented Apr 13, 2022

Just an update on that. Last weekend I couldn't find time to work more on that. Will try again this weekend.

@bamorim
Author

bamorim commented Apr 16, 2022

@josevalim I've made the suggested change. I was unsure whether to call init at compile time or at runtime, so I'll leave that up for discussion. For now I'm calling it at compile time, following how Plug normally works.
The downside is that there is now a compile-time dependency between the Gettext backend and the repo, but I think this is okay. It also opens up the possibility of, in the future, compiling the PO files in that init callback and maybe moving the default behavior into a repo itself.

@josevalim
Contributor

As long as the repository is passed at compilation time, then it is fine to call init at compile time.

@bamorim bamorim marked this pull request as ready for review April 18, 2022 13:00
@bamorim
Author

bamorim commented Apr 18, 2022

I think I'm done here. Is there anything missing? Is this something we would like to move forward with?

Also, thanks for all the help <3

@bamorim bamorim force-pushed the runtime-gettext branch 2 times, most recently from f784fe0 to fd9024b Compare April 25, 2022 17:13
@bamorim
Author

bamorim commented May 10, 2022

@josevalim @whatyouhide Hey, sorry to bother you.

Is there anything that you would like to see here that is missing?
Would you like to try a different approach? I could try something different if needed.

@josevalim
Contributor

Unfortunately I picked up a hand injury which makes my contribution time quite limited. So I won't be able to take this forward. Sorry :-(

@bamorim
Author

bamorim commented May 16, 2022

Hey, that is sad @josevalim. Wishing you a fast recovery. Whenever you'd like, just ping me here and I can get back to it. For now, recovering is more important. <3

@jc00ke

jc00ke commented Aug 13, 2022

I can see this feature being of great value to us soon, so if there's anything I can do to help out, please let me know. I hope your hand is healed up by now @josevalim! ⚕️ ✋

@bamorim bamorim requested review from whatyouhide and removed request for josevalim August 20, 2022 20:24
@whatyouhide
Contributor

@bamorim tests seem to be failing? 🤔

Comment on lines +320 to +340
case unquote(repo).get_translation(locale, domain, msgctxt, msgid, unquote(repo_opts)) do
{:ok, msgstr} ->
unquote(interpolation).runtime_interpolate(msgstr, bindings)
Contributor

I would actually leave it as the responsibility of the runtime backend to call interpolation, especially now that the interpolation module is public API. This will give more flexibility too.

Author

Should we also do that for the plural module?

Contributor

In that case no because I can’t think of them having different plural rules.

Author

But why would one change the interpolation module on a per-message basis? In the current way they can already replace on a per-translator basis.

If we require the repo to implement that, a change from one interpolator to another would now need to be made in two different places: in the translator (where use Gettext is called) and in the repo itself (or at least by passing it as a parameter in the repo configuration).

I think that would be more confusing, no?

Member

I would vote that the interpolation is done outside the implementation. The reason for this is that nothing prevents me from interpolating values inside the implementation as well. This way you can do whatever you want inside the implementation, and we will make sure that interpolation is handled if there are bindings remaining in the message.

As for the pluralization: it should normally always be the same for a given locale and is a bit complicated to get right. I would therefore:

  • Add optional @callback get_plural_forms(locale()) :: String.t()
  • Call Gettext.Plural.init/1 with the locale and plural_forms_header set to the result of the callback to get the Gettext.Plural.plural_info
  • In the translation function call Gettext.Plural.plural/2 with the plural_info
  • Pass the resulting plural form index to the adapter.
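
The four steps above could be sketched roughly as follows. To stay self-contained, this uses a simplified stand-in (Sketch.Plural) instead of the real Gettext.Plural module, and it does not actually parse Plural-Forms headers. The repo module, its get_plural_forms/1 callback, and the 6-arity get_plural_translation are hypothetical names for illustration only.

```elixir
defmodule Sketch.Plural do
  # Simplified stand-in for a plural module: it only recognises one
  # hard-coded single-form header and otherwise applies the "n != 1" rule.
  def init(%{locale: locale, plural_forms_header: header}), do: {locale, header}

  def plural({_locale, "nplurals=1; plural=0;"}, _n), do: 0
  def plural({_locale, _header}, 1), do: 0
  def plural({_locale, _header}, _n), do: 1
end

defmodule Sketch.PluralRepo do
  # Step 1: the optional callback returning a Plural-Forms header per locale.
  def get_plural_forms("ja"), do: "nplurals=1; plural=0;"
  def get_plural_forms(_locale), do: "nplurals=2; plural=(n != 1);"

  # The adapter only ever sees the already-computed plural form index.
  def get_plural_translation("en", "default", nil, "one apple", "%{count} apples", 0),
    do: {:ok, "one apple"}

  def get_plural_translation("en", "default", nil, "one apple", "%{count} apples", 1),
    do: {:ok, "%{count} apples"}

  def get_plural_translation(_locale, _domain, _msgctxt, _msgid, _msgid_plural, _form),
    do: :not_found
end

defmodule Sketch.Translator do
  def lngettext(repo, locale, domain, msgctxt, msgid, msgid_plural, n) do
    Code.ensure_loaded(repo)

    # Step 1: ask the repo for a header only if it implements the callback.
    header =
      if function_exported?(repo, :get_plural_forms, 1) do
        repo.get_plural_forms(locale)
      end

    # Steps 2 and 3: build the plural info, then compute the form index for n.
    plural_info = Sketch.Plural.init(%{locale: locale, plural_forms_header: header})
    plural_form = Sketch.Plural.plural(plural_info, n)

    # Step 4: pass the resulting index to the adapter.
    repo.get_plural_translation(locale, domain, msgctxt, msgid, msgid_plural, plural_form)
  end
end
```

The point of the split is that the repo never has to evaluate plural rules itself; it only maps an index to a stored msgstr.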

Author

As I see it, there are two "philosophies" here, and I guess the path we choose should be the same for both the plural forms and the interpolation cases:

Options

Leave most of the implementation to the repo

In that case, we should:

  • Just pass n to the repo and let it handle the plural_form part
  • Don't interpolate anything and leave that to the repo as well

Implement "sane defaults"

For the interpolation part, I guess I agree with @maennchen: they can interpolate on their side even if we interpolate again here.

For the plural part, I guess we could just have an optional callback as he suggested. However, we probably want to pass more info to it, something like: repo.plural_info(locale, %{domain: domain, plural_mod: plural_mod}). The reason for including domain is that the same locale might need different nplurals, as in the Chinese example given in #343 (comment)

This means we could do:

ensure_loaded!(repo)
ensure_loaded!(plural_mod)

plural_info = 
  cond do
    function_exists?(repo, :plural_info, 2) -> 
      quote do: unquote(repo).plural_info(var!(locale), %{domain: var!(domain), plural_mod: var!(plural_mod)})
    function_exists?(plural_mod, :init, 1) ->
      quote do: unquote(plural_mod).init(%{locale: var!(locale)})
    true ->
      quote do: var!(locale)
  end

  # ...
  plural_form = unquote(plural_mod).plural(unquote(plural_info), n)
  case unquote(repo).get_plural_translation(
               locale,
               domain,
               msgctxt,
               msgid,
               msgid_plural,
               plural_form,
               unquote(repo_opts)
             )

Some thoughts

Two scenarios I see for runtime translations are:

  • Automatically sync .po files using something like s3fs or Serge
  • Implement an in-app translation where translations could be done in an internal admin panel (probably storing all translations in an Ecto database)

For the first case, letting the implementation decide on the plural form should not be an issue, since such a solution already requires some knowledge of how .po files work (they need to be parsed, and the plural forms can change when re-syncing the files). But this is a slightly more advanced use case and can probably be solved by implementing plural_info.

For the second case, passing the plural_form would give an inexperienced implementer a direct mapping between the Gettext.Repo callback signature and an Ecto.Schema, so they don't have to think too much about how to convert n into plural_form and can easily implement something like:

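defmodule MyApp.GettextRepo do
  def get_plural_translation(locale, domain, msgctxt, msgid, msgid_plural, plural_form, _opts) do
    # Look up the row matching the full message key and the computed plural form.
    # Note: Repo.get_by raises on nil values, so a nil msgctxt needs separate handling.
    case MyApp.Repo.get_by(
           MyApp.Translation,
           locale: locale,
           domain: domain,
           msgctxt: msgctxt,
           msgid: msgid,
           msgid_plural: msgid_plural,
           plural_form: plural_form
         ) do
      nil -> :not_found
      %MyApp.Translation{msgstr: msgstr} -> {:ok, msgstr}
    end
  end
end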

Which is pretty simple and easy to implement without digging too deep into how Gettext works.

Author

@josevalim @whatyouhide sorry for the tag. Whenever you have time, could you give your thoughts on the matter? I can then make the changes depending on what you folks prefer.

(I'm tagging because it might not be clear that we would like your opinion and that this is not just a discussion between me and @maennchen. Sorry for the notification.)

Contributor

I'm really not sure here. I agree on the point that we want to handle interpolation anyways in Gettext, and implementers of a Gettext repo can do whatever they want, including interpolating. As for the locale, how does it relate to d55aeb0?

Author

So, to keep it simple: d55aeb0 didn't directly affect this PR, but it created a gap in feature parity between runtime and compile-time.

Before, the runtime version was computing the plural form based on the plural module and sending it to the repo, without giving the repo a chance to choose its own plural form. To give a clearer use case:

  • Imagine a runtime repo that returns translations based on gettext files that are synced from some object store like s3.
  • If the code was computing plural form based on the count and sending directly to the repo, the repo couldn't make the decision based on the Plural-Forms header (feature added in that PR for the compile time option)

To allow for that, I first thought about just sending count to the runtime repo and letting it handle the plural decision however it wishes. However, because plural form rules can be complicated, @maennchen suggested that we should compute plural forms using the configured plural module but allow for an optional callback to get plural information.

@luka-TU
Copy link

luka-TU commented Jan 18, 2023

@bamorim no need to apologize! Hope everything is better now. I just had a similar request and then found this cool PR :) Let me know if I can be of help.

@coveralls

Pull Request Test Coverage Report for Build cb5aabb893ffe35e0fa36ebcc00351fb2e1fd57d-PR-305

  • 4 of 4 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.07%) to 90.669%

Totals Coverage Status
Change from base Build e5ba0651805b3b777b0018ce276e950521dab18f: 0.07%
Covered Lines: 515
Relevant Lines: 568

💛 - Coveralls

@whatyouhide whatyouhide changed the title from "feat: runtime translations" to "Add support for runtime translations" Jan 21, 2023
@@ -0,0 +1,39 @@
defmodule Gettext.Repo do
Contributor

I'm not sure that the name "repo" makes sense here. Should we call this something like Gettext.TranslationFetcher? After all, the documentation says that this is a "behaviour for modules that can fetch Gettext translations".

Author

Hum, good point. I don't think Repo is completely bad, though, as fetching a translation means fetching from a place where the translations are "stored", a "translation repository". However, TranslationFetcher or MessageFetcher could better reveal the intention of retrieving a msgstr/translation.

Summing up, aesthetic-wise I think Repo is nicer, but maybe it is not clear enough. I'm fine with whatever you prefer.

Member

Maybe also put the fact that it is runtime fetching into the name. If, for example, we ever allow a compiled strategy that reads .mo files, we could come up with another compile-time strategy, which would surely have a different set of callbacks than the runtime one.

=> Gettext.RuntimeTranslationFetcher ?

Co-authored-by: Andrea Leopardi <an.leopardi@gmail.com>
@szsoppa
Copy link

szsoppa commented Mar 29, 2023

Hey guys, we built an open-source tool based on this feature (https://github.com/curiosum-dev/kanta). Can I help somehow to finish this PR? :)

@bamorim
Author

bamorim commented Apr 12, 2023

@szsoppa I think the pending discussion was around responsibilities as discussed here.

If it was up to me, I'd go with the sane defaults approach. If people think this is a good idea, I think it should take me an afternoon to implement that code.

@kipcole9
Contributor

kipcole9 commented Nov 2, 2023

I'm curious if there is still an intention to finish this up and merge?

@bamorim
Author

bamorim commented Nov 2, 2023

Hey @kipcole9, sorry, I took some time away from OSS contributions and public speaking because I was not in the best state of mind.
I'd love to be able to wrap it up. I need to get back to the pending discussions to understand what is missing.

@kipcole9
Contributor

kipcole9 commented Nov 3, 2023

No need to be sorry at all!

I think it's a valuable contribution but everyone contributing OSS has to balance a lot of priorities so I understand your challenge.

Thanks for making such a big effort already.

@vitalis

vitalis commented Mar 8, 2024

👍🏻

@vitalis

vitalis commented Jun 30, 2024

Dear @bamorim, hope you are better. It would be really amazing to add this functionality to the library.

@Gladear Gladear mentioned this pull request Aug 9, 2024
@peaceful-james

peaceful-james commented Sep 4, 2024

Is anyone working on this?

I am experiencing friction trying to use the latest :ex_cldr_routes and :kanta at the same time.

elixir-cldr/cldr_routes#19

https://github.com/curiosum-dev/kanta?tab=readme-ov-file#installation

@whatyouhide
Contributor

So, I think it's time to revisit this @josevalim and @maennchen.

Now that we have use Gettext, backend: ... and Gettext.Backend is fully documented, could runtime translations be implemented as a custom backend? There might be some edge smoothing involved but it should work. I’m not sure why someone would want to use Gettext if they're not storing translations in PO files---using something entirely different for fetching translations sounds more appropriate?

Thoughts?

@maennchen
Member

@whatyouhide Agreed. This PR has been lingering for way too long already.

I think we have to consider the scope we want this library to have in general. There are two factors that create the demand for custom backends:

Non Gettext Backends

We could scope this library to be a generic translation library with a standardized interface for developers. This or external libraries could then implement backends for other formats like XLIFF or even runtime translations.

Runtime Translations

Some libraries like Kanta want to inject translations at runtime so that they can offer a management interface.

I personally don't think that this library should support any backend. This is because this library was always focused on gettext itself and that is reflected in all the module / function names. It would be strange to call a function called gettext("message") when the actual implementation was anything else besides gettext. I would rather create a new generic translation library where gettext is one of the implementations.

I however think that the ability to change translations at runtime would be a good addition. With the changes done to improve compile time dependencies, I don't think that it is necessary to change anything in gettext. Kanta can just generate a new .po file and recompile the backend module at runtime.

@maennchen
Member

maennchen commented Sep 5, 2024

We could add some tools for runtime recompilation that would make it simpler to implement for libraries like kanta.

Something like this:

# When saving the files on disk
Gettext.recompile_backend_from_files(BackendName, "path/to/files")

# When dynamically reading translations from the DB
Gettext.recompile_backend_from_messages(BackendName, %{
  locale: %{
    domain: %Expo.Messages{...}
  }
})

@whatyouhide
Contributor

@maennchen recompile_backend_from_files, wouldn't that be just a normal recompilation when files are changed? Backends have @external_resource on the PO files IIRC. It would sort of be like livereload in Phoenix, is it something we need to support in this lib?

recompile_backend_from_messages is more risky as we have to expose a lot of API about how to structure the messages. Also, if someone is storing messages in a DB, again what's the point of using Gettext in the first place?

I would rather create a new generic translation library where gettext is one of the implementations.

This is exactly what I’m talking about yeah!

@maennchen
Member

maennchen commented Sep 5, 2024

@whatyouhide The idea with a translation manager would be that it can load the translations from the file into a DB and store them back into a file on change. We would then recompile.

There's also tools like lokalise that do this as a service. (with webhooks etc.)

As long as it's just your development environment, just writing into the file and letting phoenix live reload it is a good strategy. But a lot of products go live without having all languages fully translated. They then pay a translator to fill in the blanks.

I have used workflows like this before but have never really liked them. The reason is that we always got a mess with translations in code and in the repository diverging.

If I had to implement something like this myself again, I would go the DevOps route where changes in the translation tool cause a commit / PR and it will automatically redeploy afterwards.

@maennchen
Member

recompile_backend_from_files, wouldn't that be just a normal recompilation when files are changed?

True, not needed.

recompile_backend_from_messages is more risky as we have to expose a lot of API about how to structure the messages. Also, if someone is storing messages in a DB, again what's the point of using Gettext in the first place?

The reason I included this one was to avoid the work of serializing / deserializing. But that's probably not strictly needed.

@szsoppa

szsoppa commented Sep 11, 2024

recompile_backend_from_messages is more risky as we have to expose a lot of API about how to structure the messages. Also, if someone is storing messages in a DB, again what's the point of using Gettext in the first place?

The main reason we chose to build kanta on top of gettext is that gettext is used in nearly all Elixir projects that intend to translate their application, either now or in the future. I agree that the best-case scenario would be to provide new helpers and remove the dependency on gettext, but in reality that would mean a huge number of changes in existing projects and the same number of changes when rolling back to gettext (if someone were no longer interested in kanta or another tool).

@peaceful-james

I do not mean to be pushy but is there anyone willing to actually claim this work? I do not understand much of the discussion but it sounds like some people do not want this to be merged? If nobody claims the work then I will do my best to pick it up. Disclaimer: it will probably take me until 2025 to understand the basics of what is happening.

@maennchen
Member

@szsoppa Have you heard of igniter? If a generic translation library were written that could support multiple backends, such as gettext or any other, Igniter could be used to rewrite all Gettext calls to the new ones. Specifically, Igniter.Refactors.Rename.rename_function/4 should do the trick and would allow you to define a simple migration path.


@peaceful-james The current sentiment of the project is that gettext is not trying to support any backend other than gettext files. It is not trying to be a generic, backend-independent translation library.

Based on this, Kanta and any other tool that wants to change translations at runtime has two choices:

  • The tool can write gettext .po files onto disk and recompile the module at runtime using gettext.
  • The tool can define / use a generic translation library instead of gettext that is designed to handle different backends and runtime reloading.

We've made the effort to extract expo from gettext, which is the library that actually deals with parsing gettext files. This can also be used to aid efforts to import / export translations in a gettext format.


Given that there has been a lengthy discussion on this pull request without recent activity, I'd like to clarify the situation. This PR has been open for a very long time without a reaction, and we appreciate the author's work and patience. Since there's a tool called Kanta that depends on a fork based on this PR, it's important to make a decision. @whatyouhide, could you please confirm if our decision regarding this PR is final? If so, I'd like to close it.

@whatyouhide
Contributor

Yes, @maennchen put it in a fantastic way. Thank you all for the in-depth discussion 💟

Successfully merging this pull request may close these issues.

Database-backed Gettext