Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: advices to use full-icu package is outdated #36057

Closed
1 task done
Mumeii opened this issue Nov 9, 2020 · 17 comments
Closed
1 task done

doc: advices to use full-icu package is outdated #36057

Mumeii opened this issue Nov 9, 2020 · 17 comments
Labels
doc Issues and PRs related to the documentations. i18n-api Issues and PRs related to the i18n implementation.

Comments

@Mumeii
Copy link

Mumeii commented Nov 9, 2020

📗 API Reference Docs Problem

  • Version: 13.x.x and above. In fact, since node is delivered with full-icu by default
  • Platform: all
  • Subsystem: Duno, i18n maybe ?

Location

Affected URL(s):

Description

Since the first version delivered with full-icu by default, Node 13.x.x, the documentation keep advising to use full-icu package if we want to keep up to date with the latest i18n data.

But unfortunately, this package only fetch ICU data files for node versions built with small-icu support ...

See the corresponding piece of code here

Thus, it is miss leading the reader to be advised to use a tool which is in mutual exclusion with the system that is released by default.

Obviously, not all users of Node 13 and above will go for the default full-icu one, especially if they have memory footprint constraints.

But I guess that with the whole [ micro services / containerization / cloud ] hype those days, most peoples don't care much to finely tune there images and go for the official ones on Docker hub ... which themselves comes in a single flavor each, that is, the full one since Node 13 🍭


  • I would like to work on this issue and
    submit a pull request.
@Mumeii Mumeii added the doc Issues and PRs related to the documentations. label Nov 9, 2020
@mhdawson
Copy link
Member

mhdawson commented Nov 9, 2020

@Mumeii fixing that up sounds good. Any chance you want to submit a pull request ?

@Mumeii
Copy link
Author

Mumeii commented Nov 10, 2020

@mhdawson : Hi !

I wouldn't mind submitting a PR if I wasn't still puzzled :

In fact, I can't tell :

  • wether it's a documentation or a full-icu package trouble.
  • why functionally speaking this tool don't want to load ICU data files for Node releases compiled with full-icu

From our development team point of view, it seems it would make sense if it was also working on full-icu release.

Picture the trouble we're facing right now due to that restriction :

  • Until recently, we've wrongly defined our micro services images over unstable tags of the official node image on Docker hub, such as lts-slim or lts-alpine
  • It bitten us when those lts tags recently moved from Node 12 to 14
  • So now we want to use complete version references in the tags we're using for our image. So far so good !
  • But from what we see, we can't upgrade to a fixed node:14.x.y-alpine image version without loosing the always up to date ICU data effect because of the issue we're reporting here
  • So we're staying on a 12.x.x version and asking around what is the best thing to do 😋

We've also thought about another work around, that would necessitate to :

  • refer to a less strict node version for our images, node:14-alpine for exemple
  • stop using full-icu package.
  • hope that the release frequency of [ ICU data / Node / our services ] would be high enough to avoid long outdated i18n data gaps

But :

  • we couldn't find a documentation explaining which is the policy of the official Docker image delivery : is it always delivered in the default icu flavor associate to the version ? Or could it change within a given major version ?
  • we're not certain that each node:14* release will come along with the most up to date ICU data : Which is the icu data set selection process before each full or small releases ?
  • we still would like to take the opportunity of each of our micro services delivery to fetch the most up to date ICU data version, and thus avoid potential "outdated i18n data"

Another "work around" would be to be able to base our CI on some kind of node:14-alpine-smallIcu tag, in order to have the full-icu package doing its job.

But can't tell if that solution is viable from a docker image management point of view, and if more than an handful of projects would benefits having such images available anyway.

Do you have stats 📉 📈 somewhere that would help telling which is the usage ration between both flavor ?


I know the whole thing could seems to be a bit "too much", but we're awaiting a CLDR fix for a Latvian date formatting trouble, and we'd like to rely on our CI deliveries to insure of the fix delivery to our customers rather than human action

@richardlau
Copy link
Member

cc @nodejs/i18n-api

@srl295
Copy link
Member

srl295 commented Nov 10, 2020

full-icu is still helpful IF you manually build with small icu. I revamped the main READMEs (such as BUILDING.md) and embedded docs when full-by-default landed in #29522 , but obviously these docs you cited did not follow suit.

@srl295
Copy link
Member

srl295 commented Nov 10, 2020

the documentation keep advising to use full-icu package if we want to keep up to date with the latest i18n data.

Not exactly. It should be advising you to use full-icu if you want to have i18n data, if your node build didn't originally have i18n data (in the small icu case). It should not be understood that full-icu is some kind of update mechanism to produce data that's any more up to date than any other method. If it says that, it's always been wrong and should be changed.

But unfortunately, this package only fetch ICU data files for node versions built with small-icu support ...

it could fetch data files for all versions.

@srl295
Copy link
Member

srl295 commented Nov 10, 2020

why functionally speaking this tool don't want to load ICU data files for Node releases compiled with full-icu

because if it is compiled with full-icu there's nothing else to load. As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism. Updating the data is a more complex problem which is not addressed at all by the current tooling, please discuss it with the ICU TC ( icu-project.org ). The best way to get the latest ICU data is to upgrade to the latest ICU.

@mhdawson
Copy link
Member

@srl295 thanks for jumping in with the clarifications.

@srl295
Copy link
Member

srl295 commented Nov 10, 2020

@Mumeii soorry I did not analyze your whole situation as much before.

But from what we see, we can't upgrade to a fixed node:14.x.y-alpine image version without loosing the always up to date ICU data effect because of the issue we're reporting here
So we're staying on a 12.x.x version and asking around what is the best thing to do 😋

To be clear, This is a misunderstanding. You are not up to date by staying on 12. Quite the opposite. So upgrade to 14 and dont worry ablut full icu anymore.

@Mumeii
Copy link
Author

Mumeii commented Nov 10, 2020

@srl295 : Hi !

@Mumeii sorry I did not analyze your whole situation as much before.

No troubles. Thanks for your feedbacks and the clarifications 🙏

If I understand you the right way, it mean that it's the documentation that must be updated.

Thus, can you help me one last time, in order for me to make a sound documentation PR ?

Am I wrong saying that :


As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism


Mean that :

  1. Before Node 14 became the lts version, a CI relying on both a node:lts-alpine and full-icu package usage was simply constantly downloading the same ICU data set over and over again at each builds, no matter which newer ICU versions could be available.

  2. It was the only possible option in order to have both a full ICU data set and a CI based on an official node Docker image.
    At least as long as lts-alpine tag was pointing to a, small-icu based, 12.x.y version

  3. Due to the complexity of the ICU data integration, it's not possible to design a one size fit for all import mechanism. So the job is done once for each Node release, and the only compatible ICU version is the one referenced by process.versions.icu at runtime


So upgrade to 14 and dont worry about full icu anymore


Mean that :

  1. The constraint I refer to in 3. is related to the a whole Node major version, not a single major.minor.patch version.
    Which mean that the binding to the latest version of ICU data available is "casted into stone" during the build of a Node JS
    major version release candidate ?

The best way to get the latest ICU data is to upgrade to the latest ICU


Mean that :

  1. the only way to achieve what we think we were doing on our CI would be to fork the Node JS code base and to manage ourself the binding with ICU data after each of their version release ?

Once again :

  • thank you for your reply, your help is greatly appreciated.
  • sorry for asking question that may seems obvious to you, but if I want to propose a PR on the documentation, I'd rather be certain to fully understand the problematic 😝

@srl295
Copy link
Member

srl295 commented Nov 10, 2020

@srl295 : Hi !

@Mumeii sorry I did not analyze your whole situation as much before.

No troubles. Thanks for your feedbacks and the clarifications 🙏

👍

If I understand you the right way, it mean that it's the documentation that must be updated.

Yes.

Thus, can you help me one last time, in order for me to make a sound documentation PR ?

Am I wrong saying that :

As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism

Mean that :

  1. Before Node 14 became the lts version, a CI relying on both a node:lts-alpine and full-icu package usage was simply constantly downloading the same ICU data set over and over again at each builds, no matter which newer ICU versions could be available.

Correct.

In fact, the way full-icu was currently implemented, if you have ICU 63.1 it would not properly fetch ICU 63.2 data (see closed issue nodejs/full-icu-npm#35 because that specific issue was moot).

  1. It was the only possible option in order to have both a full ICU data set and a CI based on an official node Docker image.
    At least as long as lts-alpine tag was pointing to a, small-icu based, 12.x.y version

This is not correct, full-icu is definitely not the only possible option. The docs you point to give a number of other options. full-icu was intended to be a convenience for users, with a short list of steps, rather than having to " go find file X to download", "install file X here", etc. It worked in a lot of cases, but not all of them.

So there are other ways to do what full-icu does, but it was intended to be the simplest one.

  1. Due to the complexity of the ICU data integration, it's not possible to design a one size fit for all import mechanism. So the job is done once for each Node release, and the only compatible ICU version is the one referenced by process.versions.icu at runtime

Correct. There is no compatibility at build time for any other major version.

So upgrade to 14 and dont worry about full icu anymore

Mean that :

  1. The constraint I refer to in 3. is related to the a whole Node major version, not a single major.minor.patch version.
    Which mean that the binding to the latest version of ICU data available is "casted into stone" during the build of a Node JS
    major version release candidate ?

ICU almost never releases even minor version changes of the data. (There is a whole separate process around updating time zone data but I will exclude that for now.) I can not remember a case where. would recommend a data update separate from the code update. So I don't think the scenario 'get the latest official ICU data' really happens in practice, besides upgrading from, say, ICU 67 to ICU 68.

The best way to get the latest ICU data is to upgrade to the latest ICU

Mean that :

  1. the only way to achieve what we think we were doing on our CI would be to fork the Node JS code base and to manage ourself the binding with ICU data after each of their version release ?

No. It means:

  1. use a newer Node version (a major or minor update) which itself includes an ICU update, or,
  2. use a compile-time option in Node to choose a later ICU., such as configure … --with-icu-source=<URL to later ICU>

Once again :

  • thank you for your reply, your help is greatly appreciated.
  • sorry for asking question that may seems obvious to you, but if I want to propose a PR on the documentation, I'd rather be certain to fully understand the problematic 😝

You are very welcome. As of mid- year this is suddenly not part of my paid job at all, so it would be great to have others help with updating the documentation!

@Mumeii
Copy link
Author

Mumeii commented Nov 11, 2020

@srl295 :

use a newer Node version (a major or minor update) which itself includes an ICU update

Ok, so from a Node M.m.p to the next M.m+1.0 release, the version of process.versions.icu can vary ?

In that case I guess the simplest way having our project staying up to date with the i18n data would be to move from a, let's say, node:14.x.y-alpine to a node:14-alpine

Good to know.


About the 2., it was a cumbersome way for me to mention that :

  • Before Node 14 been lts, if one wanted a lts full-icu node docker image without using the full-icu package, he had to deal with a node compilation himself, and thus the Dockerfile writing
  • After Node 14 been lts, it's the other way around : one will have to deal with a Node JS compilation & Dockerfile writing if he wants a small-icu version

It always feel like a drawback to me when I can't get this lts seal of quality straight out of Docker hub; or said another way, I always feel like I'm going to mess up something important when having to perform those last building steps myself.

That's why I feel stuck between not been eager to deal with a compilation, and still wanting to be able to choose the level of i18n data pre-package integration.

Having those extra versions built and released as Docker hub official image would be great (at least for me 👅), but I guess that it would be hell of a mess to build, to tag, and to manage on the long run.

@sgallagher
Copy link
Contributor

@Mumeii What we do in Fedora, we always build with small-icu, but we also pass --with-icu-default-data-dir=<...> to the configure. The result is that we compile with the capability to carry the full ICU package (which we provide as a subpackage called nodejs-full-i18n) by just dropping the other language data into a well-known path.

IMHO, the ideal case would be for the official Node container images to follow this same pattern; build with small-icu and provide a simple mechanism (such as a bash script) to the data directory.

@srl295
Copy link
Member

srl295 commented Jan 4, 2021

@sgallagher (hi!) yes, either use the full icu in the image, or do as you said.

@targos targos added the i18n-api Issues and PRs related to the i18n implementation. label Aug 9, 2021
@RedYetiDev
Copy link
Member

The version specified in this issue has reached EoL. If this is no longer relevent, feel free to self-close.

@mhdawson
Copy link
Member

@RedYetiDev was this changed in later versions of the doc as it says version 13 and above. If you checked an it's fixed in recent versions then I agree it can be closed, otherwise it would still apply.

@RedYetiDev
Copy link
Member

It looks fixed to me, but given the conversation around it, I wanted confirmation before closing the issue

@srl295
Copy link
Member

srl295 commented Jul 12, 2024

If current docs are fine then close it. People DO still build and use small-icu with custom builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations. i18n-api Issues and PRs related to the i18n implementation.
Projects
None yet
Development

No branches or pull requests

7 participants