From 24cd1c5dcfb6ef3cda7302c196e8705249e2de72 Mon Sep 17 00:00:00 2001 From: "Steven R. Loomis" Date: Fri, 22 Nov 2019 15:10:01 -0800 Subject: [PATCH 1/2] doc,deps: document how to maintain ICU in Node.js - update v8 guide to mention ICU - move content from the tools/icu/README.md but leave a pointer Fixes: https://github.com/nodejs/node/issues/26108 Co-Authored-By: Vse Mozhet Byt PR-URL: https://github.com/nodejs/node/pull/30607 --- doc/guides/maintaining-V8.md | 4 + doc/guides/maintaining-icu.md | 233 ++++++++++++++++++++++++++++++++++ tools/icu/README.md | 133 +++---------------- 3 files changed, 255 insertions(+), 115 deletions(-) create mode 100644 doc/guides/maintaining-icu.md diff --git a/doc/guides/maintaining-V8.md b/doc/guides/maintaining-V8.md index d33b745ac56b8e..733df451b2805f 100644 --- a/doc/guides/maintaining-V8.md +++ b/doc/guides/maintaining-V8.md @@ -313,6 +313,10 @@ Node.js keeps a vendored copy of V8 inside of the deps/ directory. In addition, Node.js may need to float patches that do not exist upstream. This means that some care may need to be taken to update the vendored copy of V8. +V8 builds against the version of ICU supplied by Node.js, +see [maintaining-icu.md](./maintaining-icu.md) for special considerations. +Specifically, a V8 update may necessitate an ICU update. + ### Minor updates (patch level) Because there may be floating patches on the version of V8 in Node.js, it is diff --git a/doc/guides/maintaining-icu.md b/doc/guides/maintaining-icu.md new file mode 100644 index 00000000000000..9aa7257498e505 --- /dev/null +++ b/doc/guides/maintaining-icu.md @@ -0,0 +1,233 @@ +# Maintaining ICU in Node.js + +## Background + +International Components for Unicode ([ICU4C][ICU]) is used both by V8 +and also by Node.js directly to provide internationalization +functionality. To quote from icu-project.org: + +> ICU is a mature, widely used set of C/C++ and Java libraries providing +> Unicode and Globalization support for software applications. ICU is +> widely portable and gives applications the same results on all platforms +> and between C/C++ and Java software. + +## Data dependencies + +ICU consumes and includes: + +* Extracted locale data from [CLDR][] +* Extracted [Unicode][] data. +* Time zone ([tz][]) data + +The current versions of these items can be viewed for node with `node -p process.versions`: + +```shell +$ node -p process.versions + +{ + … + cldr: '35.1', + icu: '64.2', + tz: '2019a', + unicode: '12.1' +} +``` + +## Release Schedule + +ICU typically has >1 release a year, particularly coinciding with a major +release of [Unicode][]. The current release schedule is available on the [ICU][] +website on the left sidebar. + +### V8 depends on ICU + +V8 will aggressively upgrade to a new ICU version, due to requirements for +features/bugfixes needed for [Ecma402][] support. The minimum required version +of ICU is specified within the V8 source tree. If the ICU version is too old, +V8 will not compile. + +```c +// deps/v8/src/objects/intl-objects.h +#define V8_MINIMUM_ICU_VERSION 64 +``` + +V8 in Node.js depends on the ICU version supplied by Node.js. + +The file `tools/icu/icu_versions.json` contains the current minimum +version of ICU that Node.js is known to work with. This should be +_at least_ the same version as V8, so that users will find out +earlier that their ICU is too old. A test case validates this when +Node.js is built. + +## How to upgrade ICU + +* Make sure your Node.js workspace is clean (`git status` +should be sufficient). +* Configure Node.js with the specific [ICU version](http://icu-project.org/download) + you want to upgrade to, for example: + +```shell +./configure \ + --with-intl=full-icu \ + --with-icu-source=http://download.icu-project.org/files/icu4c/58.1/icu4c-58_1-src.tgz +make +``` + +> _Note_ in theory, the equivalent `vcbuild.bat` commands should work also, +> but the commands below are makefile-centric. + +* If there are ICU version-specific changes needed, you may need to make changes + in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`. + * Specifically, look for the lists in `sources!` in the `tools/icu/icu-generic.gyp` for + files to exclude. + +* Verify the Node.js build works: + +```shell +make test-ci +``` + +Also running + + + +```js +new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); +``` + +…Should return `enero` not `January`. + +* Now, copy `deps/icu` over to `deps/icu-small` + +> :warning: Do not modify any source code in `deps/icu-small` ! +> See section below about floating patches to ICU. + +```shell +python tools/icu/shrink-icu-src.py +``` + +* Now, do a clean rebuild of Node.js to test: + +```shell +make -k distclean +./configure +make +``` + +* Test this newly default-generated Node.js + + + +```js +process.versions.icu; +new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); +``` + +(This should print your updated ICU version number, and also `January` again.) + +You are ready to check in the updated `deps/icu-small`. This is a big commit, +so make this a separate commit from the smaller changes. + +> :warning: Do not modify any source code in `deps/icu-small` ! +> See section below about floating patches to ICU. + +* Now, rebuild the Node.js license. + +```shell +# clean up - remove deps/icu +make clean +tools/license-builder.sh +``` + +* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`. +It should match the ICU URL used in the first step. When this is done, the +following should build with small ICU. + +```shell +# clean up +rm -rf out deps/icu deps/icu4c* +./configure --with-intl=small-icu --download=all +make +make test-ci +``` + +* commit the change to `tools/icu/current_ver.dep` and `LICENSE` files. + + * Note: To simplify review, I often will “pre-land” this patch, meaning that + I run the patch through `curl -L https://github.com/nodejs/node/pull/xxx.patch + | git am -3 --whitespace=fix` per the collaborator’s guide… and then push that + patched branch into my PR's branch. This reduces the whitespace changes that + show up in the PR, since the final land will eliminate those anyway. + +## Floating patches to ICU + +Floating patches are applied at `configure` time. The "patch" files +are used instead of the original source files. The patch files are +complete `.cpp` files replacing the original contents. + +Patches are tied to a specific ICU version. They won’t apply to a +future ICU version. We assume that you filed a bug against [ICU][] and +upstreamed the fix, so the patch won’t be needed in a later ICU +version. + +### Example + +For example, to patch `source/tools/toolutil/pkg_genc.cpp` for +ICU version 63: + +```shell +# go to your Node.js source directory +cd + +# create the floating patch directory +mkdir -p tools/icu/patches/63 + +# create the subdirectory for the file(s) to patch: +mkdir -p tools/icu/patches/63/source/tools/toolutil/ + +# copy the file to patch +cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \ +tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp + +# Make any changes to this file: +(edit tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp ) + +# test +make clean && ./configure && make +``` + +You should see a message such as: + +```shell +INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu" +``` + +### Clean Up + +Any patches older than the minimum version given in `tools/icu/icu_versions.json` +ought to be deleted, because they will never be used. + +### Why not just modify the ICU source directly? + +Especially given the V8 dependencies above, there may be times when a floating +patch to ICU is required. Though it seems expedient to simply change a file in +`deps/icu-small`, this is not the right approach for the following reasons: + +1. **Repeatability.** Given the complexity of merging in a new ICU version, +following the steps above in the prior section of this document ought to be +repeatable without concern for overriding a patch. + +2. **Verifiability.** Given the number of files modified in an ICU PR, +a floating patch could easily be missed — or dropped altogether next time +something is landed. + +3. **Compatibility.** There are a number of ways that ICU can be loaded into +Node.js (see the top level README.md). Only modifying `icu-small` would cause +the patch not to be landed in case the user specifies the ICU source code +another way. + +[ICU]: http://icu-project.org +[Unicode]: https://unicode.org +[tz]: https://www.iana.org/time-zones +[CLDR]: https://unicode.org/cldr +[Ecma402]: https://github.com/tc39/ecma402 diff --git a/tools/icu/README.md b/tools/icu/README.md index 89a70e7087ff23..1af39941b55431 100644 --- a/tools/icu/README.md +++ b/tools/icu/README.md @@ -1,8 +1,9 @@ # Notes about the `tools/icu` subdirectory -This directory contains tools, data, and information about the [ICU](http://icu-project.org) -(International Components for Unicode) integration. ICU is used to provide -internationalization functionality. +This directory contains tools, data, and information about the +International Components for Unicode integration. [ICU][] is used +both by V8 and also by +Node.js itself to provide internationalization functionality. * `patches/` are one-off patches, actually entire source file replacements, organized by ICU version number. @@ -18,119 +19,21 @@ internationalization functionality. * `README.md` — you are here * `shrink-icu-src.py` — this is used during upgrade (see guide below) -## How to upgrade ICU +Note: +> The files in this directory were written for the Node.js v0.12 effort. +> The original intent was to merge the tools such as `icutrim.py` and `iculslocs.cc` +> back into ICU. ICU has gained its own “data slicer” tool. +> There is an issue open, https://github.com/nodejs/node/issues/25136 +> for replacing `icutrim.py` with the [ICU data slicer][]. -* Make sure your Node.js workspace is clean (clean `git status`) should be - sufficient. -* Configure Node.js with the specific [ICU version](http://icu-project.org/download) - you want to upgrade to, for example: +## See Also -```shell -./configure \ - --with-intl=full-icu \ - --with-icu-source=http://download.icu-project.org/files/icu4c/58.1/icu4c-58_1-src.tgz -make -``` +* [docs/guides/maintaining-icu.md](../../doc/guides/maintaining-icu.md) for +information on maintaining ICU in Node.js -> _Note_ in theory, the equivalent `vcbuild.bat` commands should work also, -> but the commands below are makefile-centric. +* [docs/api/intl.md](../../doc/api/intl.md) for information on the +internationalization-related APIs in Node.js +* [The ICU Homepage][ICU] -* If there are ICU version-specific changes needed, you may need to make changes - in `icu-generic.gyp` or add patch files to `tools/icu/patches`. - * Specifically, look for the lists in `sources!` in the `icu-generic.gyp` for - files to exclude. - -* Verify the Node.js build works: - -```shell -make test-ci -``` - -Also running - - - -```js -new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); -``` - -…Should return `enero` not `January`. - -* Now, copy `deps/icu` over to `deps/icu-small` - -```shell -python tools/icu/shrink-icu-src.py -``` - -* Now, do a clean rebuild of Node.js to test: - -```shell -make -k distclean -./configure -make -``` - -* Test this newly default-generated Node.js - - - -```js -process.versions.icu; -new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); -``` - -(This should print your updated ICU version number, and also `January` again.) - -You are ready to check in the updated `deps/small-icu`. This is a big commit, -so make this a separate commit from the smaller changes. - -* Now, rebuild the Node.js license. - -```shell -# clean up - remove deps/icu -make clean -tools/license-builder.sh -``` - -* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`. -It should match the ICU URL used in the first step. When this is done, the -following should build with small ICU. - -```shell -# clean up -rm -rf out deps/icu deps/icu4c* -./configure --with-intl=small-icu --download=all -make -make test-ci -``` - -* commit the change to `tools/icu/current_ver.dep` and `LICENSE` files. - - * Note: To simplify review, I often will “pre-land” this patch, meaning that - I run the patch through `curl -L https://github.com/nodejs/node/pull/xxx.patch - | git am -3 --whitespace=fix` per the collaborator’s guide… and then push that - patched branch into my PR's branch. This reduces the whitespace changes that - show up in the PR, since the final land will eliminate those anyway. - ------ - -## Postscript about the tools - -The files in this directory were written for the Node.js effort. -It was the intent of their author (Steven R. Loomis / srl295) to -merge them upstream into ICU, pending much discussion within the -ICU-TC. - -`icu_small.json` is somewhat node-specific as it specifies a "small ICU" -configuration file for the `icutrim.py` script. `icutrim.py` and -`iculslocs.cpp` may themselves be superseded by components built into -ICU in the future. As of this writing, however, the tools are separate -entities within Node.js, although theyare being scrutinized by interested -members of the ICU-TC. The “upstream” ICU bugs are given below. - -* [#10919](http://bugs.icu-project.org/trac/ticket/10919) - (experimental branch - may copy any source patches here) -* [#10922](http://bugs.icu-project.org/trac/ticket/10922) - (data packaging improvements) -* [#10923](http://bugs.icu-project.org/trac/ticket/10923) - (rewrite data building in python) +[ICU data slicer]: https://github.com/unicode-org/icu/blob/master/docs/userguide/icu_data/buildtool.md +[ICU]: http://icu-project.org From 5e50f4a595ff955dee50260f2879ac023f41ad18 Mon Sep 17 00:00:00 2001 From: Myles Borins Date: Wed, 27 Nov 2019 01:19:10 -0500 Subject: [PATCH 2/2] fixup: fix lint --- doc/guides/maintaining-icu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/maintaining-icu.md b/doc/guides/maintaining-icu.md index 9aa7257498e505..4add40b7fa2046 100644 --- a/doc/guides/maintaining-icu.md +++ b/doc/guides/maintaining-icu.md @@ -92,7 +92,7 @@ Also running ```js -new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); +new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8)); ``` …Should return `enero` not `January`. @@ -120,7 +120,7 @@ make ```js process.versions.icu; -new Intl.DateTimeFormat('es', {month: 'long'}).format(new Date(9E8)); +new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8)); ``` (This should print your updated ICU version number, and also `January` again.)