src: use STL containers instead of v8 values for static module data #24384

joyeecheung · 2018-11-15T22:26:54Z

Instead of putting the source code and the cache in v8::Objects,
put them in per-process std::unordered_maps. This has the following benefits:

- It's slightly lighter in weight compared to storing things on the
  v8 heap. It may also be slightly faster since the previous v8::Object is
  already in dictionary mode, though the difference is very small
  given that the number of native modules is limited.
- The source and code cache generation templates are now much simpler
  since they just initialize static arrays and manipulate STL
  constructs. They are also easier to debug from the C++ side,
  especially early in the bootstrap process when no inspector
  can be attached.
- The static native module data can be accessed independently of any
  Environment or Isolate, and it's easy to look it up from the
  C++ side.
- It's now impossible to mutate the source code used to compile
  native modules from JS land since it's completely separate
  from the v8 heap. We can still get the constant strings from
  process.binding('natives') but that's all.
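
As a rough illustration of the idea above (not the code in this PR), here is a minimal sketch of a per-process map from module id to source text that can be queried without an Isolate or Environment. The names `GetSourceMap`, `LookupSource`, `kFooSource` and `kBarSource`, as well as the module ids, are hypothetical; in the real build the static arrays come from the js2c.py output.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the generated static arrays.
static const char kFooSource[] = "module.exports = 1;";
static const char kBarSource[] = "module.exports = 2;";

// One map per process: built on first use and never mutated afterwards.
const std::unordered_map<std::string, std::string>& GetSourceMap() {
  static const std::unordered_map<std::string, std::string> sources = {
      {"internal/foo", kFooSource},
      {"internal/bar", kBarSource},
  };
  return sources;
}

// Look up a module's source from plain C++, with no v8 handles involved.
// Returns nullptr when the id is unknown.
const std::string* LookupSource(const std::string& id) {
  const auto& sources = GetSourceMap();
  auto it = sources.find(id);
  return it == sources.end() ? nullptr : &it->second;
}
```

Because the map is const and lives outside the v8 heap, nothing running in JS can reach it to change the source that gets compiled.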

A few drive-by fixes:

- Remove DecorateErrorStack in LookupAndCompile. We don't need to
  capture the exception to decorate when we encounter
  errors during native module compilation, as those errors should be
  syntax errors and v8 is able to decorate them well. We use
  CompileFunctionInContext so there is no need to worry about
  wrappers either.
- The code cache could be rejected when node is started with v8 flags.
  Instead of aborting in that case, simply keep a record in the
  native_module_without_cache set.
- Refactor js2c.py a bit, reduce code duplication and inline Render()
  to make the one-byte/two-byte special treatment easier to read.
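
The second fix (record a rejected cache instead of aborting) could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: `CacheStatus` and `CacheTracker` are hypothetical names, and only the idea of a "without cache" set comes from the description above.

```cpp
#include <set>
#include <string>

// Hypothetical outcome of trying to consume a module's code cache.
enum class CacheStatus { kAccepted, kRejected };

class CacheTracker {
 public:
  // Record the outcome for a module; a rejection is noted, not fatal.
  void Record(const std::string& id, CacheStatus status) {
    if (status == CacheStatus::kAccepted) {
      with_cache_.insert(id);
    } else {
      without_cache_.insert(id);
    }
  }

  const std::set<std::string>& with_cache() const { return with_cache_; }
  const std::set<std::string>& without_cache() const { return without_cache_; }

 private:
  std::set<std::string> with_cache_;
  std::set<std::string> without_cache_;
};
```

A caller would compile the module from source after a rejection and continue, leaving the set as a record that tests or tooling can inspect later.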

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines

nodejs-github-bot · 2018-11-15T22:26:56Z

@joyeecheung build started: https://ci.nodejs.org/blue/organizations/jenkins/node-test-pull-request-lite-pipeline/detail/node-test-pull-request-lite-pipeline/1626/pipeline

joyeecheung · 2018-11-15T22:28:16Z

CI: https://ci.nodejs.org/job/node-test-pull-request/18661/
Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/275/

cc @addaleax @devsnek @refack

joyeecheung · 2018-11-15T22:30:14Z

Also @hashseed: as I understand it, there is no way to use the cache if node is started with v8 flags, since that leads to a mismatch of the flag hash in the serialized code cache data. Is that correct?

addaleax · 2018-11-15T22:32:40Z

src/node_native_module.cc

Hm, maybe this one could better be implemented as a ToV8Value() overload from util.h? It fits in with the others pretty well, I think (the downside being that we always use UTF8 there unconditionally…)

@addaleax The current implementation is only capable of converting std::set<const std::string> to an array...not exactly what we want here, but I can leave a TODO

addaleax · 2018-11-15T22:40:32Z

src/node_native_module.h

btw, did you consider using std::unordered_map and std::unordered_set over std::map/std::set? I don’t know if we need to worry a lot about the performance tradeoffs here, but I personally tend to use the unordered versions when I can

I don't think there is a reason not to use the unordered version...good idea, I'll change that

src/node_native_module.h

tools/generate_code_cache.js

refack

Wow! 😮 Looks great.

refack · 2018-11-16T14:11:18Z

@joyeecheung could you gist the new node_javascript.cc and node_code_cache.cc? I'm curious how they turned out.

joyeecheung · 2018-11-16T14:28:15Z

@refack If I remove the static char data, they look like this: https://gist.github.com/joyeecheung/9b41e0b9309d9492270ba23a92e2733d

joyeecheung · 2018-11-16T14:30:56Z

Thanks for the reviews, updated.

CI: https://ci.nodejs.org/job/node-test-pull-request/18673/
Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/276/

refack · 2018-11-16T14:53:36Z

@refack If I remove the static char data, they look like this: gist.github.com/joyeecheung/9b41e0b9309d9492270ba23a92e2733d

IMO they look much better than before 😄

joyeecheung · 2018-11-16T15:23:08Z

Rebased: https://ci.nodejs.org/job/node-test-pull-request/18675/

joyeecheung · 2018-11-16T16:43:10Z

So, interestingly enough, this uncovers a circular dependency in internal/streams/lazy_transform.js on the Pis. The order of keys in internalBinding('native_module').source now depends on the implementation of unordered_map (previously it depended on the order in which js2c inserted the strings into the v8::Object), so on the Pis lazy_transform.js can somehow be required before crypto.js is required, which triggers the dependency issue.
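
To make the ordering point concrete, here is a small sketch (the helper names are hypothetical, and the module ids are only examples). A std::map always iterates in sorted key order, while the iteration order of a std::unordered_map is unspecified and may differ between standard library implementations, so any code that walks the keys can see modules in a different order on different platforms.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of any map-like container in its iteration order.
template <typename Map>
std::vector<std::string> KeysInIterationOrder(const Map& m) {
  std::vector<std::string> keys;
  for (const auto& entry : m) keys.push_back(entry.first);
  return keys;
}

// Same three ids, deliberately inserted out of alphabetical order.
std::map<std::string, int> MakeOrdered() {
  return {{"stream", 0}, {"internal/streams/lazy_transform", 1}, {"crypto", 2}};
}

std::unordered_map<std::string, int> MakeUnordered() {
  return {{"stream", 0}, {"internal/streams/lazy_transform", 1}, {"crypto", 2}};
}
```

`KeysInIterationOrder(MakeOrdered())` is always sorted by key; `KeysInIterationOrder(MakeUnordered())` holds the same three ids in whatever order the hash table happens to produce.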

I opened #24396 to fix this.

TimothyGu · 2018-11-16T17:39:54Z

src/node_native_module.cc

Is the return value cached / should it be cached? Right now it seems like every time internalBinding('native_module').source is accessed a new object is created.

It's not cached, but that does not really matter since this is only used by tools/generate_code_cache.js. The native module loader in the binary only gets to call into a C++ function that can access the std::unordered_map directly.

joyeecheung · 2018-11-16T21:27:15Z

So...interesting: std::unordered_map is slower than std::map in this case

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/275

                                                                                   confidence improvement accuracy (*)   (**)  (***)
 misc/startup.js mode='process' script='benchmark/fixtures/require-cachable' dur=1                -0.95 %       ±1.05% ±1.40% ±1.82%
 misc/startup.js mode='process' script='test/fixtures/semicolon' dur=1                      *      1.46 %       ±1.23% ±1.64% ±2.14%
 misc/startup.js mode='worker' script='benchmark/fixtures/require-cachable' dur=1                 -0.35 %       ±0.85% ±1.13% ±1.47%
 misc/startup.js mode='worker' script='test/fixtures/semicolon' dur=1                             -0.09 %       ±1.06% ±1.41% ±1.83%

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/276

                                                                                   confidence improvement accuracy (*)   (**)  (***)
 misc/startup.js mode='process' script='benchmark/fixtures/require-cachable' dur=1         **     -1.58 %       ±1.18% ±1.57% ±2.04%
 misc/startup.js mode='process' script='test/fixtures/semicolon' dur=1                             0.13 %       ±1.37% ±1.82% ±2.37%
 misc/startup.js mode='worker' script='benchmark/fixtures/require-cachable' dur=1         ***     -2.43 %       ±0.86% ±1.14% ±1.49%
 misc/startup.js mode='worker' script='test/fixtures/semicolon' dur=1                              0.12 %       ±0.87% ±1.16% ±1.51%

I googled around and it turns out that's not too surprising: we only have < 200 items, and we only insert items and don't touch them once the map is initialized, so hashing the strings could be slower than searching the tree. See e.g. https://stackoverflow.com/questions/4846798/why-would-map-be-much-faster-than-unordered-map (this almost tempted me to try implementing a trie... nah)
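
For anyone who wants to poke at this locally, a rough micro-benchmark sketch follows. It is not the benchmark CI used above: the key names, the key count, and the helpers `MakeKeys`, `BuildMap`, `CountHits` and `TimeLookupsNs` are all made up for illustration, and the timings will vary with compiler, standard library and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Generate n module-like string keys (hypothetical naming scheme).
std::vector<std::string> MakeKeys(std::size_t n) {
  std::vector<std::string> keys;
  keys.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    keys.push_back("internal/module_" + std::to_string(i));
  }
  return keys;
}

// Fill a map once; it is only read afterwards, as in the PR's use case.
template <typename Map>
Map BuildMap(const std::vector<std::string>& keys) {
  Map m;
  for (const auto& k : keys) m.emplace(k, k.size());
  return m;
}

// Count how many of the keys are found in the map.
template <typename Map>
std::size_t CountHits(const Map& m, const std::vector<std::string>& keys) {
  std::size_t hits = 0;
  for (const auto& k : keys) hits += m.count(k);
  return hits;
}

// Time `rounds` passes of lookups; returns elapsed nanoseconds and
// stores the total number of hits so the work cannot be optimized away.
template <typename Map>
long long TimeLookupsNs(const Map& m, const std::vector<std::string>& keys,
                        int rounds, std::size_t* hits_out) {
  auto start = std::chrono::steady_clock::now();
  std::size_t hits = 0;
  for (int r = 0; r < rounds; ++r) hits += CountHits(m, keys);
  auto end = std::chrono::steady_clock::now();
  *hits_out = hits;
  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start)
      .count();
}
```

Calling `TimeLookupsNs` on a `std::map<std::string, std::size_t>` and a `std::unordered_map<std::string, std::size_t>` built from the same `MakeKeys(200)` gives two numbers to compare. A loop like this measures only lookups, not process startup, so it may not reproduce the startup results above.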

Trott · 2018-11-18T16:41:19Z

Removing blocked label because 413fcad landed.

refack · 2018-11-18T17:14:11Z

CI (rebased on master): https://ci.nodejs.org/job/node-test-pull-request/18724/

TimothyGu

LGTM other than some fairly superficial comments.

TimothyGu · 2018-11-18T20:29:51Z

tools/generate_code_cache.js

Use isUint8Array from util.types?

TimothyGu · 2018-11-18T20:30:11Z

tools/generate_code_cache.js

This change should be unneeded right?

@TimothyGu Yeah, just so that it's clearer since this length is only used to calculate a string displayed in KB/MB etc. human readable format.

I'll switch this to const size = cachedData.byteLength - I think that makes this a bit more self-explanatory and less ambiguous

TimothyGu · 2018-11-18T20:30:39Z

src/node.cc

nit: bootstrap

TimothyGu · 2018-11-18T20:34:16Z

src/node_native_module.cc

I’d make this a method, as it signals that getting the value is a relatively expensive operation (and that the returned object is different every time this is called).

Ditto for GetCacheUsage().


joyeecheung · 2018-11-19T14:26:58Z

Fixed a few more nits & rebased & squashed the fixup into one. CI: https://ci.nodejs.org/job/node-test-pull-request/18758/

joyeecheung · 2018-11-19T17:24:23Z

Landed in 7778c03, thanks!

Instead of putting the source code and the cache in v8::Objects,
put them in per-process std::maps. This has the following benefits:

- It's slightly lighter in weight compared to storing things on the
  v8 heap. Also it may be slightly faster since the previous v8::Object is
  already in dictionary mode - though the difference is very small
  given the number of native modules is limited.
- The source and code cache generation templates are now much simpler
  since they just initialize static arrays and manipulate STL
  constructs.
- The static native module data can be accessed independently of any
  Environment or Isolate, and it's easy to look them up from the
  C++'s side.
- It's now impossible to mutate the source code used to compile
  native modules from the JS land since it's completely separate
  from the v8 heap. We can still get the constant strings from
  process.binding('natives') but that's all.

A few drive-by fixes:

- Remove DecorateErrorStack in LookupAndCompile - We don't need to
  capture the exception to decorate when we encounter
  errors during native module compilation, as those errors should be
  syntax errors and v8 is able to decorate them well. We use
  CompileFunctionInContext so there is no need to worry about
  wrappers either.
- The code cache could be rejected when node is started with v8 flags.
  Instead of aborting in that case, simply keep a record in the
  native_module_without_cache set.
- Refactor js2c.py a bit, reduce code duplication and inline Render()
  to make the one-byte/two-byte special treatment easier to read.

PR-URL: #24384
Fixes: https://github.com/Remove
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

joyeecheung · 2018-11-19T19:05:01Z

This should be backported together with #24382 and #24382 (for 10.x at least, I am not sure if we need to backport this to 8.x? Mostly because I don't know if at this point 8.x patches are backported on an as-needed basis or we are still trying to backport everything possible onto 8.x)


nodejs-github-bot added the c++ and lib / src labels Nov 15, 2018

addaleax approved these changes Nov 15, 2018

refack reviewed Nov 16, 2018

src/node_native_module.h (outdated)

refack reviewed Nov 16, 2018

tools/generate_code_cache.js (outdated)

refack reviewed Nov 16, 2018

fhinkel approved these changes Nov 16, 2018

joyeecheung force-pushed the map branch from d26e166 to 9b60fd6 Compare November 16, 2018 14:05

joyeecheung force-pushed the map branch from 9b60fd6 to 5a81e0d Compare November 16, 2018 15:22

joyeecheung added the blocked label Nov 16, 2018

joyeecheung mentioned this pull request Nov 16, 2018

stream: do not use crypto.DEFAULT_ENCODING in lazy_transform.js #24396

Closed

3 tasks

TimothyGu reviewed Nov 16, 2018

joyeecheung force-pushed the map branch from 470d91d to e1c5eb8 Compare November 17, 2018 02:13

Trott removed the blocked label Nov 18, 2018

TimothyGu approved these changes Nov 18, 2018

jasnell approved these changes Nov 19, 2018

joyeecheung added 2 commits November 19, 2018 19:46

fixup! src: use STL containers instead of v8 values for static module…

161c9f8

… data

joyeecheung closed this Nov 19, 2018

joyeecheung added the backport-requested-v10.x label Nov 19, 2018

BridgeAR mentioned this pull request Dec 5, 2018

v11.4.0 proposal #24854

Merged

4 tasks

deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018

fix: Use per process native module loader for compiled JS source

20b2270

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018

fix: Use per process native module loader for compiled JS source

268e8f6

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018

fix: Use per process native module loader for compiled JS source

4fa0196

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 20, 2018

fix: Use per process native module loader for compiled JS source

ae725c6

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 24, 2018

fix: Use per process native module loader for compiled JS source

e2bb541

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 24, 2018

fix: Use per process native module loader for compiled JS source

d4c8b97

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 25, 2018

fix: Use per process native module loader for compiled JS source

34c3f01

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Dec 26, 2018

fix: Use per process native module loader for compiled JS source

6bd8872

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Jan 3, 2019

fix: Use per process native module loader for compiled JS source

05c1ca2

nodejs/node#24384

nornagon pushed a commit to electron/electron that referenced this pull request Jan 3, 2019

fix: Use per process native module loader for compiled JS source

72bccaa

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Jan 4, 2019

fix: Use per process native module loader for compiled JS source

d05ad20

nodejs/node#24384

deepak1556 added a commit to deepak1556/atom-shell that referenced this pull request Jan 8, 2019

fix: Use per process native module loader for compiled JS source

65a1776

nodejs/node#24384

deepak1556 added a commit to electron/electron that referenced this pull request Jan 10, 2019

fix: Use per process native module loader for compiled JS source

b652958

nodejs/node#24384

nornagon pushed a commit to electron/electron that referenced this pull request Jan 11, 2019

fix: Use per process native module loader for compiled JS source

3a3075b

nodejs/node#24384

nornagon pushed a commit to electron/electron that referenced this pull request Jan 11, 2019

fix: Use per process native module loader for compiled JS source

e3be342

nodejs/node#24384
