Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto: add Hash.prototype.copy() method #29910

Closed
wants to merge 2 commits into from

Conversation

bnoordhuis
Copy link
Member

Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Oct 9, 2019
@bnoordhuis bnoordhuis added the crypto Issues and PRs related to the crypto subsystem. label Oct 9, 2019
@nodejs-github-bot
Copy link
Collaborator

@overlookmotel
Copy link

Author of #29903 here. I wasn't able to help with implementation of this feature as I don't know C/C++. But if anything I can do here on docs or testing, or anything involving JS, please shout. Really appreciate this being added to Node.

Copy link

@overlookmotel overlookmotel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be good to add to end of hash.digest() docs:

Use hash.copy() first if you wish to continue to add data to the hash after hash.digest().

of the current `Hash` object.

The optional `options` argument controls stream behavior. For XOF hash
functions such as `'shake256'`, the `outputLength` option can be used to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://nodejs.org/api/stream.html#stream_new_stream_transform_options doesn't have an outputLength option!

It looks like the options are "whatever crypto.createHash() accepts", and that the crypto.createHash docs are wrong, https://nodejs.org/api/crypto.html#crypto_crypto_createhash_algorithm_options, and got copied here.

You could fix the createHash docs, but that's unrelated to this feature, perhaps just change the docs here to link to them and say "same options as over there", so when the createHash docs get fixed this will be fixed, too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I don't think it's too confusing but I can split it off into a separate issue if you like.

doc/api/crypto.md Outdated Show resolved Hide resolved
@tniessen
Copy link
Member

Could this suffer from problems similar to #28245 depending on OpenSSL internals, since we don't always set state[kFinalized]?

@bnoordhuis
Copy link
Member Author

Could this suffer from problems similar to #28245 depending on OpenSSL internals, since we don't always set state[kFinalized]?

Interesting question. I don't believe it actually hurts to copy a finalized hash object, it's just pointless (and that's why I added the guard.)

@nodejs-github-bot

This comment has been minimized.

Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: nodejs#29903
@bnoordhuis
Copy link
Member Author

Example added to the documentation.

Use hash.copy() first if you wish to continue to add data to the hash after hash.digest().

I didn't use that because it can be construed as meaning that copy() after digest() should work when it doesn't.


Food for thought: we could also make digest() omnipotent by cloning the hash object internally. I.e., this would work:

const hash = crypto.createHash('sha256');
const one = hash.update('one').digest();
const two = hash.update('two').digest();

The downside is that it imposes a (possibly small) performance penalty on single-use hash objects.

@overlookmotel
Copy link

@bnoordhuis Only problem with that is that some people feel that the hash object should be reusable in that its internal state is reset after a call to .digest() (#25857). Personally, that doesn't make sense to me, but it could cause confusion it seems.

It's not so hard to call .copy() explicitly I think, and some people will blanche at even a small performance cost.

@tniessen
Copy link
Member

I don't think it is a good idea to implicitely reset the internal state of the object in .digest(), it will only lead to bugs and problems.

.copy() behaves differently than what you are describing, it does not reset the state, instead, it allows to obtain a second Hash object with the same internal state.

@nodejs-github-bot
Copy link
Collaborator

@overlookmotel
Copy link

@tniessen Ah sorry I wasn't clear. I understand this is not what .copy() does. What I was trying to say is that there are seemingly two different behaviors that people might expect if you are allowed to call .digest() more than once on a hash.

With this code:

const hash = crypto.createHash('md5');
hash.update('abc');
hash.digest('hex');

hash.update('def');
const res = hash.digest('hex');

res could be expected to be either:

  1. Hash of "abcdef" (rolling hash)
  2. Hash of "def" (1st call to .digest() reset hash)

My expectation would be (1), but it seems from #25857 that some people would expect (2). Personally I think there's little merit to the arguments made by the OP in that issue. But since there are seemingly different opinions, it could lead to confusion and bugs as you say.

So I was trying to say: probably it's safer to keep behavior of .digest() as it is, to avoid any ambiguity.

Hope that make more sense.

@nodejs-github-bot
Copy link
Collaborator

nodejs-github-bot commented Oct 16, 2019

@Trott Trott added the semver-minor PRs that contain new features and should be released in the next minor version. label Oct 16, 2019
@Trott
Copy link
Member

Trott commented Oct 16, 2019

Landed in 9f203f9

@Trott Trott closed this Oct 16, 2019
Trott pushed a commit that referenced this pull request Oct 16, 2019
Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

PR-URL: #29910
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@overlookmotel
Copy link

Fantastic. Thanks very much @bnoordhuis for implementing this and everyone else for reviewing.

Can I ask: Is this likely to get into a Node 12.x release before it goes LTS?

@Trott
Copy link
Member

Trott commented Oct 16, 2019

Can I ask: Is this likely to get into a Node 12.x release before it goes LTS?

Probably not before it goes LTS but probably the next release after that one. /ping @nodejs/releasers in case I'm wrong about that.

@overlookmotel
Copy link

Ah OK great. My mistake - I was confusing the Active and Maintenance phases in terms of when new features can be added, so was wrongly thinking this might not land in an LTS release line until 14.x.

@tniessen
Copy link
Member

@overlookmotel Thanks for the clarification, I absolutely agree. Resetting the state in .digest() (or silently copying the state) will only lead to confusion.

@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Oct 17, 2019

It seems the YAML processing is broken for this doc section. IIRC, we usually use this structure with PR URL field only in history entries.

addedin

Trott added a commit to Trott/io.js that referenced this pull request Oct 18, 2019
This fixes YAML that gets incorrectly processed by our tooling.

Refs: nodejs#29910 (comment)

PR-URL: nodejs#30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
MylesBorins pushed a commit that referenced this pull request Oct 23, 2019
Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

PR-URL: #29910
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Oct 23, 2019
This fixes YAML that gets incorrectly processed by our tooling.

Refs: #29910 (comment)

PR-URL: #30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
MylesBorins pushed a commit that referenced this pull request Oct 23, 2019
Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

PR-URL: #29910
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Oct 23, 2019
This fixes YAML that gets incorrectly processed by our tooling.

Refs: #29910 (comment)

PR-URL: #30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
targos added a commit that referenced this pull request Nov 5, 2019
Notable changes:

* cli:
  * Added a new flag (`--trace-uncaught`) that makes Node.js print the
    stack trace at the time of throwing uncaught exceptions, rather than
    at the creation of the `Error` object, if there is any. This is
    disabled by default because it affects GC behavior.
    #30025
* crypto
  * Added `Hash.prototype.copy()` method. It returns a new `Hash` object
    with its internal state cloned from the original one.
    #29910
* dgram
  * Added source-specific multicast support. This adds methods to
    Datagram sockets to support RFC 4607
    (https://tools.ietf.org/html/rfc4607) for IPv4 and IPv6.
    #15735
* fs
  * Added a `bufferSize` option to `fs.opendir()`. It allows to control
    the number of entries that are buffered internally when reading from
    the directory. #30114
* meta
  * Added Chengzhong Wu (https://github.com/legendecas) to
    collaborators. #30115

PR-URL: #30262
@targos targos mentioned this pull request Nov 5, 2019
targos added a commit that referenced this pull request Nov 5, 2019
Notable changes:

* cli:
  * Added a new flag (`--trace-uncaught`) that makes Node.js print the
    stack trace at the time of throwing uncaught exceptions, rather than
    at the creation of the `Error` object, if there is any. This is
    disabled by default because it affects GC behavior.
    #30025
* crypto
  * Added `Hash.prototype.copy()` method. It returns a new `Hash` object
    with its internal state cloned from the original one.
    #29910
* dgram
  * Added source-specific multicast support. This adds methods to
    Datagram sockets to support RFC 4607
    (https://tools.ietf.org/html/rfc4607) for IPv4 and IPv6.
    #15735
* fs
  * Added a `bufferSize` option to `fs.opendir()`. It allows to control
    the number of entries that are buffered internally when reading from
    the directory. #30114
* meta
  * Added Chengzhong Wu (https://github.com/legendecas) to
    collaborators. #30115

PR-URL: #30262
@overlookmotel
Copy link

@bnoordhuis Hi. I wondered if there were any plans to backport this to Node 12?

NB This is not a demand, but a question. No is an acceptable answer!

@tniessen
Copy link
Member

tniessen commented Dec 1, 2019

@overlookmotel It will likely be included in one of the next 12.x releases :)

@overlookmotel
Copy link

@tniessen Thank you! Can't wait to get my hands on it...

targos pushed a commit that referenced this pull request Jan 8, 2020
Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

PR-URL: #29910
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this pull request Jan 8, 2020
This fixes YAML that gets incorrectly processed by our tooling.

Refs: #29910 (comment)

PR-URL: #30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
targos pushed a commit that referenced this pull request Jan 8, 2020
This fixes YAML that gets incorrectly processed by our tooling.

Refs: #29910 (comment)

PR-URL: #30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
BethGriggs pushed a commit that referenced this pull request Feb 6, 2020
Make it possible to clone the internal state of a Hash object
into a new Hash object, i.e., to fork the state of the object.

Fixes: #29903

PR-URL: #29910
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: David Carlier <devnexen@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this pull request Feb 6, 2020
This fixes YAML that gets incorrectly processed by our tooling.

Refs: #29910 (comment)

PR-URL: #30016
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
@MylesBorins MylesBorins mentioned this pull request Feb 8, 2020
targos pushed a commit that referenced this pull request Feb 11, 2020
Notable changes:

New assert APIs

The `assert` module now provides experimental `assert.match()` and
`assert.doesNotMatch()` methods. They will validate that the first argument is a
string and matches (or does not match) the provided regular expression

This is an experimental feature.

Ruben Bridgewater [#30929](#30929).

Advanced serialization for IPC

The `child_process` and `cluster` modules now support a `serialization` option
to change the serialization mechanism used for IPC. The option can have one of
two values:

* `'json'` (default): `JSON.stringify()` and `JSON.parse()` are used. This is
  how message serialization was done before.
* `'advanced'`: The serialization API of the `v8` module is used. It is based on
  the HTML structured clone algorithm.
  and is able to serialize more built-in JavaScript object types, such as
  `BigInt`, `Map`, `Set` etc. as well as circular data structures.

Anna Henningsen [#30162](#30162).

CLI flags

The new `--trace-exit` CLI flag makes Node.js print a stack trace whenever the
Node.js environment is exited proactively (i.e. by invoking the `process.exit()`
function or pressing Ctrl+C).

legendecas [#30516](#30516).

___

The new `--trace-uncaught` CLI flag makes Node.js print a stack trace at the
time of throwing uncaught exceptions, rather than at the creation of the `Error`
object, if there is any.
This option is not enabled by default because it may affect garbage collection
behavior negatively.

Anna Henningsen [#30025](#30025).

___

The `--disallow-code-generation-from-strings` V8 CLI flag is now whitelisted in
the `NODE_OPTIONS` environment variable.

Shelley Vohr [#30094](#30094).

New crypto APIs

For DSA and ECDSA, a new signature encoding is now supported in addition to the
existing one (DER). The `verify` and `sign` methods accept a `dsaEncoding`
option, which can have one of two values:

* `'der'` (default): DER-encoded ASN.1 signature structure encoding `(r, s)`.
* `'ieee-p1363'`: Signature format `r || s` as proposed in IEEE-P1363.

Tobias Nießen [#29292](#29292).

___

A new method was added to `Hash`: `Hash.prototype.copy`. It makes it possible to
clone the internal state of a `Hash` object into a new `Hash` object, allowing
to compute the digest between updates.

Ben Noordhuis [#29910](#29910).

Dependency updates

libuv was updated to 1.34.0. This includes fixes to `uv_fs_copyfile()` and
`uv_interface_addresses()` and adds two new functions: `uv_sleep()` and
`uv_fs_mkstemp()`.

Colin Ihrig [#30783](#30783).

___

V8 was updated to 7.8.279.23. This includes performance improvements to object
destructuring, RegExp match failures and WebAssembly startup time.
The official release notes are available at https://v8.dev/blog/v8-release-78.

Michaël Zasso [#30109](#30109).

New EventEmitter APIs

The new `EventEmitter.on` static method allows to async iterate over events.

Matteo Collina [#27994](#27994).

___

It is now possible to monitor `'error'` events on an `EventEmitter` without
consuming the emitted error by installing a listener using the symbol
`EventEmitter.errorMonitor`.

Gerhard Stoebich [#30932](#30932).

___

Using `async` functions with event handlers is problematic, because it
can lead to an unhandled rejection in case of a thrown exception.

The experimental `captureRejections` option in the `EventEmitter` constructor or
the global setting change this behavior, installing a
`.then(undefined, handler)` handler on the `Promise`. This handler routes the
exception asynchronously to the `Symbol.for('nodejs.rejection')` method if there
is one, or to the `'error'` event handler if there is none.

Setting `EventEmitter.captureRejections = true` will change the default for all
new instances of `EventEmitter`.

This is an experimental feature.

Matteo Collina [#27867](#27867).

Performance Hooks are no longer experimental

The `perf_hooks` module is now considered a stable API.

legendecas [#31101](#31101).

Introduction of experimental WebAssembly System Interface (WASI) support

A new core module, `wasi`, is introduced to provide an implementation of the
[WebAssembly System Interface](https://wasi.dev/) specification.
WASI gives sandboxed WebAssembly applications access to the
underlying operating system via a collection of POSIX-like functions.

This is an experimental feature.

Colin Ihrig [#30258](#30258).

PR-URL: #31691
targos pushed a commit that referenced this pull request Feb 11, 2020
Notable changes:

New assert APIs

The `assert` module now provides experimental `assert.match()` and
`assert.doesNotMatch()` methods. They will validate that the first argument is a
string and matches (or does not match) the provided regular expression

This is an experimental feature.

Ruben Bridgewater [#30929](#30929).

Advanced serialization for IPC

The `child_process` and `cluster` modules now support a `serialization` option
to change the serialization mechanism used for IPC. The option can have one of
two values:

* `'json'` (default): `JSON.stringify()` and `JSON.parse()` are used. This is
  how message serialization was done before.
* `'advanced'`: The serialization API of the `v8` module is used. It is based on
  the HTML structured clone algorithm.
  and is able to serialize more built-in JavaScript object types, such as
  `BigInt`, `Map`, `Set` etc. as well as circular data structures.

Anna Henningsen [#30162](#30162).

CLI flags

The new `--trace-exit` CLI flag makes Node.js print a stack trace whenever the
Node.js environment is exited proactively (i.e. by invoking the `process.exit()`
function or pressing Ctrl+C).

legendecas [#30516](#30516).

___

The new `--trace-uncaught` CLI flag makes Node.js print a stack trace at the
time of throwing uncaught exceptions, rather than at the creation of the `Error`
object, if there is any.
This option is not enabled by default because it may affect garbage collection
behavior negatively.

Anna Henningsen [#30025](#30025).

___

The `--disallow-code-generation-from-strings` V8 CLI flag is now whitelisted in
the `NODE_OPTIONS` environment variable.

Shelley Vohr [#30094](#30094).

New crypto APIs

For DSA and ECDSA, a new signature encoding is now supported in addition to the
existing one (DER). The `verify` and `sign` methods accept a `dsaEncoding`
option, which can have one of two values:

* `'der'` (default): DER-encoded ASN.1 signature structure encoding `(r, s)`.
* `'ieee-p1363'`: Signature format `r || s` as proposed in IEEE-P1363.

Tobias Nießen [#29292](#29292).

___

A new method was added to `Hash`: `Hash.prototype.copy`. It makes it possible to
clone the internal state of a `Hash` object into a new `Hash` object, allowing
to compute the digest between updates.

Ben Noordhuis [#29910](#29910).

Dependency updates

libuv was updated to 1.34.0. This includes fixes to `uv_fs_copyfile()` and
`uv_interface_addresses()` and adds two new functions: `uv_sleep()` and
`uv_fs_mkstemp()`.

Colin Ihrig [#30783](#30783).

___

V8 was updated to 7.8.279.23. This includes performance improvements to object
destructuring, RegExp match failures and WebAssembly startup time.
The official release notes are available at https://v8.dev/blog/v8-release-78.

Michaël Zasso [#30109](#30109).

New EventEmitter APIs

The new `EventEmitter.on` static method allows to async iterate over events.

Matteo Collina [#27994](#27994).

___

It is now possible to monitor `'error'` events on an `EventEmitter` without
consuming the emitted error by installing a listener using the symbol
`EventEmitter.errorMonitor`.

Gerhard Stoebich [#30932](#30932).

___

Using `async` functions with event handlers is problematic, because it
can lead to an unhandled rejection in case of a thrown exception.

The experimental `captureRejections` option in the `EventEmitter` constructor or
the global setting change this behavior, installing a
`.then(undefined, handler)` handler on the `Promise`. This handler routes the
exception asynchronously to the `Symbol.for('nodejs.rejection')` method if there
is one, or to the `'error'` event handler if there is none.

Setting `EventEmitter.captureRejections = true` will change the default for all
new instances of `EventEmitter`.

This is an experimental feature.

Matteo Collina [#27867](#27867).

Performance Hooks are no longer experimental

The `perf_hooks` module is now considered a stable API.

legendecas [#31101](#31101).

Introduction of experimental WebAssembly System Interface (WASI) support

A new core module, `wasi`, is introduced to provide an implementation of the
[WebAssembly System Interface](https://wasi.dev/) specification.
WASI gives sandboxed WebAssembly applications access to the
underlying operating system via a collection of POSIX-like functions.

This is an experimental feature.

Colin Ihrig [#30258](#30258).

PR-URL: #31691
@overlookmotel
Copy link

Thank you very much for backporting this to Node 12. Great stuff!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. crypto Issues and PRs related to the crypto subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calculating rolling hashes
10 participants