Update to cjs-module-lexer@1.2.1 #38450
/cc @nodejs/modules
Line 1339 in 8780537
Reading v8/v8@81d168d: do we ensure we don't have unpaired surrogates in the lexer? Luckily, even with the internal name handling V8 does, they didn't use the same space for the exports, so it should be safe.
@bmeck the module load would fail on unpaired surrogates because the CJS loader would fail execution, as it wouldn't be valid JS source text. The lexer itself does not bail on unpaired surrogates, though (for performance, it steps through surrogates without validating them), so it would still return the exports as part of its analysis, but this would be unobservable to users.
My concern is for valid CJS that exports unpaired surrogates, like the following (this does not play well with WASM's requirement of valid UTF-8):
// '\u{D83C}\u{DF10}' is 🌐, 2 surrogates
module.exports = {
  '\u{D83C}': 123,
  '\uDF10': 456,
};
Ah, thanks for clarifying. Yes, we will parse and support unpaired surrogates just like normal JS, since the lexer runs as UTF-16. Is that a problem?
I can open a PR to i.e. if we merge this PR without disallowing unpaired surrogates, I don't think this would violate the spec:
// dep.cjs
module.exports = {
  '\u{D83C}': 123,
  '\uDF10': 456,
};

import * as dep from "./dep.cjs";
dep["\uDF10"]; // 456, not undefined
The spec was written in a way to explicitly ban them, and it was generally thought that UTF-8 compatibility was desired so WASM could properly integrate with any module JS deals with. If WASM directly imported that CJS module, I'm unclear what would happen, but it certainly wouldn't be able to use those exports. I'd prefer we disable them just to ensure compatibility unless there is a reason to allow them.
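To make the UTF-8 concern concrete, here is a minimal sketch showing that a string containing an unpaired surrogate does not survive a round trip through UTF-8, while the paired form does. It uses the standard `TextEncoder`/`TextDecoder` globals; the helper name `roundTripsThroughUtf8` is made up for illustration and is not part of Node.js or cjs-module-lexer.

```javascript
// Illustration only: `roundTripsThroughUtf8` is a hypothetical helper name.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Returns true when encoding to UTF-8 and decoding back yields the same string.
function roundTripsThroughUtf8(str) {
  return decoder.decode(encoder.encode(str)) === str;
}

const results = {
  // High + low surrogate together form the single code point U+1F310 (🌐).
  paired: roundTripsThroughUtf8('\uD83C\uDF10'),
  // Each half on its own is an unpaired surrogate; TextEncoder replaces it
  // with U+FFFD, so the original string cannot be recovered.
  loneHigh: roundTripsThroughUtf8('\uD83C'),
  loneLow: roundTripsThroughUtf8('\uDF10'),
};

console.log(results);
```

An export name like `'\uDF10'` therefore has no faithful UTF-8 representation, which is the compatibility problem raised for consumers that require valid UTF-8.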
I'll prepare a PR. "disallow" = "ignore it", right? (not throwing)
@nicolo-ribaudo yes, just drop them/ignore them. As long as they don't show up in the exported names we should be safe. No reason to error/throw.
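As a rough sketch of the "drop them, don't throw" behaviour agreed on above, the filter below removes any detected export name that contains an unpaired surrogate. This is not the actual cjs-module-lexer implementation; `isWellFormedName` and `filterExportNames` are hypothetical names, and the input list is invented for the example.

```javascript
// Hypothetical helper: true when `name` contains no unpaired surrogates.
function isWellFormedName(name) {
  for (let i = 0; i < name.length; i++) {
    const unit = name.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: only valid if immediately followed by a low surrogate.
      const next = i + 1 < name.length ? name.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

// Hypothetical helper: silently drops ill-formed names instead of throwing.
function filterExportNames(names) {
  return names.filter(isWellFormedName);
}

// Invented sample of names a lexer might have detected.
const detected = ['ok', '\u{D83C}', '\uDF10', '\u{D83C}\u{DF10}', '?'];
const kept = filterExportNames(detected);

console.log(kept);
```

With this approach the module still loads and its well-formed exports stay available as named exports; only the names with unpaired surrogates are left out of the detected list.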
@bmeck I've updated to cjs-module-lexer@1.2.1 here with support for unicode escapes in strings and surrogate validation as discussed.
let's do this~, LGTM
Can we get a rocket emoji reaction to fast-track this, since it is trying to sync up w/ native ESM that is out in 16?
RSLGTM
Rubber-stamp LGTM
PR-URL: #38450 Reviewed-By: Bradley Farias <bradley.meck@gmail.com> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com> Reviewed-By: Jan Krems <jan.krems@gmail.com> Reviewed-By: Rich Trott <rtrott@gmail.com>
Landed in 50991df.
This version of cjs-module-lexer@1.2.1 includes support for non-identifier exports thanks to @nicolo-ribaudo.
This means that
import { "?" as name } from 'cjs'
can be supported for a CJS module containing
exports['?'] = 'export'
In versions of Node.js without string import support,
import * as m from 'cjs'; m['?']
can be used instead for these cases as well.
A test is included to verify the behaviour.