[api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245) #18051
*Please note:* This removes top level await from the GENERIC builds of the PDF.js library.

Despite top level await being supported in all modern browsers/environments, note [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await#browser_compatibility), it seems that many frameworks and build-tools unfortunately have trouble with it. Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this.

Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment. The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the `getDocument`-function. Unfortunately that doesn't work though, since that's a *synchronous* function that we cannot change without breaking "the world".

Hence we instead await the load of Node.js packages/polyfills together with the `PDFWorker` initialization, since that's the *first point* of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen *until* the worker has been initialized.

Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point.

This new pattern for accessing Node.js packages/polyfills will also require some care during development *and* importantly reviewing, to ensure that no new top level await is added in the main code-base.
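To make the pattern described above concrete, here is a minimal sketch of deferring the Node.js package load to the first asynchronous step instead of using top level await. It is an illustration only and not the actual PDF.js source: `NodePackages`, `SketchWorker`, and the object returned by this `getDocument` are hypothetical names and shapes chosen for the example, and `node:fs` merely stands in for whichever packages/polyfills are needed.

```javascript
// Hypothetical sketch; names and shapes are assumptions, not PDF.js code.
const isNodeJS =
  typeof process === "object" && !!process.versions?.node;

class NodePackages {
  static #promise = null;
  static #packages = null;

  // Starts (or reuses) the asynchronous load; resolves at once outside Node.js.
  static get promise() {
    if (!NodePackages.#promise) {
      NodePackages.#promise = isNodeJS
        ? import("node:fs").then(fs => {
            NodePackages.#packages = new Map([["fs", fs]]);
          })
        : Promise.resolve();
    }
    return NodePackages.#promise;
  }

  // Synchronous accessor; only valid once `promise` has resolved.
  static get(name) {
    const pkg = NodePackages.#packages?.get(name);
    if (!pkg) {
      throw new Error(`Node.js package "${name}" is not loaded yet.`);
    }
    return pkg;
  }
}

class SketchWorker {
  constructor() {
    // First point of asynchronicity: the package load is awaited
    // together with the (placeholder) worker initialization.
    this.promise = Promise.all([
      NodePackages.promise,
      this.#initialize(),
    ]).then(() => this);
  }

  async #initialize() {
    // Placeholder for real worker setup.
  }
}

// Remains synchronous: returns a task object immediately.
function getDocument(src) {
  const worker = new SketchWorker();
  const promise = worker.promise.then(() => {
    // Safe here: the worker promise implies the packages are loaded.
    const exists = isNodeJS ? NodePackages.get("fs").existsSync(src) : null;
    return { src, exists };
  });
  return { promise };
}
```

In this sketch a caller would write `getDocument("file.pdf").promise.then(...)`. Calling `NodePackages.get("fs")` before the worker promise has resolved throws, which is the kind of mistake the description says reviewers need to watch for.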
/botio-linux preview
From: Bot.io (Linux m4). Received. Command cmd_preview from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.241.84.105:8877/79c920cab5e0c78/output.txt
From: Bot.io (Linux m4). Success. Full output at: http://54.241.84.105:8877/79c920cab5e0c78/output.txt Total script time: 1.18 mins. Published.
/botio test
From: Bot.io (Windows). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.193.163.58:8877/eb44d0fc565a207/output.txt
From: Bot.io (Linux m4). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.241.84.105:8877/611890a4803f63b/output.txt
From: Bot.io (Windows). Failed. Full output at: http://54.193.163.58:8877/eb44d0fc565a207/output.txt Total script time: 2.78 mins. Image differences available at: http://54.193.163.58:8877/eb44d0fc565a207/reftest-analyzer.html#web=eq.log
/botio-windows test
From: Bot.io (Windows). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.193.163.58:8877/7681059f7d72157/output.txt
From: Bot.io (Linux m4). Failed. Full output at: http://54.241.84.105:8877/611890a4803f63b/output.txt Total script time: 27.53 mins. Image differences available at: http://54.241.84.105:8877/611890a4803f63b/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows). Failed. Full output at: http://54.193.163.58:8877/7681059f7d72157/output.txt Total script time: 60.00 mins.
/botio-windows test
From: Bot.io (Windows). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 0. Live output at: http://54.193.163.58:8877/c70924c55c6c59f/output.txt
From: Bot.io (Windows). Failed. Full output at: http://54.193.163.58:8877/c70924c55c6c59f/output.txt Total script time: 60.08 mins.
Node.js is working on an API to synchronously load its built-in modules, so in the future conditionally loading `fs` & friends won't need to be async anymore :)
That looks promising, thanks a lot for the info! However, even if that was available today it'd not avoid us having to add the It would obviously help lessen the amount of Node.js functionality that we needed to load asynchronously, but this new feature hasn't landed yet and from the outside I cannot tell if/when that will happen. [1] Looking at https://github.com/nodejs/release#release-schedule Node.js 18 and 20 are officially supported for (almost) one respectively two more years. |
Well, you can do
The TypeScript team is pushing for that to land so that they can migrate to native ESM without async conditional imports for Node.js, so hopefully it'll get backported. I'll ping you if it happens :) |
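For readers following this exchange, a hedged sketch of what synchronous, conditional access to a built-in module can look like. It assumes the API under discussion is the one available in newer Node.js releases as `process.getBuiltinModule`; that mapping is an assumption and is not confirmed anywhere in this thread. `getBuiltin` is a hypothetical helper name, and older runtimes without the API still need an asynchronous fallback.

```javascript
// Sketch under the assumption stated above; `getBuiltin` is a hypothetical helper.
function getBuiltin(id) {
  const isNodeJS =
    typeof process === "object" && !!process.versions?.node;
  if (!isNodeJS) {
    return null; // Browsers: nothing to load.
  }
  // Feature-detect, since older Node.js releases lack this method.
  if (typeof process.getBuiltinModule === "function") {
    // Returns undefined for ids that are not built-in modules.
    return process.getBuiltinModule(id) ?? null;
  }
  return null; // Caller falls back to an asynchronous `import()`.
}

const maybeFs = getBuiltin("node:fs");
```

A `null` result means the caller still has to take the asynchronous path, so this kind of helper reduces, but does not remove, the need for the deferred loading introduced in this PR while older Node.js versions remain supported.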
I have two small questions, but other than that this looks good to me.
This global was only introduced to work-around problems caused by the GENERIC PDF.js build using top level await. Since that was removed in the previous commit, this global is now dead code.
a23f159
to
9418ed1
Compare
r=me, with passing Windows tests once the bot is working again. Thank you!
Great to see that this issue is being worked on! As I'm still struggling to understand how the different pieces of the project fit together, I quickly wanted to ask if this fix will also get rid of the following top-level await inside
/botio test
From: Bot.io (Linux m4). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 1. Live output at: http://54.241.84.105:8877/641ae644a87aa33/output.txt
From: Bot.io (Windows). Received. Command cmd_test from @Snuffleupagus received. Current queue size: 1. Live output at: http://54.193.163.58:8877/a8c63b7148c0cb3/output.txt
From: Bot.io (Linux m4). Failed. Full output at: http://54.241.84.105:8877/641ae644a87aa33/output.txt Total script time: 27.70 mins. Image differences available at: http://54.241.84.105:8877/641ae644a87aa33/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows). Failed. Full output at: http://54.193.163.58:8877/a8c63b7148c0cb3/output.txt Total script time: 42.21 mins. Image differences available at: http://54.193.163.58:8877/a8c63b7148c0cb3/reftest-analyzer.html#web=eq.log
All test "failures" look like (known) intermittent ones, hence landing this given the previous review.
Yes, I'm planning to make a new release at the end of the month indeed. |
@timvandermeij, was this released in the end? Thanks!
Yes, in v4.3.136. |