Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245) #18051

Merged
merged 2 commits into from
May 14, 2024

Conversation

Snuffleupagus
Copy link
Collaborator

Please note: This removes top level await from the GENERIC builds of the PDF.js library.

Despite top level await being supported in all modern browsers/environments, note the MDN compatibility data, it seems that many frameworks and build-tools unfortunately have trouble with it.
Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this.

Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment.
The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the getDocument-function. Unfortunately that doesn't work though, since that's a synchronous function that we cannot change without breaking "the world".

Hence we instead await the load of Node.js packages/polyfills together with the PDFWorker initialization, since that's the first point of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen until the worker has been initialized.
Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point.
This new pattern for accessing Node.js packages/polyfills will also require some care during development and importantly reviewing, to ensure that no new top level await is added in the main code-base.

…e 17245)

*Please note:* This removes top level await from the GENERIC builds of the PDF.js library.

Despite top level await being supported in all modern browsers/environments, note [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await#browser_compatibility), it seems that many frameworks and build-tools unfortunately have trouble with it.
Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this.

Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment.
The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the `getDocument`-function. Unfortunately that doesn't work though, since that's a *synchronous* function that we cannot change without breaking "the world".

Hence we instead await the load of Node.js packages/polyfills together with the `PDFWorker` initialization, since that's the *first point* of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen *until* the worker has been initialized.
Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point.
This new pattern for accessing Node.js packages/polyfills will also require some care during development *and* importantly reviewing, to ensure that no new top level await is added in the main code-base.
@Snuffleupagus
Copy link
Collaborator Author

/botio-linux preview

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/79c920cab5e0c78/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/79c920cab5e0c78/output.txt

Total script time: 1.18 mins

Published

@Snuffleupagus
Copy link
Collaborator Author

/botio test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/eb44d0fc565a207/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/611890a4803f63b/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/eb44d0fc565a207/output.txt

Total script time: 2.78 mins

  • Unit tests: FAILED
  • Integration Tests: FAILED
  • Regression tests: FAILED

Image differences available at: http://54.193.163.58:8877/eb44d0fc565a207/reftest-analyzer.html#web=eq.log

@Snuffleupagus
Copy link
Collaborator Author

/botio-windows test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/7681059f7d72157/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/611890a4803f63b/output.txt

Total script time: 27.53 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 18
  different first/second rendering: 3

Image differences available at: http://54.241.84.105:8877/611890a4803f63b/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/7681059f7d72157/output.txt

Total script time: 60.00 mins

@Snuffleupagus
Copy link
Collaborator Author

/botio-windows test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/c70924c55c6c59f/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/c70924c55c6c59f/output.txt

Total script time: 60.08 mins

@nicolo-ribaudo
Copy link
Contributor

nicolo-ribaudo commented May 7, 2024

Node.js is working on an API to synchronously load its built-in modules, so in the future conditioanally loading fs&friends won't need to be async anymore :)
nodejs/node#52762

@Snuffleupagus
Copy link
Collaborator Author

Node.js is working on an API to synchronously load its built-in modules, so in the future conditioanally loading fs&friends won't need to be async anymore :)

That looks promising, thanks a lot for the info!

However, even if that was available today it'd not avoid us having to add the NodePackages functionality in PDF.js since we also need to load a couple of polyfills (i.e. npm packages) and the new Node.js feature would not help us there as far as I understand.

It would obviously help lessen the amount of Node.js functionality that we needed to load asynchronously, but this new feature hasn't landed yet and from the outside I cannot tell if/when that will happen.
There's also the important question of when this'll become generally available, since unless it's uplifted to the current LTS branches (18 and 20) it'd take many years[1] before we could rely on it unfortunately?


[1] Looking at https://github.com/nodejs/release#release-schedule Node.js 18 and 20 are officially supported for (almost) one respectively two more years.

@nicolo-ribaudo
Copy link
Contributor

However, even if that was available today it'd not avoid us having to add the NodePackages functionality in PDF.js since we also need to load a couple of polyfills (i.e. npm packages) and the new Node.js feature would not help us there as far as I understand.

Well, you can do const require = process.getBuiltinModule("module").createRequire(import.meta.url) and then require whatever you want.

It would obviously help lessen the amount of Node.js functionality that we needed to load asynchronously, but this new feature hasn't landed yet and from the outside I cannot tell if/when that will happen.
There's also the important question of when this'll become generally available, since unless it's uplifted to the current LTS branches (18 and 20) it'd take many years[1] before we could rely on it unfortunately?

The TypeScript team is pushing for that to land so that they can migrate to native ESM without async conditional imports for Node.js, so hopefully it'll get backported. I'll ping you if it happens :)

Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two small questions, but other than that this looks good to me.

src/display/node_utils.js Show resolved Hide resolved
web/pdfjs.js Show resolved Hide resolved
This global was only introduced to work-around problems caused by the GENERIC PDF.js build using top level await. Since that was removed in the previous commit, this global is now dead code.
Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me, with passing Windows tests once the bot is working again. Thank you!

@schuma7
Copy link

schuma7 commented May 11, 2024

Great to see that this issue is being worked on!

As I'm still struggling to understand how the different pieces of the project fit together, I quickly wanted to ask if this fix will also get rid of the following top-level await inside / pdfjs-dist / legacy / web / pdf_viewer.mjs:

/******/ __webpack_exports__ = globalThis.pdfjsViewer = await (globalThis.pdfjsViewerPromise = __webpack_exports__);

@Snuffleupagus
Copy link
Collaborator Author

/botio test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://54.241.84.105:8877/641ae644a87aa33/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://54.193.163.58:8877/a8c63b7148c0cb3/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/641ae644a87aa33/output.txt

Total script time: 27.70 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 19
  different first/second rendering: 2

Image differences available at: http://54.241.84.105:8877/641ae644a87aa33/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/a8c63b7148c0cb3/output.txt

Total script time: 42.21 mins

  • Unit tests: FAILED
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 4

Image differences available at: http://54.193.163.58:8877/a8c63b7148c0cb3/reftest-analyzer.html#web=eq.log

@Snuffleupagus
Copy link
Collaborator Author

All test "failures" look like (known) intermittent ones, hence landing this given the previous review.

@Snuffleupagus Snuffleupagus merged commit 298d721 into mozilla:master May 14, 2024
9 checks passed
@Snuffleupagus Snuffleupagus deleted the NodePackages branch May 14, 2024 09:44
@Snuffleupagus

This comment was marked as resolved.

@timvandermeij
Copy link
Contributor

Yes, I'm planning to make a new release at the end of the month indeed.

@jamesryan-dev
Copy link

jamesryan-dev commented Jun 25, 2024

@timvandermeij , was this released in the end?

Thanks,
J

@nicolo-ribaudo
Copy link
Contributor

Yes, in v4.3.136.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core node-specific release-blocker Blocker for the upcoming release
Projects
None yet
7 participants