Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Catalog.getPageDict to an async method #11312

Merged
merged 2 commits into from
Nov 8, 2019

Conversation

Snuffleupagus
Copy link
Collaborator

This makes it possible to remove the internal next helper function, and also gets rid of the need to manually resolve/reject a PromiseCapability.

This makes it possible to remove the internal `next` helper function, and also gets rid of the need to manually resolve/reject a `PromiseCapability`.
As we've seen in numerous other cases, avoiding unnecessary function calls is never a bad thing (even if the effect is probably tiny here).

With these changes we also avoid potentially two back-to-back `isDict` checks when evaluating possible Page nodes, and can also no longer accidentally pick a dictionary with an incorrect /Type.
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Nov 8, 2019

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/84f317acfc45ce7/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Nov 8, 2019

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b6d98b39c223b9f/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Nov 8, 2019

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/84f317acfc45ce7/output.txt

Total script time: 18.69 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

pdfjsbot commented Nov 8, 2019

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/b6d98b39c223b9f/output.txt

Total script time: 26.77 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij timvandermeij merged commit b1440a1 into mozilla:master Nov 8, 2019
@timvandermeij
Copy link
Contributor

Thank you for refactoring this! I found the next mechanism a bit hard to read.

@Snuffleupagus Snuffleupagus deleted the async-getPageDict branch November 8, 2019 21:58
@Snuffleupagus
Copy link
Collaborator Author

I found the next mechanism a bit hard to read.

Yeah, the old setup was essentially a recursive function which was called asynchronously. This was also a case where the use of a PromiseCapability probably didn't help readability either.

I suppose the one criticism you could have of this patch, is that it's adding asynchronous behaviour inside of a loop. However I think all the skipped function calls and the general clean-up outweighs that.

@Snuffleupagus
Copy link
Collaborator Author

Sigh, it seems that I forgot to test this with really large files :-(

With a large file taken from an old GitHub issue (warning, the file is 16 MB): https://github.com/mozilla/pdf.js/files/876321/kjv.pdf there's now a considerable "Page Request" regression for the second page upon document load; from 2000 -> 3000 ms which obviously doesn't look good.

In light of this I don't believe that it's a good idea keep these changes unfortunately, since the new way in which this code is asynchronous obviously performs worse in some cases; sorry about the mess here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants