
Attempt to reduce resource usage, by not eagerly fetching all pages, for long/large documents #11263

Merged

Conversation

Snuffleupagus
Collaborator

For *very* long/large documents fetching all pages on load may cause quite bad performance, both memory and CPU wise. In order to at least slightly alleviate this, we can let the viewer treat these kinds of documents[1] as if `disableAutoFetch` were set.

---
[1] One example of a really bad case is https://bugzilla.mozilla.org/show_bug.cgi?id=1588435, which this patch should at least help somewhat. In general, for these cases, we'd probably need to implement switching between `PDFViewer`/`PDFSinglePageViewer` (as already tracked on GitHub) and use the latter for these kinds of long documents.
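The heuristic described in the patch can be sketched roughly as follows. Note that this is a minimal illustration, not the actual PDF.js implementation: the function name and the page-count threshold below are hypothetical.

```javascript
// Hypothetical sketch: decide whether the viewer should treat a document
// as if `disableAutoFetch` were set, based on its page count.
// The threshold is illustrative only, not the value used in PDF.js.
const LONG_DOCUMENT_PAGE_THRESHOLD = 1000;

function shouldDisableAutoFetch(numPages, disableAutoFetchOption) {
  // Respect an explicitly set `disableAutoFetch` option...
  if (disableAutoFetchOption === true) {
    return true;
  }
  // ...and otherwise fall back to the page-count heuristic.
  return numPages > LONG_DOCUMENT_PAGE_THRESHOLD;
}

console.log(shouldDisableAutoFetch(50, false));    // → false (short document)
console.log(shouldDisableAutoFetch(15000, false)); // → true (very long document)
```

With such a heuristic, pages of very long documents are only fetched on demand as the user scrolls to them, instead of all being requested eagerly on load.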

@timvandermeij
Contributor

/botio-linux preview

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://54.67.70.0:8877/d45cea9fb8c89e6/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/d45cea9fb8c89e6/output.txt

Total script time: 1.73 mins

Published

@timvandermeij timvandermeij merged commit d7f651a into mozilla:master Oct 20, 2019
@timvandermeij
Contributor

timvandermeij commented Oct 20, 2019

Thank you! This PDF file is horribly large, so I'm not surprised we're having trouble rendering this properly. I do notice that this patch improves the situation a bit; it's still slow, but skipping to some pages at least makes them render better now.

I noticed that PDF.js doesn't render more than 12656 pages. The last canvas is cut off at the top, so that's also "interesting" about this PDF file... In short, it's good that the issue on Bugzilla is still open.

@Snuffleupagus
Collaborator Author

Thanks for landing this!

> [...] it's still slow, but skipping to some pages at least makes them render better now.

Part of the slowness seems to be related to the file size itself (at 185 MB), and particularly the server's connection, since having the file available locally also seems to help a bit.

> I noticed that PDF.js doesn't render more than 12656 pages.

If I were to guess, that's probably the browser choking on the sheer number of DOM elements, rather than anything else. (With `PDFSinglePageViewer` there's no problem accessing later pages.)
