Ensure that the entire PDF document is loaded before we begin saving it #16941

Snuffleupagus · 2023-09-12T11:27:38Z

While looking at PR #16938 it occurred to me that some of the new structTree methods synchronously access certain dictionary data (not used during "normal" structTree parsing). That isn't generally safe, since any value in a dictionary could be a reference and the data it points to may not have been loaded yet.
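
To illustrate the hazard, here is a minimal self-contained sketch. `MockXRef`, `MockDict`, `Ref`, `MissingDataError` and `trySyncRead` are hypothetical stand-ins written for this example, not the actual PDF.js classes; the object number and strings are made up as well.

```javascript
// Hypothetical stand-ins for illustration only; NOT the PDF.js implementation.
class MissingDataError extends Error {}

class Ref {
  constructor(num) {
    this.num = num;
  }
}

class MockXRef {
  constructor() {
    this.loaded = new Map(); // object number -> parsed object
  }

  // Synchronous lookup: throws when the object has not arrived yet.
  fetch(ref) {
    if (!this.loaded.has(ref.num)) {
      throw new MissingDataError(`object ${ref.num} is not loaded yet`);
    }
    return this.loaded.get(ref.num);
  }
}

class MockDict {
  constructor(xref, entries) {
    this.xref = xref;
    this.entries = entries;
  }

  // Any value may be a reference, so even a "plain" read can need more data.
  get(key) {
    const value = this.entries[key];
    return value instanceof Ref ? this.xref.fetch(value) : value;
  }
}

const mockXRef = new MockXRef();
const structElem = new MockDict(mockXRef, {
  Type: "StructElem",
  Alt: new Ref(7),
});

function trySyncRead(dict, key) {
  try {
    return { ok: true, value: dict.get(key) };
  } catch (ex) {
    if (ex instanceof MissingDataError) {
      return { ok: false, value: null };
    }
    throw ex;
  }
}

// The referenced object is missing, so the synchronous read fails.
const before = trySyncRead(structElem, "Alt");
// Once the data has been loaded, the same read succeeds.
mockXRef.loaded.set(7, "Figure description");
const after = trySyncRead(structElem, "Alt");
```

The direct entry (`Type`) is always readable, whereas the referenced one (`Alt`) only becomes readable after its target has been loaded.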

Rather than making all those new methods even more asynchronous, the simplest and safest solution overall seems to be ensuring that the entire PDF document has been loaded before we begin saving it. In practice this shouldn't noticeably affect the "performance" of saving, since saving has always depended on the entire PDF document being downloaded.
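
The idea can be sketched roughly as follows. This is not the actual patch: `MockPdfManager`, `saveDocument`, `serialize` and `demoSave` are hypothetical, and the mock's `requestLoadedStream` method only borrows its name from this PR's branch name; its behaviour here is an assumption made for the example.

```javascript
// Hypothetical stand-in for the document manager; NOT the PDF.js class.
class MockPdfManager {
  constructor(chunks) {
    this.pending = [...chunks]; // data still "on the network"
    this.received = [];
  }

  get fullyLoaded() {
    return this.pending.length === 0;
  }

  // Assumed behaviour: resolves once every outstanding chunk has arrived.
  async requestLoadedStream() {
    while (this.pending.length > 0) {
      await new Promise(resolve => setTimeout(resolve, 0));
      this.received.push(this.pending.shift());
    }
  }
}

async function saveDocument(manager, serialize) {
  // Wait for the whole document first ...
  await manager.requestLoadedStream();
  // ... so that everything below can read dictionary data synchronously.
  return serialize(manager);
}

async function demoSave() {
  const manager = new MockPdfManager(["%PDF", "-1.7", " body"]);
  return saveDocument(manager, m => ({
    fullyLoaded: m.fullyLoaded,
    bytes: m.received.join(""),
  }));
}
```

With the single `await` at the top, the serialization step never observes a partially loaded document, and none of the helpers it calls need to become asynchronous.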

Finally, note that apart from the PDF document possibly not having been fully downloaded when saving is triggered, all other "global" document properties are pretty much guaranteed to already be available at this point.

Snuffleupagus · 2023-09-12T14:39:24Z

/botio test

moz-tools-bot · 2023-09-12T14:39:26Z

From: Bot.io (Windows)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/2a9d02ab30059d0/output.txt

moz-tools-bot · 2023-09-12T14:39:26Z

From: Bot.io (Linux m4)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/63a87d858f2dbce/output.txt

moz-tools-bot · 2023-09-12T14:57:23Z

From: Bot.io (Windows)

Failed

Full output at http://54.193.163.58:8877/2a9d02ab30059d0/output.txt

Total script time: 17.93 mins

Font tests: FAILED
Unit tests: Passed
Integration Tests: FAILED
Regression tests: FAILED

Image differences available at: http://54.193.163.58:8877/2a9d02ab30059d0/reftest-analyzer.html#web=eq.log

moz-tools-bot · 2023-09-12T15:07:14Z

From: Bot.io (Linux m4)

Failed

Full output at http://54.241.84.105:8877/63a87d858f2dbce/output.txt

Total script time: 27.78 mins

Font tests: Passed
Unit tests: Passed
Integration Tests: FAILED
Regression tests: FAILED

  different ref/snapshot: 17
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/63a87d858f2dbce/reftest-analyzer.html#web=eq.log

calixteman

LGTM. Thank you.

Snuffleupagus added the core label Sep 12, 2023

Snuffleupagus requested a review from calixteman September 12, 2023 11:27

calixteman approved these changes Sep 12, 2023

Snuffleupagus merged commit b157822 into mozilla:master Sep 12, 2023
3 checks passed

Snuffleupagus deleted the SaveDocument-await-requestLoadedStream branch September 12, 2023 15:34
