-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow over-writing entries, in XRef.indexObjects
, only when the generation number matches (issues 11230, 11139, 9552, 9129, 7303)
#11231
Allow over-writing entries, in XRef.indexObjects
, only when the generation number matches (issues 11230, 11139, 9552, 9129, 7303)
#11231
Conversation
From: Bot.io (Linux m4)ReceivedCommand cmd_preview from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/e76013aabbe15e1/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/e76013aabbe15e1/output.txt Total script time: 1.71 mins Published |
Temporarily dumping some PDF files here, to have a (slightly) better place to point linked test-cases to. |
041da85
to
998b6b3
Compare
From: Bot.io (Linux m4)ReceivedCommand cmd_preview from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/e61ebbad97b67e1/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/e61ebbad97b67e1/output.txt Total script time: 1.69 mins Published |
998b6b3
to
d939df7
Compare
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.215.176.217:8877/66a1741ad676488/output.txt |
From: Bot.io (Linux m4)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/5012ec3eb0199db/output.txt |
From: Bot.io (Linux m4)FailedFull output at http://54.67.70.0:8877/5012ec3eb0199db/output.txt Total script time: 17.91 mins
Image differences available at: http://54.67.70.0:8877/5012ec3eb0199db/reftest-analyzer.html#web=eq.log |
From: Bot.io (Windows)FailedFull output at http://54.215.176.217:8877/66a1741ad676488/output.txt Total script time: 26.31 mins
Image differences available at: http://54.215.176.217:8877/66a1741ad676488/reftest-analyzer.html#web=eq.log |
794c7e9
to
ff4593e
Compare
XRef.indexObject
, only when the generation number matches (issues 11230, 11139, 9552, 9129, 7303)XRef.indexObjects
, only when the generation number matches (issues 11230, 11139, 9552, 9129, 7303)
ff4593e
to
d7c6a7f
Compare
…eration number matches (issues 11230, 11139, 9552, 9129, 7303) This patch is making me somewhat worried about future regressions, since it's certainly easy to imagine this completely breaking certain kinds of corrupt/edited PDF documents while fixing others.[1] Obviously it passes all existing reference tests (and even improves one), however compared to many other patches there's no telling how much it could break. The only reason that I'm even submitting this patch, is because of the number of open issues that it would address. Generally speaking though, the best course of action would probably be if `XRef.indexObjects` was re-written to be much more robust (since it currently feels somewhat hand-wavy in parts). E.g. by actually checking/validating more of the objects before committing to them. --- [1] Especially given that it's reverting part of PR 5910, however in the case of issue 5909 it seems that other (more recent) changes have actually made that PR redundant.
d7c6a7f
to
ffc847e
Compare
/botio-linux preview |
From: Bot.io (Linux m4)ReceivedCommand cmd_preview from @timvandermeij received. Current queue size: 0 Live output at: http://54.67.70.0:8877/cb83521b5c0723f/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/cb83521b5c0723f/output.txt Total script time: 1.70 mins Published |
I think this is a very nice improvement. Even though you're absolutely right about other breakage that may still occur, given that those PDF files are usually quite broken, we have a lot more test coverage now than we had before, so future changes will be less risky. Thanks! |
/botio makeref |
From: Bot.io (Linux m4)ReceivedCommand cmd_makeref from @timvandermeij received. Current queue size: 0 Live output at: http://54.67.70.0:8877/875b84455911221/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_makeref from @timvandermeij received. Current queue size: 2 Live output at: http://54.215.176.217:8877/9707c85a90f21da/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/875b84455911221/output.txt Total script time: 16.74 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/9707c85a90f21da/output.txt Total script time: 23.93 mins
|
TODO:
4-corrupted-pdf
such that there's no obvious regressions.I've done a quick pass already, but additional help would be very much appreciated here.
load
tests toeq
tests instead, to help avoid unnecessary regressions. Note that some files will render, but not as intended when compared with e.g. Adobe Reader, depending on which objects are chosen (see e.g. issues with incorrect total number of pages).eq
tests as possible, based on the list above.This patch is making me somewhat worried about future regressions, since it's certainly easy to imagine this completely breaking certain kinds of corrupt/edited PDF documents while fixing others.[1]
Obviously it passes all existing reference tests (and even improves one), however compared to many other patches there's no telling how much it could break. Edit: Obviously any breakage would be limited to corrupt PDF documents, since valid ones does not enter this method at all. Edit2: Also, this patch should mostly affect the "multiple documents in one PDF file" case, but not necessarily PDF documents with just a broken XRef table.
The only reason that I'm even submitting this patch, is because of the number of open issues that it would address.
Generally speaking though, the best course of action would probably be if
XRef.indexObjects
was re-written to be much more robust (since it currently feels somewhat hand-wavy in parts). E.g. by actually checking/validating more of the objects before committing to them.Fixes #11230
Fixes #11139
Fixes #9552
Fixes #9129
Fixes #7303
Also partially improves #6243, however parts of it regresses as well. (A couple of pages now renders "better", but one page no longer renders and an image looks worse.)
Generally though it seems that none of the PDF viewers I've tried agree completely on how the document should look, and this might be a case where the document is just too damaged to "fix".
[1] Especially given that it's reverting part of PR #5910, however in the case of issue #5909 it seems that other (more recent) changes have actually made that PR redundant.