Stop caching Streams in XRef.fetchCompressed
#11370
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I'm slightly surprised that this hasn't actually caused any (known) bugs, but that may be more luck than anything else since it fortunately doesn't seem common for Streams to be defined inside of an 'ObjStm'.[1]
Note that in the
XRef.fetchUncompressed
method we're not caching Streams, and that for very good reasons too:Streams, especially the
DecodeStream
ones, can become very large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.Attempting to read from the same Stream more than once won't work, unless it's
reset
in between, since using any method such as e.g.getBytes
always starts at the current data position.Given that even the
src/core/
code is now fairly asynchronous, see e.g. thePartialEvaluator
, it's generally impossible to assert that any one Stream isn't being accessed "concurrently" by e.g. differentgetOperatorList
calls. Hencereset
-ing a cached Streams isn't going to work in the general case.All in all, I cannot understand why it'd ever be correct to cache Streams in the
XRef.fetchCompressed
method.[1] One example where that happens is the
issue3115r.pdf
file in the test-suite, where the streams in question are not actually used for anything within the PDF.js code.