Browser caching does not work for the first range request on Chrome and Safari #11624
Well, if you set
It seems that, again, your problem is related to the differences in how various browsers have chosen to implement caching of network requests. All in all, I'm afraid that there does not seem to be much (if anything) actionable here from a PDF.js perspective. |
Yes, this is the expected behavior of disableStream in this scenario. I am wondering whether it is possible to replace the first request with another type of request.
Yes |
First of all, you need to dispatch a regular (200) request to check if range requests are actually supported and correctly implemented by the server, see pdf.js/src/display/fetch_stream.js Lines 137 to 147 in dd893d5
Lines 320 to 332 in dd893d5
pdf.js/src/display/network_utils.js Lines 23 to 62 in dd893d5
Second of all, in the default case where streaming is used the initial (200) request is simply allowed to continue. Not only is that the most efficient solution, but it also avoids errors with servers that don't allow a PDF document to be fetched more than once. |
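The validation described in this comment can be pictured roughly as follows. This is a simplified sketch written for illustration, not the PDF.js source: the function name `checkRangeSupport`, its parameter object, and the `getHeader` callback are all hypothetical, and the exact set of checks is an assumption based on the description above (the server must advertise byte ranges, send a usable length, and not apply a content encoding).

```javascript
// Simplified, hypothetical sketch of validating range-request support from
// the headers of the initial (200) response. Not the actual PDF.js code.
function checkRangeSupport({ getHeader, isHttp, rangeChunkSize, disableRange }) {
  const result = { allowRangeRequests: false, suggestedLength: undefined };

  // Without a numeric Content-Length the document size is unknown.
  const length = parseInt(getHeader("Content-Length"), 10);
  if (!Number.isInteger(length)) {
    return result;
  }
  result.suggestedLength = length;

  // Assumption: very small files are not worth fetching in chunks.
  if (length <= 2 * rangeChunkSize) {
    return result;
  }
  if (disableRange || !isHttp) {
    return result;
  }
  // The server has to state explicitly that byte ranges are accepted.
  if (getHeader("Accept-Ranges") !== "bytes") {
    return result;
  }
  // A compressed body makes byte offsets meaningless for the PDF data.
  const encoding = getHeader("Content-Encoding") || "identity";
  if (encoding !== "identity") {
    return result;
  }

  result.allowRangeRequests = true;
  return result;
}

// Usage with a plain object standing in for real response headers.
const headers = { "Content-Length": "1000000", "Accept-Ranges": "bytes" };
const support = checkRangeSupport({
  getHeader: (name) => headers[name] ?? null,
  isHttp: true,
  rangeChunkSize: 65536,
  disableRange: false,
});
console.log(support.allowRangeRequests, support.suggestedLength);
```

A misconfigured server fails one of these checks, which is why a single working example does not show that the check can be skipped in general.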
I found that the following is not mandatory on my computer. I skip this part with an if: pdf.js/src/display/fetch_stream.js Lines 119 to 159 in dd893d5
And an else condition |
I don't really understand what #11624 (comment) even suggests removing/skipping, but it's obviously not appropriate to simply remove a Furthermore, as was clearly mentioned in #11624 (comment) and is hopefully evident if you carefully read through the
It's hopefully easy to see why a single example doesn't really mean anything in general, and why you cannot assume that random servers on the internet are always correctly configured.
It's not really clear what you're suggesting here, but it's obviously not a good idea to purposely add options that would allow users to essentially "shoot themselves in the foot" by disabling validation which is necessary for the correct function of the PDF.js library. All in all, I'm really not seeing anything that can be fixed here (from a PDF.js perspective) and would thus suggest that the issue should be closed. |
@Snuffleupagus I totally agree with you. I am looking for a specific solution for a specific server setting to cache this preflight 200 and be able to load it offline! |
Given that caching of range requests is not well supported across browsers, have you instead considered not depending so heavily on that and simply loading the PDF document with default values for the relevant |
@Snuffleupagus In fact, I have successfully tested offline with #11624 (comment) for the following document with range requests activated: https://public.fays.io/public/2ce2766d-b491-43a0-a0a9-96c56c2bd667.pdf 💯 I tried with the PDF.JS integrated viewer. When one turns off Wi-Fi and refreshes the tab 10 times, it still loads the document and one may consult the previously requested pages. 🥇 Still, I have not yet managed to make it work when one shuts down the browser. 👎 One may try to use a production environment and see what happens with the ServiceWorker. Of course, as you mentioned, this is not a general solution at all. Just a try. This feature cannot be the default for every PDF.JS user. But I found out that the first request (code 200) is not mandatory because the preflight CORS check is also performed by the browser if the request has a "range" header (for all browsers; this is in the specs, who knows?). As you said:
The first request is mandatory when one does not know if the server accepts ranges and to get the size of the document, this is specified on Still, the first request is optional if the user can ensure by themselves that But all of this requires more investigation 🔢 💯 Therefore, it would be nice to have a way to override this part easily outside PDF.JS, but I don't know how to do it properly. Can one extend PDFFetchStreamReader? 💯
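For context on the document-size point: under standard HTTP semantics, a partial (206) response carries a Content-Range header such as `bytes 0-65535/146515`, whose last field is the total size. The sketch below only illustrates that header format. The helper name `parseContentRange` is hypothetical, and nothing here is taken from PDF.js or suggests that PDF.js obtains the size this way.

```javascript
// Hypothetical helper: extract start, end and total size from a
// Content-Range header value of the form "bytes <start>-<end>/<total>".
// Returns null when the value does not match that form.
function parseContentRange(value) {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value.trim());
  if (!match) {
    return null;
  }
  return {
    start: Number(match[1]),
    end: Number(match[2]),
    // "*" means the server did not disclose the total size.
    total: match[3] === "*" ? null : Number(match[3]),
  };
}

console.log(parseContentRange("bytes 0-65535/146515"));
```

This only covers the size; it does not replace the other checks on the server's range support discussed in this thread.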
I confirm that offline already works when disableStream and disableAutoFetch are not set. 👍
In fact, it works well for range requests on Safari, Chrome, and Opera (including mobile?). I have not tested it on Edge or other alternatives, and there is a plan to implement the feature in Firefox. |
You want to trust the server, rather than the user here[1], to avoid a support nightmare later on. Besides, it's really not at all correct to make the changes you're suggesting since it'd make
No, that's considered internal functionality and it's consequently not exposed in the public API. (You'd need to fork the code, but as I've tried to explain multiple times the changes you're suggesting aren't really a good idea.) The "correct" way of handling special cases, with regards to fetching of data, would be to utilize This issue really ought to remain closed. /cc @timvandermeij [1] Previous experiences have shown that users may, more often than you'd like, set certain options without always fully understanding the ramifications of doing so. |
Thank you for this @Snuffleupagus |
Yes, it's part of the public API; see also Line 177 in 1240048
Closing as answered since there is not really anything we can/should do here. |
Dear community,
When one activates range requests with disableAutoFetch: true and disableStream: true, and the HTTP server is configured to cache contents with public, max-age=31536000, immutable, browser caching does work on Chrome and on Safari. But the first range request returns a code 200 and is not cached by the browser until all the ranges have been downloaded.
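The setup described above can be sketched as follows. The option names `url`, `disableAutoFetch` and `disableStream` are the PDF.js `getDocument` parameters named in this report; the helpers `buildRangeOnlyParams` and `parseCacheControl` are hypothetical and exist only for this illustration, as is the idea of inspecting the Cache-Control header by hand.

```javascript
// Hypothetical helper: parameters that force on-demand range requests,
// as described in the report. They would be passed to pdfjsLib.getDocument().
function buildRangeOnlyParams(url) {
  return {
    url,
    disableAutoFetch: true, // do not pre-fetch the rest of the document
    disableStream: true, // do not stream the initial response body
  };
}

// Hypothetical helper: split a Cache-Control header value into directives.
// Valueless directives map to true; purely numeric values become numbers.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") {
      continue;
    }
    const eq = trimmed.indexOf("=");
    if (eq === -1) {
      directives[trimmed.toLowerCase()] = true;
    } else {
      const name = trimmed.slice(0, eq).trim().toLowerCase();
      const raw = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, "");
      directives[name] = /^\d+$/.test(raw) ? Number(raw) : raw;
    }
  }
  return directives;
}

const params = buildRangeOnlyParams(
  "https://public.fays.io/public/580af700-5f79-44c4-8a0c-f7e976ee528f.pdf"
);
const cacheControl = parseCacheControl("public, max-age=31536000, immutable");
console.log(params, cacheControl);
```

With these directives the server allows long-lived caching, so whether a given response ends up in the cache depends on how the browser handles it.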
Attach (recommended) or Link to PDF file here: not file specific, but one can use https://public.fays.io/public/580af700-5f79-44c4-8a0c-f7e976ee528f.pdf
The server configuration is:
Configuration:
Steps to reproduce the problem:
What is the expected behavior?
The expected behavior is that the browser caches all the HTTP requests, including the first one with response code 200.
What went wrong?
After step 3, one can see in the developer tools that the range requests have been cached but not the first request.
The first request has specific header for CORS:
The request is not cached by the browser since PDF.JS cancels the first request.
pdf.js/src/display/fetch_stream.js
Lines 155 to 157 in 93aa613
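The cancellation mentioned above can be pictured roughly like this. It is a minimal sketch under stated assumptions, not the referenced PDF.js lines: the `state` object, its field names, and the function `cancelIfRangeOnly` are hypothetical, and the standard `AbortController` stands in for whatever cancellation mechanism the library uses.

```javascript
// Hypothetical sketch: once the headers of the initial response show that
// range requests are usable, and streaming is disabled, stop reading the
// body of that initial request.
function cancelIfRangeOnly(state, controller) {
  if (!state.isStreamingSupported && state.isRangeSupported) {
    controller.abort(); // the initial response body is never fully read
    return true;
  }
  return false;
}

// Usage: the controller's signal would be handed to fetch(url, { signal }).
const controller = new AbortController();
const cancelled = cancelIfRangeOnly(
  { isStreamingSupported: false, isRangeSupported: true },
  controller
);
console.log(cancelled, controller.signal.aborted);
```

Since the body of the aborted request is never received in full, the report attributes the missing cache entry for the first request to this step.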
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
https://app.fays.io/bf205bf6-43d2-4140-8140-ab4b6f48c2a5
Best, A.