Downloading large files using range requests is slow #5123
When loading files from a server that supports range requests, PDF.js seems to load the entire file in ~64 KB chunks, using parallel connections to the server.

However, this is often very slow: each small 64 KB request incurs overhead, and making many of them at once overloads the connection.

For example, loading http://mozilla.github.io/pdf.js/web/viewer.html?file=https://d2tkmshiozsr4v.cloudfront.net/documents/files/000/041/350/original/ad41798557504b40d26e11b4f059c0945bb5ca14/750614main_NASA_FY_2014_Budget_Estimates-508.pdf?1401953041 is quite slow, and downloading that file with wget (a single sequential request) is faster.

I would propose scaling the chunk size with the size of the file: for a 32 MB file like the one above, a chunk size of 1-2 MB would probably be much more reasonable. I'm considering submitting a PR for this; would it be of interest?
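For context on where the overhead comes from, the sketch below shows what a single 64 KB chunk fetch amounts to. It is purely illustrative: fetchChunk and the use of fetch() are my own shorthand, not how PDF.js actually issues its requests.

```js
// Illustrative only (not PDF.js code): one 64 KB range request.
// PDF.js issues many of these in parallel, one per chunk, so every
// chunk pays the full per-request overhead.
const CHUNK_SIZE = 65536; // 64 KB

async function fetchChunk(url, chunkIndex) {
  const begin = chunkIndex * CHUNK_SIZE;
  const end = begin + CHUNK_SIZE - 1; // Range header bounds are inclusive
  const response = await fetch(url, {
    headers: { Range: `bytes=${begin}-${end}` },
  });
  // A server that supports range requests answers with 206 Partial Content.
  if (response.status !== 206) {
    throw new Error(`expected 206, got ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```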
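And here is a minimal sketch of the scaling I am proposing. The names and thresholds (scaledChunkSize, TARGET_REQUESTS, MAX_CHUNK_SIZE) are assumptions of mine, not existing PDF.js settings:

```js
// Hypothetical sketch of the proposal: scale the range-request chunk
// size with the file size so large files need fewer, larger requests.
const MIN_CHUNK_SIZE = 65536;       // 64 KB, the current default
const MAX_CHUNK_SIZE = 2 * 1048576; // cap at 2 MB
const TARGET_REQUESTS = 32;         // aim for roughly this many requests

function scaledChunkSize(fileSize) {
  const size = Math.ceil(fileSize / TARGET_REQUESTS);
  return Math.min(MAX_CHUNK_SIZE, Math.max(MIN_CHUNK_SIZE, size));
}

// A 32 MB file gets 1 MB chunks instead of 64 KB ones:
console.log(scaledChunkSize(32 * 1048576)); // 1048576
```

That would cut the ~512 requests for a 32 MB file down to about 32.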
Comments

We have had some discussion about this before, and #4739 was agreed to be a good solution. However, feel free to propose a patch for this so the developers can try both options.
I've done some more investigation of this, and it looks like requestChunks in ChunkedStreamManager is being called only for single chunks (by requestRange, which in turn is called by walk()), which means the grouping code in there is inoperative: it can only group chunk requests together if they come in at the same time. I'm not sure if that's the expected behavior here.

I've experimented with this, and increasing the chunk size certainly decreases loading time. The trade-off is increased time to render the first page, and other pages render later while scrolling during the load. The reduced number of requests also seems to improve general browser performance; I guess browsers don't cope well with hundreds of simultaneous range requests.

So simply raising the chunk size is probably not the best solution. I've increased RANGE_CHUNK_SIZE to 512 KB for our build at the moment, but if it's possible to keep the chunk size small while removing the performance issue, that would be better.
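To make the grouping behavior concrete, here is a rough sketch of that kind of coalescing. This is not the actual ChunkedStreamManager code; groupChunks and CHUNK_SIZE here are illustrative:

```js
// Hypothetical sketch: coalesce requested chunk indices into
// contiguous [beginChunk, endChunk) groups. If requestRange() only
// ever asks for one chunk at a time, every group has length 1, so
// each chunk still becomes its own HTTP range request.
const CHUNK_SIZE = 65536; // 64 KB

function groupChunks(chunkIndices) {
  const sorted = [...new Set(chunkIndices)].sort((a, b) => a - b);
  const groups = [];
  for (const chunk of sorted) {
    const last = groups[groups.length - 1];
    if (last && chunk === last.endChunk) {
      last.endChunk = chunk + 1; // extend the current contiguous run
    } else {
      groups.push({ beginChunk: chunk, endChunk: chunk + 1 });
    }
  }
  return groups;
}

// One request per group instead of one per chunk:
for (const { beginChunk, endChunk } of groupChunks([3, 4, 5, 9])) {
  const begin = beginChunk * CHUNK_SIZE;
  const end = endChunk * CHUNK_SIZE; // exclusive
  console.log(`Range: bytes=${begin}-${end - 1}`);
}
```

With single-chunk calls the coalescing never kicks in, which matches what I'm seeing.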
Is this still a problem?
Yeah, this is still an issue as far as I know. I'll retest with the latest.
@jordan-thoms I think that the question was related to the recently landed PR #5263, which should make PDF.js less reliant on range requests.
Closing as fixed for now. If the problem remains, please let us know or open a new issue. |