-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace the XRef.cache
Array with a Map instead
#11076
Replace the XRef.cache
Array with a Map instead
#11076
Conversation
The relevant methods are usually not hot enough for these changes to have an easily measurable effect, however there's been a lot of other cases where similiar inlining has helped performance. (And these changes may help offset the changes made in the next patch.)
Given that the different types of `Stream`s will never be cached, this thus implies that the `XRef.cache` Array will *always* be more-or-less sparse. Generally speaking, the longer the document the more sparse the `XRef.cache` will thus become. For example, looking at the `pdf.pdf` file from the test-suite: The length of the `XRef.cache` Array will be a few hundred thousand elements, with approximately 95% of them being empty. Hence it seems pretty clear that an Array isn't really the best data-structure for this kind of cache, and this patch thus changes it to a Map instead. This patch-series was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file: ``` [ { "id": "issue2618", "file": "../web/pdfs/issue2618.pdf", "md5": "", "rounds": 200, "type": "eq" } ] ``` which gave the following results when comparing this patch-series against the `master` branch: ``` -- Grouped By browser, stat -- browser | stat | Count | Baseline(ms) | Current(ms) | +/- | % | Result(P<.05) ------- | ------------ | ----- | ------------ | ----------- | --- | ----- | ------------- Firefox | Overall | 200 | 2736 | 2736 | 1 | 0.02 | Firefox | Page Request | 200 | 2 | 2 | 0 | -8.26 | faster Firefox | Rendering | 200 | 2733 | 2734 | 1 | 0.03 | ```
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.215.176.217:8877/d177e255d8be3ae/output.txt |
From: Bot.io (Linux m4)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/7f47b0cddab23cb/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/7f47b0cddab23cb/output.txt Total script time: 17.43 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/d177e255d8be3ae/output.txt Total script time: 26.23 mins
|
That's a really good find. Thank you! |
Given that the different types of
Stream
s will never be cached, this thus implies that theXRef.cache
Array will always be more-or-less sparse.Generally speaking, the longer the document the more sparse the
XRef.cache
will thus become. For example, looking at thepdf.pdf
file from the test-suite: The length of theXRef.cache
Array will be a few hundred thousand elements, with approximately 95% of them being empty.Hence it seems pretty clear that an Array isn't really the best data-structure for this kind of cache, and this patch thus changes it to a Map instead.
This patch-series was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
which gave the following results when comparing this patch-series against the
master
branch: