Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline the isString check in the Parser.getObj method #11070

Merged
merged 2 commits into from
Aug 16, 2019

Conversation

Snuffleupagus
Copy link
Collaborator

For very large and complex PDF files this will help performance slightly, since Parser.getObj is called a lot during parsing in the worker.

This patch was tested using the PDF file from issue #2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:

[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 200,
       "type": "eq"
    }
]

which gave the following results when comparing this patch against the master branch:

-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
Firefox | Overall      |   200 |         2847 |        2830 | -17 | -0.60 |        faster
Firefox | Page Request |   200 |            2 |           2 |   0 | -7.14 |
Firefox | Rendering    |   200 |         2844 |        2827 | -17 | -0.60 |        faster

For very large and complex PDF files this will help performance *slightly*, since `Parser.getObj` is called *a lot* during parsing in the worker.

This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 200,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
Firefox | Overall      |   200 |         2847 |        2830 | -17 | -0.60 |        faster
Firefox | Page Request |   200 |            2 |           2 |   0 | -7.14 |
Firefox | Rendering    |   200 |         2844 |        2827 | -17 | -0.60 |        faster
```
@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/0e8274554a029dc/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/f0f2f0f48f0ce8a/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/0e8274554a029dc/output.txt

Total script time: 17.50 mins

  • Font tests: Passed
  • Unit tests: FAILED
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/f0f2f0f48f0ce8a/output.txt

Total script time: 25.98 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

This avoids allocating approximately 1.7 million short-lived variables when loading the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, in the default viewer.
@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/6bd962a1c36fcf7/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/49c9192de34ae57/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/49c9192de34ae57/output.txt

Total script time: 17.46 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/6bd962a1c36fcf7/output.txt

Total script time: 26.15 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij timvandermeij merged commit 20181b6 into mozilla:master Aug 16, 2019
@timvandermeij
Copy link
Contributor

Looks good!

@Snuffleupagus Snuffleupagus deleted the Parser-getObj-rm-isString branch August 16, 2019 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants