Prevent errors, because of incorrect scope, in the XMLParserBase._resolveEntities method (issue 10407) #10408

Merged (1 commit) on Jan 4, 2019

@Snuffleupagus
Collaborator

Snuffleupagus commented Jan 3, 2019

Now with a unit-test, courtesy of #10408 (comment).

Fixes #10407.

@timvandermeij
Contributor

I looked into this and managed to create a unit test:

+  it('should resolve entities correctly (issue 10407)', function() {
+    const data = '<x:xmpmeta xmlns:x=\'adobe:ns:meta/\'>' +
+      '<rdf:RDF xmlns:rdf=\'http://www.w3.org/1999/02/22-rdf-syntax-ns#\'>' +
+      '<rdf:Description xmlns:dc=\'http://purl.org/dc/elements/1.1/\'>' +
+      '<dc:title><rdf:Alt><rdf:li xml:lang="x-default">&apos;Foo bar baz&apos;</rdf:li>' +
+      '</rdf:Alt></dc:title></rdf:Description></rdf:RDF></x:xmpmeta>';
+    const metadata = new Metadata(data);
+
+    expect(metadata.has('dc:title')).toBeTruthy();
+    expect(metadata.has('dc:qux')).toBeFalsy();
+
+    expect(metadata.get('dc:title')).toEqual('\'Foo bar baz\'');
+    expect(metadata.get('dc:qux')).toEqual(null);
+
+    expect(metadata.getAll()).toEqual({ 'dc:title': '\'Foo bar baz\'', });
+  });

It mirrors the first unit test in the file, except that the title now contains &apos;, which needs to be resolved. I confirmed that the test fails without your fix and passes with it, but please check as well; if it works for you too, let's add it to this PR.
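
For readers unfamiliar with this kind of scope error, here is a minimal sketch of how it can arise. It assumes the failure comes from a `String.prototype.replace` callback that loses `this`. The class `EntitySketch` and its methods `resolveBroken`, `resolveFixed`, and `onResolveEntity` are hypothetical names chosen for illustration and are not the pdf.js source.

```javascript
// Hypothetical sketch; the names here are illustrative, not the pdf.js source.
class EntitySketch {
  // Fallback hook for entities that the switch below does not handle.
  // Resolving "apos" here is an assumption made for this example.
  onResolveEntity(name) {
    return name === "apos" ? "'" : `&${name};`;
  }

  // Broken variant: a regular function gets its own `this`, which is
  // `undefined` here because class bodies are always strict-mode code.
  resolveBroken(text) {
    return text.replace(/&([^;]+);/g, function (all, entity) {
      return this.onResolveEntity(entity); // throws TypeError when reached
    });
  }

  // Fixed variant: an arrow function inherits `this` from the method.
  resolveFixed(text) {
    return text.replace(/&([^;]+);/g, (all, entity) => {
      switch (entity) {
        case "lt": return "<";
        case "gt": return ">";
        case "amp": return "&";
        case "quot": return '"';
      }
      return this.onResolveEntity(entity);
    });
  }
}

const sketch = new EntitySketch();
const fixedResult = sketch.resolveFixed("&apos;Foo bar baz&apos;");

// Capture the error from the broken variant instead of letting it propagate.
let brokenError = null;
try {
  sketch.resolveBroken("&apos;Foo bar baz&apos;");
} catch (e) {
  brokenError = e;
}
console.log(fixedResult, brokenError instanceof TypeError);
```

In this sketch the fixed variant returns the title with literal apostrophes, which is what the unit test above expects from `metadata.get('dc:title')`, while the broken variant throws as soon as an entity reaches the fallback call.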

@dhollenbeck

Sorry about not providing a test pdf on the original issue. However, I would like to provide one now.
issue-10407.pdf

I was able to produce a test-case PDF file with all of the sensitive data stripped out by using the following steps:

  • pdftk ./source.pdf burst
  • qpdf.exe pg_0001.pdf --pages qpdf-manual.pdf 1 -- out.pdf

@timvandermeij
Contributor

Thank you for providing this, @dhollenbeck! It will definitely help us to verify the fix with the actual bad metadata.

@Snuffleupagus
Collaborator Author

/botio test

@pdfjsbot

pdfjsbot commented Jan 4, 2019

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/aa696ea48a2f450/output.txt

@pdfjsbot

pdfjsbot commented Jan 4, 2019

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/7b2b43618af4d28/output.txt

@pdfjsbot

pdfjsbot commented Jan 4, 2019

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/aa696ea48a2f450/output.txt

Total script time: 17.59 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot

pdfjsbot commented Jan 4, 2019

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/7b2b43618af4d28/output.txt

Total script time: 23.37 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij timvandermeij merged commit b39ec7a into mozilla:master Jan 4, 2019
@timvandermeij
Contributor

Nice work!

@Snuffleupagus Snuffleupagus deleted the issue-10407 branch January 5, 2019 10:12