XMP: Import of PDF with XMP data does not show XMP #338

Closed
koppor opened this issue Apr 12, 2018 · 6 comments
Labels
component: xmp Issues concerning the XMP PDF metadata

Comments

@koppor
Member

koppor commented Apr 12, 2018

Test PDF: https://github.com/JabRef/jabref/blob/feature/xmp-provement/src/test/resources/pdfs/KoppAZ-MADR-ZEUS-2018/KoppAZ-MADR-ZEUS-2018.pdf

Drag and drop onto the main table results in the following dialog:

[screenshot]

When XMP data is found, the dialog reads as follows:

[screenshot]

@koppor koppor added the component: xmp Issues concerning the XMP PDF metadata label Apr 12, 2018
@johannes-manner

johannes-manner commented Apr 13, 2018

The problem is that the XMP metadata section is faulty:
[screenshot]
See lines 9, 10 and lines 64, 65: if you delete lines 9 and 65, the XMP import works fine :)

How did it happen that the ?xpacket processing instruction appears twice in the metadata section? The org.apache.xmpbox.xml.DomXmpParser throws an exception: "xmp should end after xpacket end processing instruction". Maybe you can give me more context :)
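
To make the diagnosis above concrete, here is a minimal sketch of how one might check an XMP packet for duplicated xpacket processing instructions before parsing it. The class and method names (`XpacketCheck`, `count`, `hasDuplicatedXpacket`) are hypothetical and not part of JabRef or xmpbox; the check is a plain regular expression over the packet text, assuming the packet is already available as a string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not JabRef or xmpbox code.
final class XpacketCheck {
    // Matches an xpacket processing instruction, e.g. <?xpacket end="w"?>,
    // and captures whether it is a "begin" or an "end" instruction.
    private static final Pattern XPACKET =
            Pattern.compile("<\\?xpacket\\s+(begin|end)\\b[^?]*\\?>");

    /** Counts the xpacket instructions of the given kind ("begin" or "end"). */
    static int count(String xmp, String kind) {
        Matcher matcher = XPACKET.matcher(xmp);
        int hits = 0;
        while (matcher.find()) {
            if (matcher.group(1).equals(kind)) {
                hits++;
            }
        }
        return hits;
    }

    /** True if the packet contains more than one begin or more than one end instruction. */
    static boolean hasDuplicatedXpacket(String xmp) {
        return count(xmp, "begin") > 1 || count(xmp, "end") > 1;
    }
}
```

For the test PDF in this issue, such a check would be expected to report two begin and two end instructions, which matches the lines pointed out in the screenshot.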

@koppor
Member Author

koppor commented Apr 13, 2018

I ran it through the validator, and it is OK:

[screenshot]

OK, I see. At XMP export, can JabRef remove

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>

and

<?xpacket end="w"?>

?

The result is now:

@inproceedings{KoppEtAl-MADR2018,
  author = {Oliver Kopp and Anita Armbruster and Olaf Zimmermann},
  booktitle = {Proceedings of the 10th Central European Workshop on Services and their Composition ({ZEUS} 2018)},
  keywords = {ADR, MADR, architecture decision records, architectural decision records, Nygard},
  month = {#jan#},
  pages = {55--62},
  publisher = {CEUR-WS.org},
  series = {{CEUR} Workshop Proceedings},
  title = {Markdown Architectural Decision Records: Format and Tool Support},
  url = {https://adr.github.io/madr/},
  volume = {2072},
  year = {2018}
}

So, it basically looks great, but the month should be displayed as month = jan. I do not know where this comes from.
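
As a rough illustration of the expected output, the sketch below renders a field whose value has the form `#name#` as a bare BibTeX string reference (`month = jan`) and braces everything else. This assumes that `#jan#` is an internal marker for a string reference, which this thread does not confirm; `MonthFieldFormat` and `formatField` are hypothetical names, not JabRef API.

```java
// Hypothetical helper, not JabRef code.
final class MonthFieldFormat {
    /**
     * Renders one BibTeX field. A value of the form #name# (exactly two '#',
     * at the start and at the end) is assumed to be a string reference and is
     * written without braces; any other value is wrapped in braces.
     */
    static String formatField(String name, String value) {
        boolean isStringReference = value.length() > 2
                && value.startsWith("#")
                && value.indexOf('#', 1) == value.length() - 1;
        if (isStringReference) {
            return name + " = " + value.substring(1, value.length() - 1);
        }
        return name + " = {" + value + "}";
    }
}
```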

koppor added a commit to JabRef/jabref that referenced this issue Apr 13, 2018
@johannes-manner

johannes-manner commented Apr 18, 2018

Is it necessary to remove the duplicated begin tags?

If I write the XMP data with JabRef, there are no duplicated entries. Otherwise I would have to modify the InputStream before the DOM XMP parsing happens, because I have no control over the org.apache.xmpbox.xml.DomXmpParser.DomXmpParser.

A workaround would be to catch the exception and return an empty bib entry with a comment saying that the XMP is not valid, or to show another error message?
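
The first option, modifying the input before parsing, could look roughly like the sketch below. It works on the packet text and keeps a single begin/end pair, dropping any further xpacket instructions; the cleaned text would then be converted back to bytes and handed to the parser. `XpacketCleaner` and its methods are hypothetical names, and this is only one possible way to pre-process the stream, not what JabRef actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not JabRef or xmpbox code.
final class XpacketCleaner {
    // A begin instruction plus the whitespace that follows it.
    private static final Pattern BEGIN =
            Pattern.compile("<\\?xpacket\\s+begin\\b[^?]*\\?>\\s*");
    // An end instruction plus the whitespace that precedes it.
    private static final Pattern END =
            Pattern.compile("\\s*<\\?xpacket\\s+end\\b[^?]*\\?>");

    /** Keeps the first begin and the last end instruction and removes all others. */
    static String stripDuplicatedXpacket(String xmp) {
        String withoutExtraBegins = removeAllButOne(xmp, BEGIN, true);
        return removeAllButOne(withoutExtraBegins, END, false);
    }

    // Removes every match of the pattern except the first (keepFirst) or the last one.
    private static String removeAllButOne(String text, Pattern pattern, boolean keepFirst) {
        List<int[]> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(new int[] {matcher.start(), matcher.end()});
        }
        if (hits.size() <= 1) {
            return text; // nothing duplicated, leave the packet untouched
        }
        int keep = keepFirst ? 0 : hits.size() - 1;
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int i = 0; i < hits.size(); i++) {
            if (i == keep) {
                continue; // the kept match is copied along with the surrounding text
            }
            out.append(text, pos, hits.get(i)[0]);
            pos = hits.get(i)[1];
        }
        out.append(text, pos, text.length());
        return out.toString();
    }
}
```

A packet that already has exactly one begin and one end instruction is returned unchanged, so the step would be safe to apply to every import.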

I will check the month problem!

@koppor
Member Author

koppor commented Apr 18, 2018 via email

@johannes-manner

Okay, now I understand the problem. The exported XMP should not contain the begin tag, as mentioned in the xmpincl package.

I have reproduced your problem now. Normally, we don't need this begin XML processing instruction!

@koppor
Member Author

koppor commented Apr 24, 2018

Fixed by JabRef#3964

@koppor koppor closed this as completed Apr 24, 2018