I am using GROBID to convert bioRxiv preprint PDFs to XML and am finding that paragraph content under the heading 'Funding' is not being captured in the TEI XML output (version 0.7.0, lightweight Docker image).
The 'Funding' section in this document appears immediately after the 'Acknowledgements' on page 19 of the PDF. It is captured in the TEI XML as @type="annex", but the <p> element and its text content are missing:
Although recognition of this section with some @type="funding" attribute would be the ideal scenario, for my use case I simply need the associated <p> content (i.e. "This work was funded by...") to be output in the TEI XML.
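For anyone reproducing this, the following is a minimal sketch of how the missing content could be detected in the output. It assumes the TEI namespace `http://www.tei-c.org/ns/1.0` and that the heading appears as a `<head>` child of a `<div>`; the function name `funding_paragraphs` and the `SAMPLE` document are hypothetical and hand-written for illustration, not actual GROBID output.

```python
import xml.etree.ElementTree as ET

# TEI namespace (assumed to be the one used in the output under test).
TEI = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI}


def funding_paragraphs(tei_xml, heading="Funding"):
    """For each <div> whose direct <head> matches `heading`,
    return the list of texts of its direct <p> children."""
    root = ET.fromstring(tei_xml)
    results = []
    for div in root.iter(f"{{{TEI}}}div"):
        head = div.find("tei:head", NS)
        if head is None:
            continue
        title = "".join(head.itertext()).strip().lower()
        if title != heading.lower():
            continue
        paragraphs = [
            "".join(p.itertext()).strip() for p in div.findall("tei:p", NS)
        ]
        results.append(paragraphs)
    return results


# Hypothetical, hand-written fragment mimicking the reported symptom:
# a 'Funding' heading inside an annex div with no <p> content.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><back>'
    '<div type="annex"><div><head>Funding</head></div></div>'
    "</back></text></TEI>"
)

for paragraphs in funding_paragraphs(SAMPLE):
    if not paragraphs:
        print("Funding heading found, but no <p> content")
```

An empty inner list signals the problem described here: the heading was recognised but its paragraph text was dropped.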
Many thanks in advance for looking into this issue.
Example PDF:
2021.09.27.461862v1.full.pdf