Feature Request: Funding Information #652

Closed
de-code opened this issue Oct 15, 2020 · 1 comment · Fixed by #959

de-code (Collaborator) commented Oct 15, 2020

It would be good to be able to extract funding information.

Unfortunately, the bioRxiv XML itself doesn't contain specific annotations for it; currently, funding statements are mostly just regular back sections.
We would also like to extract things like RRIDs.
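
As a rough illustration of the RRID part, a regular expression can pick out candidate identifiers from free text. This is only a sketch: the pattern is an assumption based on commonly seen forms (`RRID:` followed by a short registry prefix and an identifier), the function name `find_rrids` is hypothetical, and the sample sentence and its identifiers are made-up placeholders, not taken from any of the papers here.

```python
import re

# Assumed pattern: "RRID:" + letters (registry prefix) + "_" or ":" + identifier.
# This is a heuristic for illustration, not an official RRID grammar.
RRID_RE = re.compile(r"\bRRID:\s*([A-Za-z]+[_:][A-Za-z0-9_:-]+)")


def find_rrids(text):
    """Return the identifier part of every RRID-like mention in the text."""
    return RRID_RE.findall(text)


# Placeholder sentence with made-up identifiers.
sample = (
    "Cells were stained with an antibody (RRID:AB_0000001) "
    "and analysed with software (RRID:SCR_0000002)."
)
print(find_rrids(sample))
```

A pattern like this would only find explicitly tagged RRIDs; resources mentioned without the `RRID:` prefix would need a different approach.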

Funding-related information seems to be mostly contained near the acknowledgements, e.g.:

183988v1 (10.1101/183988)

<ack>
    <title>ACKNOWLEDGEMENTS</title>
    <p>We would like to thank Dr. Maria Bouvy-Liivrand for help with establishing the ST2 cell culture and differentiation and EMBL Gene Core at Heidelberg for support with high-throughput sequencing. The experiments presented in this paper were carried out using the HPC facilities of the University of Luxembourg (<xref ref-type="bibr" rid="c73">Varrette et al., 2014</xref>).</p>
</ack>
<sec id="s5" sec-type="COI-statement">
    <title>COMPETING INTERESTS</title>
    <p>The authors declare they have no competing interests.</p>
</sec>
<sec id="s6">
    <title>FUNDING</title>
    <p>This work was supported by funding from the University of Luxembourg. DG was supported by fellowship from the Luxembourg National Research Fund (FNR) (AFR 7924045).</p>
</sec>
<sec id="s7" sec-type="con">
    <title>AUTHOR CONTRIBUTIONS</title>
    <p>DG, TS and LS conceived the project and designed the experiments and analysis. DG performed all the experiments and DG and LS analyzed the results. DG and AG performed the RNA-seq and ChIP-seq analysis. FS and MHS developed the EPIC-DREM approach. DG, FS and MHS performed the EPIC-DREM analysis. MS performed the Western blot analysis. RH prepared the libraries and performed the sequencing for the AHR-KD experiments. PE developed the randomization method to derive control footprint regions. All authors commented on the manuscript.</p>
</sec>
GROBID 0.6.1 XML
<div type="acknowledgement">
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>ACKNOWLEDGEMENTS</head>
        <p>We would like to thank Dr. Maria Bouvy-Liivrand for help with establishing the ST2 cell culture and differentiation and EMBL Gene Core at Heidelberg for support with high-throughput sequencing. The experiments presented in this paper were carried out using the HPC facilities of the University of Luxembourg 
            <ref type="bibr" target="#b84">(Varrette et al., 2014)</ref>.
        </p>
    </div>
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>COMPETING INTERESTS</head>
        <p>The authors declare they have no competing interests.</p>
    </div>
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>FUNDING</head>
        <p>This work was supported by funding from the University of Luxembourg. DG was supported by fellowship from the Luxembourg National Research Fund (FNR) (AFR 7924045).</p>
    </div>
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>AUTHOR CONTRIBUTIONS</head>
        <p>DG, TS and LS conceived the project and designed the experiments and analysis. DG performed all the experiments and DG and LS analyzed the results. DG and AG</p>
    </div>
</div>

404632v1 (10.1101/404632)

<ack>
    <title>ACKNOWLEDGEMENTS</title>
    <p>The authors would like to acknowledge the staff of State Central Laboratories (LACENs) and the technical assistance given by the laboratory personnel at the Se&#x00E7;&#x00E3;o de Virologia of the Instituto Evandro Chagas. The authors are also thankful to the children/mothers who agreed to participate in this study as volunteers and permitted the analysis of their relevant biological material.</p>
</ack>
<sec id="s5">
    <title>AUTHOR&#x2019;S CONTRIBUTION</title>
    <p>ABFL, KCP, JFC: performed the experiments. PSL, SFSG, DAMB, RSB: carried out molecular and the phylogenetic analyses. LSS, ABFL: analysed, interpreted data and drafted the manuscript. LSS, JDPM: provided critical review of manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec id="s6">
    <title>FINANCIAL SUPPORT</title>
    <p>This research received no specific grant from any funding agency, commercial or not-for-profit sectors.</p>
</sec>
<sec id="s7">
    <title>CONFLICT OF INTEREST</title>
    <p>None.</p>
</sec>
GROBID 0.6.1 XML
<div type="acknowledgement">
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <p>All authors read and approved the final manuscript.</p>
    </div>
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>FINANCIAL SUPPORT</head>
        <p>This research received no specific grant from any funding agency, commercial or not-for-profit sectors.</p>
    </div>
</div>

462929v1 (10.1101/462929)

<ack>
    <title>Acknowledgment</title>
    <p>We kindly thank Dr. Daniel Stange and Dr. Sch&#x00F6;lch for providing the CRC-like organoids and to the whole Department of Gastrointestinal, Thoracic and Vascular Surgery at University Hospital Carl Gustav Carus for its continuous support. The research stay on this lab was kindly financed with the assistance of a Boehringer Ingelheim Fonds travel grant.</p>
</ack>
<sec sec-type="COI-statement">
    <title>Conflict of interest statement</title>
    <p>Authors declare no potential conflict of interest.</p>
</sec>
<sec>
    <title>Source of funding</title>
    <p>This work was supported by Ministerio de Econom&#x00ED;a y Competitividad del Gobierno de Espa&#x00F1;a (MINECO, Plan Nacional I&#x002B;D&#x002B;i AGL2016-76736-C3), Gobierno regional de la Comunidad de Madrid (P2013/ABI-2728, ALIBIRD-CM) and EU Structural Funds.</p>
</sec>
GROBID 0.6.1 XML
<div type="acknowledgement">
    <div
        xmlns="http://www.tei-c.org/ns/1.0">
        <head>Acknowledgment</head>
    </div>
</div>

GROBID generally seems to group these under the acknowledgement div, as separate sub-sections. For the third example, however, it extracted only the heading and none of the text.
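
Given that grouping, one possible stop-gap is to pick out the funding sub-sections from the GROBID output by their heading. The following is a minimal sketch, assuming the TEI structure shown in the examples above; the heading keywords and the function name `find_funding_divs` are assumptions made for illustration, not part of GROBID.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Assumption: keywords taken from the headings seen in the three examples
# ("FUNDING", "FINANCIAL SUPPORT", "Source of funding").
FUNDING_HEAD_RE = re.compile(r"\b(funding|financial support)\b", re.IGNORECASE)


def find_funding_divs(tei_fragment):
    """Return (heading, text) pairs for divs whose <head> looks funding-related."""
    root = ET.fromstring(tei_fragment)
    results = []
    for div in root.iter(f"{TEI_NS}div"):
        head = div.find(f"{TEI_NS}head")  # direct child only
        if head is None or not FUNDING_HEAD_RE.search(head.text or ""):
            continue
        # Flatten each paragraph (including inline <ref> text) and
        # normalise whitespace.
        paragraphs = [
            " ".join("".join(p.itertext()).split())
            for p in div.findall(f"{TEI_NS}p")
        ]
        results.append((head.text.strip(), " ".join(paragraphs)))
    return results


# Shortened version of the first GROBID example above.
SAMPLE = """<div type="acknowledgement">
    <div xmlns="http://www.tei-c.org/ns/1.0">
        <head>COMPETING INTERESTS</head>
        <p>The authors declare they have no competing interests.</p>
    </div>
    <div xmlns="http://www.tei-c.org/ns/1.0">
        <head>FUNDING</head>
        <p>This work was supported by funding from the University of Luxembourg.</p>
    </div>
</div>"""

print(find_funding_divs(SAMPLE))
```

This only helps where GROBID has kept both the heading and the paragraph; for a case like the third example, where the text is missing, there is nothing to match.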

Neither the bioRxiv XML nor GROBID seems to have specific annotations for funding.
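
Without such annotations, reference funding text would have to be derived from the bioRxiv XML by heuristics, for example by matching section titles. Below is a minimal sketch, assuming the sections sit under a single `<back>` element as in the examples above; the title keywords and the function name `find_funding_secs` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: title keywords drawn from the three examples above.
FUNDING_TITLE_RE = re.compile(r"\b(funding|financial support)\b", re.IGNORECASE)


def find_funding_secs(back_xml):
    """Return id, title and text of <sec> elements with a funding-like title."""
    root = ET.fromstring(back_xml)
    found = []
    for sec in root.iter("sec"):
        title = sec.find("title")
        title_text = "".join(title.itertext()).strip() if title is not None else ""
        if not FUNDING_TITLE_RE.search(title_text):
            continue
        # Join all paragraphs of the section with normalised whitespace.
        text = " ".join(
            " ".join("".join(p.itertext()).split()) for p in sec.findall("p")
        )
        found.append({"id": sec.get("id"), "title": title_text, "text": text})
    return found


# Shortened version of the second bioRxiv example, wrapped in <back>.
JATS_SAMPLE = """<back>
<sec id="s6">
    <title>FINANCIAL SUPPORT</title>
    <p>This research received no specific grant from any funding agency, commercial or not-for-profit sectors.</p>
</sec>
<sec id="s7">
    <title>CONFLICT OF INTEREST</title>
    <p>None.</p>
</sec>
</back>"""

print(find_funding_secs(JATS_SAMPLE))
```

A title match like this would miss funding mentioned inside the acknowledgements themselves, such as the travel grant in the third example.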

The GROBID training data for the header model does, however, contain examples for funding, e.g. in 12._10.1.1.56.103.training.header.tei.xml:

<note type="funding">This work was partially supported by NSA Grant MDA904-96-1-0111. 1</note>
kermitt2 (Owner) commented

Funding is indeed a frequent request!

As you indicate, the new training data for the header model has been labelled with <note type="funding">, and I think there's enough data to get quite good results. The funding note is parsed and stored, but it's simply not output yet in the final TEI. This one is really straightforward to add.

When the funding information is not in the header, the "funding" declaration is, for the moment, labelled in the segmentation model as an independent <div type="annex">. So it would need some effort to add, but there's a good base.
