created 200 scoping set for runaway climate #18

Open
petermr opened this issue Sep 20, 2019 · 2 comments

Comments

@petermr
Owner

petermr commented Sep 20, 2019

A quick search to see how many papers relate to runaway climate change or tipping points.

MacBook-Pro-3:climate pm286$ getpapers -q "((climate change) AND ((runaway) OR (feedback) OR (tipping)))" -k 500 -x -o runaway500
info: Searching using eupmc API
info: Found 9650 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.1 reported by api
info: Limiting to 500 hits
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Duplicate records found: 998 unique results identified
info: limiting hits
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Individual EUPMC result metadata records written
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt
info: Got XML URLs for 500 out of 500 results
info: Downloading fulltext XML files
Downloading files [==============----------------] 46% (232/500) [77.8s elapsed, eta 89.9]^C 

Stopped the download with Ctrl-C a little under halfway (the progress bar shows 46%); got 222 papers.
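
Because the download was interrupted, it can help to check which CTrees actually contain a full-text file before running ami. A minimal Python sketch, assuming getpapers wrote one `PMC*` subdirectory per article with a `fulltext.xml` inside; `count_fulltext` is a hypothetical helper, not part of getpapers or ami:

```python
from pathlib import Path


def count_fulltext(project_dir, filename="fulltext.xml"):
    """Return (with_fulltext, without_fulltext) lists of CTree names.

    Assumes each article sits in its own PMC* subdirectory of project_dir.
    """
    with_ft, without_ft = [], []
    for ctree in sorted(Path(project_dir).iterdir()):
        if not ctree.is_dir() or not ctree.name.startswith("PMC"):
            continue  # skip eupmc_results.json and other project-level files
        if (ctree / filename).is_file():
            with_ft.append(ctree.name)
        else:
            without_ft.append(ctree.name)
    return with_ft, without_ft
```

For example, `have, missing = count_fulltext("runaway222")` would list the CTrees that could be dropped or re-downloaded.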
ami-search

MacBook-Pro-3:climate pm286$ ami-search -p runaway222/ --dictionary compound species country funders 

Generic values (AMISearchTool)
================================
-v to see generic values
oldstyle            true

Specific values (AMISearchTool)
================================
oldstyle             true
strip numbers        false
wordCountRange       (20,1000000)
wordLengthRange      (1,20)

dictionaryList       [compound, species, country, funders]
dictionaryTop        null
dictionarySuffix     [xml]

0    [main] DEBUG org.contentmine.ami.tools.AbstractAMISearchTool  - old style search command); change
cProject: runaway222
legacy cmd> word(frequencies)xpath:@count>20~w.stopwords:pmcstop.txt_stopwords.txt
legacy cmd> search(compound)
legacy cmd> species(binomial)
legacy cmd> search(country)
legacy cmd> search(funders)
!PMC5264177 .!PMC5299408 !PMC5459990 !PMC5472773 !PMC5551099 !PMC5577139 !PMC5578963 !PMC5593823 !PMC5595922 !PMC5651905 !PMC5678106 .!PMC5719437 !PMC5734744 !PMC5770443 PMC5789925 !PMC5795745 !PMC5798756 !PMC5820313 ...
PMC6536552 PMC6538627 !PMC6539176 .PMC6539203 !PMC6540656 PMC6540663 !PMC6541288 PMC6541573 !PMC6541581 PMC6541717 !PMC6542552 !PMC6542844 !PMC6543642 .PMC6544233 PMC6545051 PMC6545231 UNKNOWN nlm tag: city
UNKNOWN nlm tag: city
!PMC6547168 !PMC6549952 PMC6550257 PMC6553685 !PMC6555712 PMC6556101 UNKNOWN nlm tag: city
UNKNOWN nlm tag: city
UNKNOWN nlm tag: version
UNKNOWN nlm tag: version
UNKNOWN nlm tag: version
!PMC6556939 .PMC6558283 !PMC6559081 !PMC6559268 !PMC6559292 !PMC6561295 !PMC6562896 !PMC6563524 PMC6565653 !PMC6566821 PMC6566967 .PMC65679
...
PMC6723259 PMC6724111 !PMC6724177 !PMC6724306 PMC6724339 !PMC6726645 !PMC6727426 PMC5264177 97035 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
.PMC5299408 97036 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
(PMR: there seem to be a lot of these)
PMC5459990 97036 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
...

.PMC6706196 PMC6706372 PMC6706434 PMC6708170 PMC6708426 PMC6709546 105060 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6709957 105060 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6710573 105060 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6711539 PMC6712833 105149 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
.PMC6712961 105149 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6714084 105149 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6714099 PMC6716414 PMC6716840 PMC6717165 PMC6717645 105225 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6718425 105225 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6718993 PMC6720849 .PMC6721090 PMC6721118 105351 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6723259 PMC6724111 PMC6724177 105406 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6724306 105406 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6724339 PMC6726645 105438 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
PMC6727426 105438 [main] DEBUG org.contentmine.ami.plugins.word.WordCollectionFactory  - no words found to extract
....................................................................................................cannot run command: search([compound])[]; cannot process argument: --sr.search (RuntimeException: cannot read inputStream for dictionary: /org/contentmine/ami/plugins/dictionary/compound.xml)
SP: runaway222..................................................................................................................................................................................................................................................................................................................................................................................................................................................................
create data tables
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrMacBook-Pro-3:climate pm286$ 
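
The output above is long, and the one real failure (the `compound` dictionary could not be read) is easy to miss among the DEBUG lines. A rough Python sketch for summarising a saved copy of the log; it assumes the terminal output was redirected to a text file, and `summarise_ami_log` is a hypothetical helper, not part of ami:

```python
import re

PMCID = re.compile(r"PMC\d+")
NO_WORDS = "no words found to extract"
FAILED = re.compile(r"cannot run command: (.*)")


def summarise_ami_log(text):
    """Count 'no words' debug messages and collect failed commands from an ami log."""
    pmcids = set()
    no_words = 0
    failures = []
    for line in text.splitlines():
        pmcids.update(PMCID.findall(line))  # every PMCID mentioned anywhere in the log
        if NO_WORDS in line:
            no_words += 1
        match = FAILED.search(line)
        if match:
            failures.append(match.group(1))
    return {"ctrees_seen": len(pmcids), "no_words": no_words, "failures": failures}
```

Comparing `no_words` with `ctrees_seen` gives a quick feel for how widespread the "no words found to extract" message is.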

SECTIONS

New tool to find sections. Its value depends on how consistently publishers label their sections.

MacBook-Pro-3:climate pm286$ ami-section -p runaway222/ --sections ALL

Generic values (AMISectionTool)
================================
-v to see generic values
oldstyle            true

Specific values (AMISectionTool)
================================
sectionList             [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]
write                   true

AMISectionTool cTree: PMC5264177
AMISectionTool cTree: PMC5299408
AMISectionTool cTree: PMC5459990
AMISectionTool cTree: PMC5472773
...

Creates a section/ dir for each CTree.

This is new ...
The title of each section depends on the subtitles the publisher uses.

Comments would be useful.
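
To illustrate why publisher subtitles matter, here is a small Python sketch that maps free-text section titles onto a few of the tags from the sectionList above. The patterns are hypothetical and purely illustrative; they are not the rules ami-section itself applies:

```python
import re

# Hypothetical patterns: illustrative only, not the rules ami-section uses.
SECTION_PATTERNS = [
    ("ABSTRACT", r"abstract|summary"),
    ("INTRO", r"introduction|background"),
    ("METHODS", r"methods?|materials"),
    ("RESULTS", r"results?|findings"),
    ("DISCUSS", r"discussion"),
    ("CONCL", r"conclusions?"),
    ("ACK_FUND", r"acknowledge?ments?|funding"),
    ("REF", r"references|bibliography"),
]


def classify_title(title):
    """Map a publisher's section title to a tag, or OTHER if nothing matches."""
    cleaned = re.sub(r"^[\d.\s]+", "", title).lower()  # drop leading "2.1 " numbering
    for tag, pattern in SECTION_PATTERNS:
        if re.search(rf"\b(?:{pattern})\b", cleaned):
            return tag  # first matching tag wins
    return "OTHER"
```

A title such as "Study area" falls through to `OTHER`, which is where inconsistent publisher headings would pile up.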

@petermr
Owner Author

petermr commented Sep 22, 2019

Created sections for the first CTrees. Not all of them have sections.
The output is messy.
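
A quick way to see which CTrees ended up without sections is to walk the project directory. This Python sketch is an assumption-laden helper, not part of ami: the name of the per-CTree directory is passed in as `sections_dirname` (the issue text calls it `section/`; adjust to whatever ami-section actually writes):

```python
from pathlib import Path


def ctrees_without_sections(project_dir, sections_dirname="sections"):
    """List CTree names whose sections directory is missing or empty.

    sections_dirname is an assumed name; change it to match the real output.
    """
    missing = []
    for ctree in sorted(Path(project_dir).iterdir()):
        if not ctree.is_dir():
            continue  # ignore project-level files
        sections = ctree / sections_dirname
        if not sections.is_dir() or not any(sections.iterdir()):
            missing.append(ctree.name)
    return missing
```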

@petermr
Owner Author

petermr commented Sep 27, 2019

Ran ami-search with dictionary/climate.xml and committed the results under runaway222/

NOTE: these results should probably be saved separately for analysis, otherwise they will be overwritten by the next commit.
We need versioning for multiple tweaked searches.
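
One simple form of versioning is to copy the project into a timestamped, labelled directory before each new search. A minimal Python sketch under that assumption; `snapshot_results` and the `search_runs` directory name are hypothetical, and a git tag per run would be an alternative:

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot_results(project_dir, label, archive_root="search_runs"):
    """Copy project_dir to archive_root/<timestamp>_<label> and return the new path."""
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    target = Path(archive_root) / f"{stamp}_{label}"
    # copytree raises an error if target already exists, so nothing is overwritten
    shutil.copytree(project_dir, target)
    return target
```

For example, `snapshot_results("runaway222", "climate_dict")` before rerunning ami-search with a tweaked dictionary keeps the earlier results available for comparison.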
