Missing geographical data for records with grant year 2011-2013 #4

Open
AVermeij opened this issue Feb 16, 2013 · 1 comment

Comments

@AVermeij

For all records with grant year 2011 or 2012, the columns Street, City, State, Country, Zipcode, Longitude and Latitude contain no data at all. This holds for both the Full 2012 and the January 2013 disambiguations. Interestingly, about half of the records with grant year 2013 do contain this data. I checked whether the missing data affected the resulting disambiguations for the respective inventors, but it doesn't seem to: regardless of the missing data, the inventors are still properly disambiguated.

The first attached picture is a simple pivot table showing that about half of the 2013 records are missing geographical data (country taken as an example); the second picture shows some examples of missing 2011 data.

[Image: country]
[Image: missing_data]
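
For anyone who wants to reproduce the pivot-table check described above without a spreadsheet, here is a minimal sketch in plain Python. It computes, per grant year, the fraction of records where a given column is empty. The column name `GrantYear` and the sample rows are assumptions made up for illustration; they are not taken from the actual dataset, so adjust them to the real headers.

```python
from collections import defaultdict


def missing_share_by_year(records, column, year_key="GrantYear"):
    """Return {grant year: fraction of records whose `column` is empty}.

    `year_key` is a hypothetical column name; adjust it to the real header.
    """
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in records:
        year = row[year_key]
        total[year] += 1
        value = row.get(column)
        # Treat None, empty strings, and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing[year] += 1
    return {year: missing[year] / total[year] for year in total}


# Made-up rows for illustration only; not taken from the real dataset.
sample = [
    {"GrantYear": 2010, "Country": "US"},
    {"GrantYear": 2011, "Country": ""},
    {"GrantYear": 2012, "Country": ""},
    {"GrantYear": 2013, "Country": "NL"},
    {"GrantYear": 2013, "Country": ""},
]

shares = missing_share_by_year(sample, "Country")
```

The same function can be called once per geographic column (Street, City, State, and so on). With a real export, the rows could come from `csv.DictReader`, in which case the year keys in the result will be strings rather than integers.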

@doolin
Member

doolin commented Feb 16, 2013

This is a known issue, and here is what we know.

The 2010 and earlier data is merged from previously disambiguated data also posted on DVN.

We introduced a bug when the NGA location schema changed, and didn't catch it until late last fall.

The complete 2012 disambiguation used 2011 and 2012 parses incorporating the totally broken location data.

The 2013 parse shows that a partial fix to locations is now being incorporated.

We're working pretty hard to fix the location/geocoding functionality right now. Once we have it, we'll update the 2011 and 2012 parses to reincorporate locations.

Street addresses are a bonus; they are not often reported. We have them for 15% of records database-wide.

Here are the current stats: http://funginstitute.github.com/statistics/
