Identify Wikidata reconciliation strategies for GAZ. #43
Comments
I started a repo with some plans in it here: https://github.com/INCATools/environments2wikidata. There are so many terms that manual curation will be hard, but we can use ontology axioms to aid in disambiguation. There is lots of old code, which I will try to update.
Today I tried to add as many GAZ identifiers as possible to Wikidata items on Suriname (see: https://w.wiki/6CVW). This was mainly a manual curation step, where I searched for the names in Wikidata and added the respective GAZ identifiers.
For editing GAZ: To edit: GAZ:$sequence(8,33333333,44444444) Cheers,
GAZ does not seem to have many mappings to external identifiers (if any at all). This makes aligning it with Wikidata particularly challenging.
To get all terms in GAZ covered in Wikidata, we would probably need to apply different strategies to check whether a term is already covered or not.
In the case where the label used in Wikidata exactly matches the term in GAZ, OpenRefine can be our friend. I used this tool, which is offered in PAWS for example, to align GAZ countries with Wikidata.
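To make the exact-match case concrete, here is a minimal Python sketch of the idea. It is not what OpenRefine does internally, just an illustration of label-based matching. The function name, the input dictionaries and all identifiers in the usage example are hypothetical placeholders, and I am assuming that ignoring case and surrounding whitespace still counts as an "exact" match.

```python
def exact_label_matches(gaz_labels, wikidata_labels):
    """Map GAZ IDs to Wikidata item IDs whose label matches exactly.

    Both arguments are plain dicts of identifier -> label (hypothetical
    inputs, e.g. exported from GAZ and from a Wikidata query).
    Matching ignores case and leading/trailing whitespace.
    """
    # Index Wikidata labels so each GAZ label is a single lookup.
    index = {}
    for qid, label in wikidata_labels.items():
        index.setdefault(label.strip().casefold(), []).append(qid)

    matches = {}
    for gaz_id, label in gaz_labels.items():
        qids = index.get(label.strip().casefold(), [])
        # Keep only unambiguous hits; labels shared by several
        # Wikidata items still need a human to pick the right one.
        if len(qids) == 1:
            matches[gaz_id] = qids[0]
    return matches


# Usage with made-up placeholder identifiers:
gaz = {"GAZ:X1": "Suriname", "GAZ:X2": "Paramaribo", "GAZ:X3": "Springfield"}
wd = {"Q_A": "Suriname", "Q_B": "paramaribo ", "Q_C": "Springfield", "Q_D": "Springfield"}
print(exact_label_matches(gaz, wd))
```

Ambiguous labels are deliberately left out of the result, since a shared place name says nothing about which item is the right one.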
However, I then continued with the GAZ terms on Suriname. So far, all of these terms do exist in Wikidata, but most have a different spelling. I will try to add all GAZ terms for that country manually.
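For the spelling variations, a fuzzy comparison could at least shortlist candidates before the manual check. The sketch below uses only the Python standard library (`unicodedata` and `difflib`); the function names, the 0.8 cutoff and the placeholder identifiers are my own assumptions, not something taken from GAZ or Wikidata tooling, and the suggestions would still need to be confirmed by hand.

```python
import difflib
import unicodedata


def normalize(label):
    """Lower-case a label, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def suggest_candidates(gaz_label, wikidata_labels, cutoff=0.8, n=3):
    """Return Wikidata IDs whose label is a close spelling variant.

    wikidata_labels is a dict of identifier -> label (hypothetical input).
    cutoff is the minimum difflib similarity ratio, an arbitrary starting value.
    """
    normalized = {}
    for qid, label in wikidata_labels.items():
        normalized.setdefault(normalize(label), []).append(qid)
    close = difflib.get_close_matches(
        normalize(gaz_label), list(normalized), n=n, cutoff=cutoff
    )
    return [qid for key in close for qid in normalized[key]]


# Usage with made-up placeholder identifiers:
wd = {"Q_A": "Nieuw-Nickerie", "Q_B": "Paramaribo", "Q_C": "Brokopondo"}
print(suggest_candidates("Nieuw Nickerie", wd))
```

A lower cutoff finds more variants but also more false positives, so the threshold would need tuning against terms that have already been curated.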
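So far, two strategies have been applied: exact label matching with OpenRefine (for the countries) and manual curation (for the terms on Suriname).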