Set up OpenRefine reconciliation endpoint #65

Closed
acka47 opened this issue Apr 10, 2018 · 9 comments

@acka47
Contributor

acka47 commented Apr 10, 2018

https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API

See also hbz/lobid-organisations#55 & hbz/lobid-organisations#385.

@fsteeg
Member

fsteeg commented Jun 25, 2018

Deployed to test:

http://test.lobid.org/gnd/reconcile

http://test.lobid.org/gnd/api#openrefine

curl --data 'queries={"q1":{"query":"Twain, Mark"}}' http://test.lobid.org/gnd/reconcile
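For reference, the curl call above sends a single form field named `queries` whose value is a JSON object. A minimal Python sketch of building the same request body follows; the helper name `build_reconcile_body` is made up for illustration and is not part of lobid-gnd.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_reconcile_body(queries):
    """Form-encode a dict of reconciliation queries as the `queries` field."""
    return urlencode({"queries": json.dumps(queries)})


# Same query as in the curl example above.
body = build_reconcile_body({"q1": {"query": "Twain, Mark"}})

# Decode again to show what the service would receive.
decoded = json.loads(parse_qs(body)["queries"][0])
print(decoded)
```

The resulting string can be sent as the POST body, for example with `urllib.request.urlopen(url, data=body.encode())`.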

@fsteeg fsteeg assigned acka47 and unassigned fsteeg Jun 25, 2018
@acka47
Contributor Author

acka47 commented Jun 26, 2018

Shouldn't it say AuthorityResource (or an array with all the entity types listed) in the defaultTypes field? E.g. like this:

{
  "defaultTypes":[
    {
      "id":"AuthorityResource",
      "name":"authority resource"
    }
  ]
}

@fsteeg
Member

fsteeg commented Jun 26, 2018

Shouldn't it say AuthorityResource (or an array with all the entity types listed) [...]

Right, if we actually support multiple types that would make sense. The types allow restricting the reconciliation in the OpenRefine UI. So with a single default type the name makes no difference (in lobid-organisations it's 'lobid-organisation', thus 'lobid-gnd' here). Restricting by top-level type would probably be a useful feature, both here and in lobid-organisations.

I suggest we add the functionality in lobid-gnd in this issue, and open a new issue in lobid-organisations.
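To illustrate what restricting by type could look like on the query side: the Reconciliation Service API lets a query object carry an optional `type` field next to `query` and `limit`. The sketch below only builds such a query; the type id `AuthorityResource` is taken from the suggestion above, and whether lobid-gnd accepts exactly that id is an assumption. The helper name `typed_query` is hypothetical.

```python
import json


def typed_query(text, type_id=None, limit=3):
    """Build one reconciliation query, optionally restricted to a type."""
    query = {"query": text, "limit": limit}
    if type_id is not None:
        # Assumption: the service uses this string as a type id.
        query["type"] = type_id
    return query


queries = {"q1": typed_query("Twain, Mark", type_id="AuthorityResource")}
print(json.dumps(queries))
```

Leaving out `type_id` produces an unrestricted query, which matches the behaviour with a single default type.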

@acka47
Contributor Author

acka47 commented Jun 26, 2018

Once again, OpenRefine does not automatically pick a match, a behaviour that is quite convenient for OpenRefine users. Querying both the Wikidata endpoint and the OER World Map endpoint with a string that is identical to an entry there automatically matches the item, but this does not happen for lobid-gnd. I think this has to do with the score we return: we would have to return a score of 1.0 or 100.0 for an automatic match. Comparing lobid-gnd with the Wikidata API:

$ curl --data 'queries={"q1":{"query" : "tim berners-lee", "limit" : 3} }' http://test.lobid.org/gnd/reconcile
{"q1":{"result":[{"id":"121649091","name":"Berners-Lee, Tim","score":47.57019,"match":false,"type":["lobid-gnd"]},{"id":"137213085","name":"Lee, Tim","score":30.181828,"match":false,"type":["lobid-gnd"]},{"id":"1012381277","name":"Lee, Tim","score":30.093718,"match":false,"type":["lobid-gnd"]}]}}
$ curl --data 'queries={"q1":{"query" : "Tim Berners-Lee", "limit" : 3} }' https://tools.wmflabs.org/openrefine-wikidata/de/api
{"q1": {"result": [{"all_labels": {"weighted": 100.0, "score": 100}, "type": [{"name": "Mensch", "id": "Q5"}], "name": "Tim Berners-Lee", "match": true, "score": 100.0, "id": "Q80"}, {"all_labels": {"weighted": 70.0, "score": 70}, "type": [{"name": "TED-Vortrag", "id": "Q23011722"}], "name": "Tim Berners-Lee \u00fcber das n\u00e4chste Web", "match": false, "score": 70.0, "id": "Q22980417"}, {"all_labels": {"weighted": 54.0, "score": 54}, "type": [{"name": "TED-Vortrag", "id": "Q23011722"}], "name": "Tim Berners-Lee: Eine Magna Carta f\u00fcr das Internet", "match": false, "score": 54.0, "id": "Q22991023"}]}}
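Besides the score, the two responses above also differ in the `match` flag: Wikidata returns `"match": true` for its top result, while lobid-gnd returns `false` for all three. A small sketch that reads that flag from both responses is shown below. The lobid-gnd response is copied from above, the Wikidata response is trimmed to its first result, and the helper name `auto_matches` is made up. How OpenRefine itself decides on an automatic match is not confirmed here.

```python
import json

LOBID_RESPONSE = '''{"q1":{"result":[{"id":"121649091","name":"Berners-Lee, Tim","score":47.57019,"match":false,"type":["lobid-gnd"]},{"id":"137213085","name":"Lee, Tim","score":30.181828,"match":false,"type":["lobid-gnd"]},{"id":"1012381277","name":"Lee, Tim","score":30.093718,"match":false,"type":["lobid-gnd"]}]}}'''

# Trimmed to the first result of the Wikidata response above.
WIKIDATA_RESPONSE = '''{"q1": {"result": [{"all_labels": {"weighted": 100.0, "score": 100}, "type": [{"name": "Mensch", "id": "Q5"}], "name": "Tim Berners-Lee", "match": true, "score": 100.0, "id": "Q80"}]}}'''


def auto_matches(response_text):
    """Map each query key to the id of its first result flagged as a match."""
    response = json.loads(response_text)
    matched = {}
    for key, value in response.items():
        hits = [r for r in value["result"] if r.get("match")]
        if hits:
            matched[key] = hits[0]["id"]
    return matched


print(auto_matches(LOBID_RESPONSE))
print(auto_matches(WIKIDATA_RESPONSE))
```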

@acka47
Contributor Author

acka47 commented Jun 26, 2018

As it seems to be very simple, we should also add a suggest API and a preview API.

@acka47
Contributor Author

acka47 commented Jun 26, 2018

Re. preview API, it might make sense to just deliver the default suggestion string plus a small image if available, similar to Wikidata, see e.g. https://tools.wmflabs.org/openrefine-wikidata/en/preview?id=Q42. Providing this as a small HTML snippet makes this a bit more complicated, so we might skip this for now.

@acka47
Contributor Author

acka47 commented Jun 26, 2018

Once again, OpenRefine does not automatically pick a match, a behaviour that is quite convenient for OpenRefine users.

Strange, testing this again with the following data yielded four automatic matches:

id	label
01	Ford Taurus
02	Adrian Pohl
03	Niederrhein-Gebiet
04	Puerto Rico. Water Resources Authority
05	Twain, Mark
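For repeating this test outside the OpenRefine UI, the table above can be turned into one batch of reconciliation queries. This is only a sketch: keying each query by the `id` column (`q01`, `q02`, ...) is an arbitrary choice, and `queries_from_tsv` is a hypothetical helper.

```python
import csv
import io
import json

# The test data from above, tab-separated.
TSV = (
    "id\tlabel\n"
    "01\tFord Taurus\n"
    "02\tAdrian Pohl\n"
    "03\tNiederrhein-Gebiet\n"
    "04\tPuerto Rico. Water Resources Authority\n"
    "05\tTwain, Mark\n"
)


def queries_from_tsv(text):
    """Build one reconciliation query per row, keyed by the row id."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {f"q{row['id']}": {"query": row["label"]} for row in reader}


queries = queries_from_tsv(TSV)
print(json.dumps(queries, indent=2))
```

The resulting JSON object is what would go into the `queries` form field of a single request.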

@fsteeg
Member

fsteeg commented Jun 27, 2018

Deployed restriction by type, see: http://test.lobid.org/gnd/reconcile

About the preview API: it should be quite straightforward to implement, but we're not sure if or where it is used in the OpenRefine UI (I tested with OpenRefine 2.7 and both Wikidata and a local preview implementation for lobid-gnd, @acka47 tested with OpenRefine 3.0 beta and Wikidata reconciliation).

@acka47
Contributor Author

acka47 commented Jun 28, 2018

+1

@acka47 acka47 removed their assignment Jun 28, 2018
@dr0i dr0i added deploy and removed review labels Jun 29, 2018
@fsteeg fsteeg closed this as completed in e36886a Jun 29, 2018
@dr0i dr0i removed the deploy label Jun 29, 2018