wdi_helpers.id_mapper does not return the complete map #65
Comments
It turned out it's a problem with the items. Probably a Wikidata issue. That's very strange, though. Sorry for this issue. |
Hmm, weird that you would get different values running it twice. What do you mean it's a problem with the items? I just ran the command a couple of times and I get the same value every time (65,255). Did some of these items just get updated recently? I think so, yes, looking at this for example. It may take ~10 min for the SPARQL endpoint to get updated fully. |
There is some Wikidata issue going on here. Some items are not properly in the Blazegraph. For example, run the following query a bunch of times a couple of minutes apart and you get either a value or no result depending on which server it hits:
I assume this is related to a known issue: https://phabricator.wikimedia.org/T112397 |
Hey @stuppie! It's a very strange situation. I know there is a little bit of latency, but I think that something odd happened on the Wikidata side during the bulk update. Or maybe there is something odd in the query service just now. For example:

    SELECT ?item
    WHERE
    {
      ?item wdt:P5114 ?x.
    }

gets 65,438 results. Instead:

    SELECT DISTINCT ?item
    WHERE
    {
      ?item wdt:P5114 ?x.
    }

gets 65,437 results. But if I check for double IDs:

    SELECT ?item ?item2
    WHERE
    {
      ?item wdt:P5114 ?x.
      ?item2 wdt:P5114 ?x.
      FILTER (?item != ?item2)
    }

I get 0 results. If I did not make mistakes, it's an impossible situation. Regarding the different outputs of the same query run twice, I do confirm that there was a kind of oscillation between those 2 values. Now it seems stable. What a mess! 😢

Update after having read your last post: Yes, I see. There is something strange going on. I'm sorry, I opened an issue here since I thought that it was a problem with WikidataIntegrator, but actually this is not a library problem. |
I do not see variation in query response on the server, but I do see a difference between distinct and non-distinct counts. This seems to be because https://www.wikidata.org/wiki/Q3747159 has two IDs. Your query checks for one ID belonging to two items, but in fact it's the other way around: one item has two IDs. |
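To make the two directions of duplication concrete, here is a minimal Python sketch on toy data. The QIDs, the IDs, and the `summarize` helper are made up for illustration; they are not real Wikidata content and not part of WikidataIntegrator. It shows how a single item with two IDs makes the row count exceed the distinct item count while the "one ID on two items" check still comes back empty.

```python
from collections import defaultdict

# Toy (item, external_id) rows standing in for the results of
# `?item wdt:P5114 ?x`. All values here are hypothetical.
rows = [
    ("Q1", "A"),
    ("Q2", "B"),
    ("Q3", "C"),
    ("Q3", "D"),  # one item carrying two IDs
]


def summarize(rows):
    """Count rows and distinct items, and report duplicates in both directions."""
    ids_by_item = defaultdict(set)
    items_by_id = defaultdict(set)
    for item, ext_id in rows:
        ids_by_item[item].add(ext_id)
        items_by_id[ext_id].add(item)
    return {
        "rows": len(rows),
        "distinct_items": len(ids_by_item),
        # Direction that explains the count mismatch: one item, several IDs.
        "items_with_several_ids": sorted(
            item for item, ids in ids_by_item.items() if len(ids) > 1
        ),
        # Direction tested by the FILTER (?item != ?item2) query: one ID, several items.
        "ids_on_several_items": sorted(
            ext_id for ext_id, items in items_by_id.items() if len(items) > 1
        ),
    }


summary = summarize(rows)
print(summary)
```

On this toy input the row count is one higher than the distinct item count, and only the first duplicate list is non-empty, which mirrors the 65,438 versus 65,437 situation described in the thread.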
Fixed https://www.wikidata.org/wiki/Q3747159. It seems like there is still a difference between this (65437 items):

    SELECT (count(?item) as ?c)
    WHERE
    {
      ?item wdt:P5114 ?x.
    }

and this (65282 items):

    SELECT ?id ?item ?mrt
    WHERE
    {
      ?item p:P5114 ?s .
      ?s ps:P5114 ?id
      OPTIONAL {?s pq:P4390 ?mrt}
    } |
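One way to narrow down such a gap is to compare the item sets returned by the two query shapes and list the items that appear in only one of them. The sketch below does this on toy data; the rows and the `missing_items` helper are hypothetical, and in practice the two lists would come from the query service results.

```python
def missing_items(truthy_rows, statement_rows):
    """Return items found only via wdt: and items found only via p:/ps:."""
    truthy_items = {item for item, _ in truthy_rows}
    statement_items = {item for item, _ in statement_rows}
    only_truthy = sorted(truthy_items - statement_items)
    only_statement = sorted(statement_items - truthy_items)
    return only_truthy, only_statement


# Hypothetical (item, id) rows for each query shape.
truthy_rows = [("Q1", "A"), ("Q2", "B"), ("Q3", "C")]
statement_rows = [("Q1", "A"), ("Q3", "C")]

only_truthy, only_statement = missing_items(truthy_rows, statement_rows)
print("only in wdt: results:", only_truthy)
print("only in p:/ps: results:", only_statement)
```

The items listed as present in only one result set are the ones worth inspecting by hand on Wikidata.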
I'm not sure... Stan says it's because of IDs on two items but when I get the same count (65437) whether I run either of these queries:
|
Indeed, seems like something was broken with these items. I've updated them and now they look OK. |
I do not know if it is normal, but the 2 queries (with or without |
Well, after another update by @smalyshev, everything seems to be working fine : ) |
Hello, I do not know if I misunderstood something, but it seems there are some problems with wdi_helpers.id_mapper.
If I run the following query in Wikidata, I get 65,438 items.
Now let's go with WikidataIntegrator.
As you notice, I get 2 different mappings for 2 identical calls. In both cases, they are different from 65,438.
The difference is due to the id_mapper query, that is the following:
Instead of mapping the item to the value, it maps the item to the statement and the statement to the value. The two patterns should be identical, but actually they are different.
I do not know if that is a problem and if it's related to this library, but it seems an unexpected behavior. Sorry if I missed something.