-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query standoff markup in Gravsearch #910
Conversation
… Sipi. - Fix copyright notice.
@tobiasschweizer I'm having a problem with the processing of the results of the main query. I want to write a query that finds the letter This query works: PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>
CONSTRUCT {
?letter knora-api:isMainResource true .
?letter beol:hasText ?text .
} WHERE {
?letter a beol:letter .
?letter beol:creationDate ?date .
FILTER(knora-api:toSimpleDate(?date) = "GREGORIAN:1756-01-03 CE"^^knora-api-simple:Date)
?letter beol:letterHasLanguage ?language .
?language knora-api:valueAsString "German" .
?letter beol:hasText ?text .
?letter knora-api:hasStandoffLinkTo ?person .
?person a beol:person .
?person beol:hasIAFIdentifier ?iafIdentifier .
?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
} But this one returns PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>
CONSTRUCT {
?letter knora-api:isMainResource true .
?letter beol:hasText ?text .
} WHERE {
?letter a beol:letter .
?letter beol:creationDate ?date .
FILTER(knora-api:toSimpleDate(?date) = "GREGORIAN:1756-01-03 CE"^^knora-api-simple:Date)
?letter beol:letterHasLanguage ?language .
?language knora-api:valueAsString "German" .
?letter beol:hasText ?text .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a knora-api:StandoffLinkTag .
?standoffTag knora-api:standoffTagHasLink ?person .
?person a beol:person .
?person beol:hasIAFIdentifier ?iafIdentifier .
?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
} Could it be that // get all the IRIs for variables representing dependent resources per main resource
val dependentResourceIrisPerMainResource: Map[IRI, Set[IRI]] = prequeryResponse.results.bindings.foldLeft(Map.empty[IRI, Set[IRI]]) {
case (acc: Map[IRI, Set[IRI]], resultRow: VariableResultsRow) =>
// collect all the values for the current main resource from prequery response
val mainResIri: String = resultRow.rowMap(mainResourceVar.variableName)
val dependentResIris: Set[IRI] = dependentResourceVariablesConcat.flatMap {
dependentResVar: QueryVariable =>
// check if key exists (the variable could be contained in an OPTIONAL or a UNION)
val dependentResIriOption: Option[IRI] = resultRow.rowMap.get(dependentResVar.variableName)
dependentResIriOption match {
case Some(depResIri: IRI) =>
// IRIs are concatenated, split them
depResIri.split(nonTriplestoreSpecificConstructToSelectTransformer.groupConcatSeparator).toSeq
case None => Set.empty[IRI] // no value present
}
}
acc + (mainResIri -> dependentResIris)
} |
…lable in simple schema.
Are you logged in? |
@benjamingeer Do you want me to have look today? |
Yes please. |
@benjamingeer I think there is a problem in the way the presence is checked for everything specified in the Where clause after permission checking has been done. I would have to look at it more carefully another time. However, as a temporary solution just bypass it: -> Like this, I get the result which means that the query works :-) |
Ok, this where the resource is filtered out: Maybe it is related to the collection of dependent resource Iris (as you suspected #910 (comment)). I will have a look at it! |
That would be great, this PR can't be merged before this is fixed. |
Note for myself:
->
It seems that the main query gets information about the linked person, but the link value is not included as a nested resource. |
@benjamingeer I have an idea, but I would still need to verify that: I think the person is not included as a nested resource because the Maybe this is logic that has to be added to the |
- Fix tests to make them deterministic so they work with GraphDB 8.6.0.
Gravsearch re-implementation of full-text search restricted to a standoff tag: PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>
CONSTRUCT {
?letter knora-api:isMainResource true .
?letter beol:hasText ?text .
} WHERE {
?letter a beol:letter .
?letter beol:hasText ?text .
?text knora-api:valueAsString ?textStr .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a standoff:StandoffParagraphTag .
FILTER(knora-api:matchInStandoff(?textStr, ?standoffTag, "Euler"))
} |
Which generates this prequery: SELECT DISTINCT ?letter (GROUP_CONCAT(DISTINCT(?text); SEPARATOR='') AS ?text__Concat)
WHERE {
?letter <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
GRAPH <http://www.ontotext.com/explicit> {
?letter <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
?letter <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/0801/beol#letter> .
?letter <http://www.knora.org/ontology/0801/beol#hasText> ?text .
GRAPH <http://www.ontotext.com/explicit> {
?text <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
?text <http://www.knora.org/ontology/knora-base#valueHasString> ?textStr .
?text <http://www.knora.org/ontology/knora-base#valueHasStandoff> ?standoffTag .
?standoffTag <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/standoff#StandoffParagraphTag> .
GRAPH <http://www.ontotext.com/explicit> {
?textStr <http://www.ontotext.com/owlim/lucene#fullTextSearchIndex> "Euler"^^<http://www.w3.org/2001/XMLSchema#string> .
}
GRAPH <http://www.ontotext.com/explicit> {
?standoffTag <http://www.knora.org/ontology/knora-base#standoffTagHasStart> ?standoffTag__start .
}
GRAPH <http://www.ontotext.com/explicit> {
?standoffTag <http://www.knora.org/ontology/knora-base#standoffTagHasEnd> ?standoffTag__end .
}
BIND(substr(?textStr, ?standoffTag__start + 1, ?standoffTag__end - ?standoffTag__start) AS ?standoffTag__markedUp)
FILTER(regex(?standoffTag__markedUp, "Euler", "i"))
}
GROUP BY ?letter
ORDER BY ASC(?letter)
LIMIT 25 |
After discussion with @tobiasschweizer: How to deal with the permissions issue: try adding statements to match the |
Another thing to keep in mind is the Apache Lucene Syntax: https://lucene.apache.org/core/3_6_2/queryparsersyntax.html and the syntax for the SPARQL Regex function since we pass the same search term to them (wildcards etc.). |
The problem is how to find out what resource is the source of the link here:
The search responder could generate:
But this would be redundant if the user has already put statements like that in the query. I don't know if this would make the query slower. I can test it. The alternative would be to force the user to use a custom function like this:
Which would generate the same thing. |
It works! PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>
CONSTRUCT {
?letter knora-api:isMainResource true .
?letter beol:hasText ?text .
} WHERE {
?letter a beol:letter .
?letter beol:creationDate ?date .
FILTER(knora-api:toSimpleDate(?date) = "GREGORIAN:1756-01-03 CE"^^knora-api-simple:Date)
?letter beol:letterHasLanguage ?language .
?language knora-api:valueAsString "German" .
?letter beol:hasText ?text .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a knora-api:StandoffLinkTag .
FILTER knora-api:standoffLink(?letter, ?standoffTag, ?person)
?person a beol:person .
?person beol:hasIAFIdentifier ?iafIdentifier .
?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
} |
@benjamingeer thx for fixing this. I will add an R2R for the Iri case |
Once that code in |
…ecified for the target resource
This is why I started: #926 |
@benjamingeer Given that the tests pass now, I suggest that I finish the review of this PR today and then do the refactoring in a new PR that I start this week. Would that be ok for you? I would then also try to figure out how a dependent resource could be returned when the user asks for a text value with a standoff a link and wants the text value to be included in the results. |
@tobiasschweizer OK, that would be great. |
Doesn't this work already? If not, can you give an example? |
see for example the test "search for a standoff link using the knora-api:standoffLink function (submitting the complex schema)" PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX anything: <http://0.0.0.0:3333/ontology/0001/anything/v2#>
CONSTRUCT {
?thing knora-api:isMainResource true .
?thing anything:hasText ?text .
} WHERE {
?thing a anything:Thing .
?thing anything:hasText ?text .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a knora-api:StandoffLinkTag .
FILTER knora-api:standoffLink(?thing, ?standoffTag, ?otherThing)
?otherThing a anything:Thing .
} it is expected to return: {
"@graph" : [ {
"@id" : "http://rdfh.ch/0001/a-thing-with-text-values",
"@type" : "anything:Thing",
"anything:hasText" : [ {
"@id" : "http://rdfh.ch/0001/a-thing-with-text-values/values/1",
"@type" : "knora-api:TextValue",
"knora-api:textValueAsXml" : "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<text>Ich liebe die <a href=\"http://rdfh.ch/0001/a-thing\" class=\"salsah-link\">Dinge</a>, sie sind alles für mich.</text>",
"knora-api:textValueHasMapping" : "http://rdfh.ch/standoff/mappings/StandardMapping"
}, {
"@id" : "http://rdfh.ch/0001/a-thing-with-text-values/values/2",
"@type" : "knora-api:TextValue",
"knora-api:textValueAsXml" : "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<text>Na ja, die <a href=\"http://rdfh.ch/0001/a-thing\" class=\"salsah-link\">Dinge</a> sind OK.</text>",
"knora-api:textValueHasMapping" : "http://rdfh.ch/standoff/mappings/StandardMapping"
} ],
"rdfs:label" : "Ein Ding für jemanden, dem die Dinge gefallen"
}, {
"@id" : "http://rdfh.ch/0001/project-thing-1",
"@type" : "anything:Thing",
"anything:hasText" : {
"@id" : "http://rdfh.ch/0001/project-thing-1/values/2",
"@type" : "knora-api:TextValue",
"knora-api:textValueAsXml" : "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<text>This is a <a href=\"http://rdfh.ch/0001/project-thing-2\" class=\"salsah-link\">standoff link</a>.</text>",
"knora-api:textValueHasMapping" : "http://rdfh.ch/standoff/mappings/StandardMapping"
},
"rdfs:label" : "A thing that only project members can see"
} ],
"@context" : {
"rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"rdfs" : "http://www.w3.org/2000/01/rdf-schema#",
"knora-api" : "http://api.knora.org/ontology/knora-api/v2#",
"anything" : "http://0.0.0.0:3333/ontology/0001/anything/v2#"
}
} I think when the user expects to get back the text My rationale is the following: if there are no other statements about that target resource, how could it be included in the CONSTRUCT clause if the user wants to see it? |
Why? If the user wants to get that information, why can't they just add one line to the
Maybe they don't actually want that information. Perhaps the target resource is an IRI, and the user already knows all about it, so they don't need it in the search result.
You mean in the
And the |
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX anything: <http://0.0.0.0:3333/ontology/0001/anything/v2#>
CONSTRUCT {
?thing knora-api:isMainResource true .
?thing anything:hasText ?text .
?otherThing a anything:Thing .
} WHERE {
?thing a anything:Thing .
?thing anything:hasText ?text .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a knora-api:StandoffLinkTag .
FILTER knora-api:standoffLink(?thing, ?standoffTag, ?otherThing)
?otherThing a anything:Thing .
}
|
It's not a problem if you make additional statements like |
Oh yeah, of course, I'm stupid. Hmm. Then I guess there are two options. The user could add this to the
It would work because the Or we could add that automatically. But I'd rather give the user control over it. |
With the current design, adding Maybe I can have a look at how this logic could be improved so that this would work. |
propertyIri = propertyIri, | ||
subPropertyOf = subPropertyOf, | ||
predicates = otherPreds, | ||
ontologySchema = propertyIri.getOntologySchema.get | ||
) | ||
|
||
if (!propertyIri.isKnoraBuiltInDefinitionIri && propertyDef.getRdfTypes.contains(OntologyConstants.Owl.TransitiveProperty.toSmartIri)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sure that you cannot use owl:transitiveProperty
in a project ontology because of the reasons outlined in FAQ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Or we could just make the user add it to the |
I would be ok with that. Could you add that to the docs about the |
@benjamingeer btw, I fixed this #889 (comment) in ed03eb7 |
|
||
TransformedFilterPattern( | ||
None, // FILTER has been replaced with statements | ||
linkStatements ++ statementsForLinkValue ++ statementsForTargetResource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we have to care about redundant triples?
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX anything: <http://0.0.0.0:3333/ontology/0001/anything/v2#>
CONSTRUCT {
?thing knora-api:isMainResource true .
?thing anything:hasText ?text .
} WHERE {
?thing a anything:Thing .
?thing anything:hasText ?text .
?text knora-api:textValueHasStandoff ?standoffTag .
?standoffTag a knora-api:StandoffLinkTag .
FILTER knora-api:standoffLink(?thing, ?standoffTag, ?otherThing)
?otherThing a anything:Thing .
}
prequery:
SELECT DISTINCT ?thing (GROUP_CONCAT(DISTINCT(?otherThing); SEPARATOR='') AS ?otherThing__Concat) (GROUP_CONCAT(DISTINCT(?text); SEPARATOR='') AS ?text__Concat) (GROUP_CONCAT(DISTINCT(?otherThing__LinkValue); SEPARATOR='') AS ?otherThing__LinkValue__Concat)
WHERE {
?thing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
GRAPH <http://www.ontotext.com/explicit> {
?thing <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
?thing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/0001/anything#Thing> .
?thing <http://www.knora.org/ontology/0001/anything#hasText> ?text .
GRAPH <http://www.ontotext.com/explicit> {
?text <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
?text <http://www.knora.org/ontology/knora-base#valueHasStandoff> ?standoffTag .
?standoffTag <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#StandoffLinkTag> .
?otherThing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
GRAPH <http://www.ontotext.com/explicit> {
?otherThing <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
?otherThing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/0001/anything#Thing> .
GRAPH <http://www.ontotext.com/explicit> {
?thing <http://www.knora.org/ontology/knora-base#hasStandoffLinkTo> ?otherThing .
}
?standoffTag <http://www.knora.org/ontology/knora-base#standoffTagHasLink> ?otherThing .
?thing <http://www.knora.org/ontology/knora-base#hasStandoffLinkToValue> ?otherThing__LinkValue .
GRAPH <http://www.ontotext.com/explicit> {
?otherThing__LinkValue <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#LinkValue> .
}
GRAPH <http://www.ontotext.com/explicit> {
?otherThing__LinkValue <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
GRAPH <http://www.ontotext.com/explicit> {
?otherThing__LinkValue <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> ?thing .
}
GRAPH <http://www.ontotext.com/explicit> {
?otherThing__LinkValue <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> ?otherThing .
}
?otherThing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
GRAPH <http://www.ontotext.com/explicit> {
?otherThing <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
}
GROUP BY ?thing
ORDER BY ASC(?thing)
LIMIT 25
Note that
?otherThing <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
GRAPH <http://www.ontotext.com/explicit> {
?otherThing <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
}
is redundant. This is because there are two statements involving ?otherThing
.
Could we simply remove duplicates from the sequence of statements before creating the WHERE clause for the prequery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is redundant. This is because there are two statements involving ?otherThing.
Do you know where those statements are generated?
I would guess it's not a big problem to have redundant statements, but I've been trying to avoid generating them, to make the generated SPARQL easier to read. I added some collections to AbstractSparqlTransformer
for this.
We could remove duplicates, but I think it would be more work, because we'd have to traverse the whole syntax tree, removing duplicates at each level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
createAdditionalStatementsForNonPropertyType
is called twice: once from within handleStandoffLinkFunction
for ?otherThing
and once because of ?otherThing a anything:Thing .
. We call it from within handleStandoffLinkFunction
because there could be no statement ?otherThing a anything:Thing .
and then the logic would not work correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is because we look at statements one by one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if createAdditionalStatementsForNonPropertyType
maintains a mutable set of statements that has already generated, so it can avoid generating them redundantly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do that as an optimization in the refactoring we have in mind anyway?
Then we could merge this PR, so it does not block you from what you have to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could add the idea to #926
@@ -1869,7 +1912,7 @@ class SearchRouteV2R2RSpec extends R2RSpec { | |||
|
|||
} | |||
|
|||
"search for a anything:Thing with a list value" in { | |||
"search for an anything:Thing with a list value" ignore { // ignored because the list node label returned is not deterministic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we should get rid of the list node label and query the list v2 route instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you wanted to do that in another PR, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, #763 would be a good occasion to do that. Could we work on it together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standoff: What emerged as a crazy idea, has now an even crazier implementation. Weiter so!
Thank you! :) |
I am glad this works now with the current design. This proves that we can actually deal with non XML markup and make useful queries :-) |
knora-api-simple:Date
in aFILTER
.owl:TransitiveProperty
andstandoffTagHasStartAncestor
as per Implement Standoff Search #630 (comment).owl:TransitiveProperty
.To get support for
standoffTagHasStartAncestor
, you have to recreate your repository and restart GraphDB.Resolves #630.