Query standoff markup in Gravsearch #910

Merged
merged 36 commits into from
Jul 9, 2018
36 commits
482d816
docs (gravsearch): Correct doc.
Jun 25, 2018
6531aa9
feature (webapi): Add config file for testing with a locally compiled…
Jun 26, 2018
2a8a942
docs (webapi): Document running a locally-compiled Sipi with Knora's …
Jun 26, 2018
1fd866d
feature (webapi): Query standoff tags (ongoing).
Jun 26, 2018
18c33f7
test (webapi): Fix ontology route tests now that standoff is not avai…
Jun 26, 2018
0d1edef
feature (gravsearch): Add a custom function for searching for terms i…
Jun 28, 2018
af42f6d
test (gravsearch): Fix test compile error.
Jun 28, 2018
7031c16
test (webapi): Suppress debugging output.
Jun 28, 2018
269bd7e
fix (gravsearch): Fix bug in lang function.
Jun 28, 2018
d2fe7f4
Merge branch 'develop' into wip/gravsearch-standoff
Jun 29, 2018
1797b89
feature (webapi): Support searching for standoff link tags.
Jun 29, 2018
f9c3f10
Merge branch 'develop' into wip/gravsearch-standoff
Jun 29, 2018
36e3ff2
test (gravsearch): Add tests for new functions.
Jun 29, 2018
eb86a5e
docs (gravsearch): Document standoff functions.
Jun 29, 2018
e2ea25a
docs (gravsearch): Make standoff example more interesting.
Jun 29, 2018
d58c614
feature (gravsearch): Search for a standoff date tag.
Jul 2, 2018
a4d6184
test (gravsearch): Add test for searching for a date tag.
Jul 2, 2018
68e7fce
feature (gravsearch): Support knora-api:standoffTagHasStartAncestor.
Jul 2, 2018
a94220e
docs (release notes): Update release notes.
Jul 2, 2018
b9a231d
fix (gravsearch): Make function arguments typeable outside comparison…
Jul 2, 2018
9d1492b
style (webapi): Optimise imports.
Jul 2, 2018
993f504
feature (webapi): Don't allow project-specific ontologies to use owl:…
Jul 3, 2018
a810b48
build (docs): Try to fix Travis build error.
Jul 3, 2018
246b9b5
Merge branch 'develop' into wip/gravsearch-standoff
Jul 5, 2018
4906dbd
fix (gravsearch): Infer the type of an IRI used in a function in a FI…
Jul 5, 2018
2ef9c46
fix (gravsearch): Find more inconsistent types in FILTERs.
Jul 5, 2018
6e6d2de
fix (SearchResponder): debugging output
Jul 6, 2018
6ecbf8e
fix (Gravsearch): add third argument of function standoffLink to coll…
Jul 9, 2018
4d21a36
tests (Gravsearch): use of a resource query variable in function stan…
Jul 9, 2018
b220b33
refactor (Gravsearch): create statements for target resource
Jul 9, 2018
848dca2
fix (gravsearch): Include IRIs used as function arguments in positive…
Jul 9, 2018
add3ef9
test (gravsearch): Add test with IRI argument in function.
Jul 9, 2018
4922ec4
tests (Gravsearch): tests for standoffLink function when an Iri is sp…
Jul 9, 2018
3212ef3
fix (build): Try to fix Travis build error.
Jul 9, 2018
ed03eb7
docs (Gravsearch): document Uri value
Jul 9, 2018
4dcb8d1
docs (gravsearch): Document how to get standoff link tag target in se…
Jul 9, 2018
2 changes: 1 addition & 1 deletion .travis.yml
@@ -94,7 +94,7 @@ jobs:
- sudo apt-get install -y graphviz software-properties-common
- sudo apt-add-repository -y ppa:rael-gc/rvm
- sudo apt-get update
- sudo apt-get install rvm
- sudo apt-get --allow-unauthenticated install rvm
- rvm reload
- rvm install ruby 2.5.1
- rvm use 2.5.1 --default
5 changes: 5 additions & 0 deletions docs/src/paradox/00-release-notes/v1.7.0.md
@@ -8,8 +8,13 @@ See the
Required changes to existing data:
----------------------------------

- To use the inferred Gravsearch predicate `knora-api:standoffTagHasStartAncestor`,
you must recreate your repository with the updated `KnoraRules.pie`.

New features:
-------------

- Gravsearch queries can now match standoff markup (@github[#910](#910)).

Bugfixes:
---------
2 changes: 1 addition & 1 deletion docs/src/paradox/03-apis/api-v1/xml-to-standoff-mapping.md
@@ -36,7 +36,7 @@ The two cases are described in the TypeScript interfaces `simpletext`
and `richtext` in module `basicMessageComponents`.

Knora offers a standard mapping with the IRI
`http://data.knora.org/projects/standoff/mappings/StandardMapping`. The
`http://rdfh.ch/standoff/mappings/StandardMapping`. The
standard mapping covers the HTML elements and attributes supported by
the GUI's text editor, [CKEditor](https://ckeditor.com/). (Please note that the HTML has to be
encoded in strict XML syntax. CKEditor offers the possibility to define filter rules.
203 changes: 199 additions & 4 deletions docs/src/paradox/03-apis/api-v2/query-language.md
@@ -91,7 +91,7 @@ A Gravsearch query can be written in either of the two
@ref:[Knora API v2 schemas](introduction.md#api-schema). The simple schema
is easier to work with, and is sufficient if you don't need to query
anything below the level of a Knora value. If your query needs to refer to
list nodes, you must use the complex schema. Each query must use a single
list nodes or standoff markup, you must use the complex schema. Each query must use a single
schema, with one exception (see @ref:[Date Comparisons](#date-comparisons)).

Gravsearch query results can be requested in the simple or complex schema;
@@ -176,9 +176,9 @@ belonging to subclasses of that type.
Every Gravsearch query is a valid SPARQL 1.1
[CONSTRUCT](https://www.w3.org/TR/sparql11-query/#construct) query.
However, Gravsearch only supports a subset of the elements that can be used
in a SPARQL Construct query. Additionally, Gravsearch requires the client to
use explicit type annotations, explained below; these are valid SPARQL,
but specific to the Knora API. Also, the main resource has to be marked.
in a SPARQL Construct query, and a Gravsearch
@ref:[CONSTRUCT Clause](#construct-clause) has to indicate which variable
is to be used for the main resource in each search result.
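
As a rough client-side illustration of the main-resource requirement, the following Python sketch extracts the variable marked with `knora-api:isMainResource` from a query string. The function name `main_resource_variable` is hypothetical, and the sketch assumes that the query uses the `knora-api:` prefix and that exactly one variable should be marked; it is not part of Knora.

```python
import re

# Matches a statement such as "?letter knora-api:isMainResource true".
# Assumption: the query uses the "knora-api:" prefix, as in the documentation examples.
MAIN_RESOURCE = re.compile(r"(\?\w+)\s+knora-api:isMainResource\s+true\b")


def main_resource_variable(query: str) -> str:
    """Return the variable marked as the main resource in the CONSTRUCT clause."""
    # Everything before the first WHERE keyword is treated as the CONSTRUCT clause.
    construct_clause = re.split(r"\bWHERE\b", query, maxsplit=1)[0]
    variables = set(MAIN_RESOURCE.findall(construct_clause))
    if len(variables) != 1:
        # Assumption: a query marks exactly one variable as the main resource.
        raise ValueError(f"expected exactly one main resource, found {len(variables)}")
    return variables.pop()
```

A check like this can catch a missing marker before the query is sent, but the server remains the authority on what is a valid Gravsearch query.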

### Supported SPARQL Syntax

@@ -239,6 +239,7 @@ The following Knora value types can be compared with literals in `FILTER`
expressions in the simple schema:

- Text values (`xsd:string`)
- Uri values (`xsd:anyURI`)
- Integer values (`xsd:integer`)
- Decimal values (`xsd:decimal`)
- Boolean values (`xsd:boolean`)
@@ -308,6 +309,9 @@ CONSTRUCT {
} ORDER BY ?pubdate
```

You can also use `knora-api:toSimpleDate` to search for date tags in standoff
text markup (see @ref:[Matching Standoff Dates](#matching-standoff-dates)).

#### Searching for Matching Words

The function `knora-api:match` searches for matching words anywhere in a
@@ -368,6 +372,197 @@ In the complex schema, use it on the object of the text value's
FILTER regex(?titleStr, "Zeit", "i")
```

### Searching for Text Markup

To refer to standoff markup in text values, you must write your query in the complex
schema.

A `knora-api:TextValue` can have the property
`knora-api:textValueHasStandoff`, whose objects are the standoff markup
tags in the text. You can match the tags you're interested in using
`rdf:type` or other properties of each tag.

#### Matching Text in a Standoff Tag

The function `knora-api:matchInStandoff` searches for standoff tags containing certain terms.
The implementation is optimised using the full-text search index if available. The
function takes three arguments:

1. A variable representing the string literal value of a text value.
2. A variable representing a standoff tag.
3. A string literal containing space-separated search terms.

This function can only be used as the top-level expression in a `FILTER`.
For example:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>

CONSTRUCT {
    ?letter knora-api:isMainResource true .
    ?letter beol:hasText ?text .
} WHERE {
    ?letter a beol:letter .
    ?letter beol:hasText ?text .
    ?text knora-api:valueAsString ?textStr .
    ?text knora-api:textValueHasStandoff ?standoffParagraphTag .
    ?standoffParagraphTag a standoff:StandoffParagraphTag .
    FILTER knora-api:matchInStandoff(?textStr, ?standoffParagraphTag, "Grund Richtigkeit")
}
```

Here we are looking for letters containing the words "Grund" and "Richtigkeit"
within a single paragraph.
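
If queries are generated by a client, the `FILTER` can be assembled from a list of search terms. The following Python sketch shows one way to do that; the helper `match_in_standoff_filter` is hypothetical and only checks the argument shapes described above (two variables and a string of space-separated terms).

```python
def match_in_standoff_filter(text_var: str, tag_var: str, terms: list[str]) -> str:
    """Build a knora-api:matchInStandoff FILTER expression (hypothetical helper)."""
    # The first two arguments must be SPARQL variables.
    for var in (text_var, tag_var):
        if not var.startswith("?") or len(var) < 2:
            raise ValueError(f"not a SPARQL variable: {var!r}")
    # Terms are joined with spaces, so a term must not contain whitespace itself.
    # Double quotes are rejected rather than escaped, to keep the sketch simple.
    for term in terms:
        if not term or any(ch.isspace() or ch == '"' for ch in term):
            raise ValueError(f"invalid search term: {term!r}")
    if not terms:
        raise ValueError("at least one search term is required")
    search_string = " ".join(terms)
    return f'FILTER knora-api:matchInStandoff({text_var}, {tag_var}, "{search_string}")'
```

Calling `match_in_standoff_filter("?textStr", "?standoffParagraphTag", ["Grund", "Richtigkeit"])` reproduces the `FILTER` line of the example query.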

#### Matching Standoff Links

If you are only interested in specifying that a resource has some text
value containing a standoff link to another resource, the most efficient
way is to use the property `knora-api:hasStandoffLinkTo`, whose subjects and objects
are resources. This property is automatically maintained by Knora. For example:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>

CONSTRUCT {
    ?letter knora-api:isMainResource true .
    ?letter beol:hasText ?text .
} WHERE {
    ?letter a beol:letter .
    ?letter beol:hasText ?text .
    ?letter knora-api:hasStandoffLinkTo ?person .
    ?person a beol:person .
    ?person beol:hasIAFIdentifier ?iafIdentifier .
    ?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
}
```

Here we are looking for letters containing a link to the historian
Claude Jordan, who is identified by his Integrated Authority File
identifier, `(VIAF)271899510`.

However, if you need to specify the context in which the link tag occurs, you must
use the function `knora-api:standoffLink`. It takes three arguments:

1. A variable or IRI representing the resource that is the source of the link.
2. A variable representing the standoff link tag.
3. A variable or IRI representing the resource that is the target of the link.

This function can only be used as the top-level expression in a `FILTER`.
For example:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>

CONSTRUCT {
    ?letter knora-api:isMainResource true .
    ?letter beol:hasText ?text .
} WHERE {
    ?letter a beol:letter .
    ?letter beol:hasText ?text .
    ?text knora-api:textValueHasStandoff ?standoffLinkTag .
    ?standoffLinkTag a knora-api:StandoffLinkTag .
    FILTER knora-api:standoffLink(?letter, ?standoffLinkTag, ?person)
    ?person a beol:person .
    ?person beol:hasIAFIdentifier ?iafIdentifier .
    ?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
    ?standoffLinkTag knora-api:standoffTagHasStartParent ?standoffItalicTag .
    ?standoffItalicTag a standoff:StandoffItalicTag .
}
```

This has the same effect as the previous example, except that because we are matching
the link tag itself, we can specify that its immediate parent is a
`StandoffItalicTag`.
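
The argument rules for `knora-api:standoffLink` (variable or IRI, variable, variable or IRI) can be checked on the client before a query is sent. The Python sketch below is one way to do this; `standoff_link_filter` and the example IRI are hypothetical, and the IRI check is deliberately simplistic.

```python
def is_variable(term: str) -> bool:
    """True for a SPARQL variable such as ?letter."""
    return term.startswith("?") and len(term) > 1


def is_iri(term: str) -> bool:
    """True for an IRI written in angle brackets (simplified check)."""
    return term.startswith("<") and term.endswith(">") and len(term) > 2


def standoff_link_filter(source: str, link_tag: str, target: str) -> str:
    """Build a knora-api:standoffLink FILTER expression (hypothetical helper)."""
    # The second argument must be a variable representing the link tag.
    if not is_variable(link_tag):
        raise ValueError("the link tag must be a variable")
    # The source and the target may each be a variable or an IRI.
    for term in (source, target):
        if not (is_variable(term) or is_iri(term)):
            raise ValueError(f"expected a variable or an IRI: {term!r}")
    return f"FILTER knora-api:standoffLink({source}, {link_tag}, {target})"
```

For instance, `standoff_link_filter("?letter", "?standoffLinkTag", "<http://rdfh.ch/0801/example-person>")` targets one specific resource instead of a variable; that IRI is made up for illustration.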

If you actually want to get the target of the link (in this example, `?person`)
in the search results, you need to add a statement like
`?letter knora-api:hasStandoffLinkTo ?person .` to the `WHERE` clause and to the
`CONSTRUCT` clause:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX beol: <http://0.0.0.0:3333/ontology/0801/beol/v2#>

CONSTRUCT {
    ?letter knora-api:isMainResource true .
    ?letter beol:hasText ?text .
    ?letter knora-api:hasStandoffLinkTo ?person .
} WHERE {
    ?letter a beol:letter .
    ?letter beol:hasText ?text .
    ?text knora-api:textValueHasStandoff ?standoffLinkTag .
    ?standoffLinkTag a knora-api:StandoffLinkTag .
    FILTER knora-api:standoffLink(?letter, ?standoffLinkTag, ?person)
    ?person a beol:person .
    ?person beol:hasIAFIdentifier ?iafIdentifier .
    ?iafIdentifier knora-api:valueAsString "(VIAF)271899510" .
    ?standoffLinkTag knora-api:standoffTagHasStartParent ?standoffItalicTag .
    ?standoffItalicTag a standoff:StandoffItalicTag .
    ?letter knora-api:hasStandoffLinkTo ?person .
}
```

#### Matching Standoff Dates

You can use the `knora-api:toSimpleDate` function (see @ref:[Date Comparisons](#date-comparisons))
to match dates in standoff date tags, i.e. instances of `knora-api:StandoffDateTag` or
of one of its subclasses. For example, here we are looking for a text containing
an `anything:StandoffEventTag` (which is a project-specific subclass of `knora-api:StandoffDateTag`)
representing an event that occurred sometime during the month of December 2016:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX anything: <http://0.0.0.0:3333/ontology/0001/anything/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>

CONSTRUCT {
    ?thing knora-api:isMainResource true .
    ?thing anything:hasText ?text .
} WHERE {
    ?thing a anything:Thing .
    ?thing anything:hasText ?text .
    ?text knora-api:textValueHasStandoff ?standoffEventTag .
    ?standoffEventTag a anything:StandoffEventTag .
    FILTER(knora-api:toSimpleDate(?standoffEventTag) = "GREGORIAN:2016-12 CE"^^knora-api-simple:Date)
}
```
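
The date literal in the `FILTER` follows the pattern calendar, date with optional month and day, then era. A client could format such literals with a small helper like the Python sketch below. The function `simple_date_literal` is hypothetical, and it only covers the shapes seen in this documentation (year-month and year-month-day precision, plus the year-only case by analogy); other calendars, eras and date ranges are assumptions left to the caller.

```python
from typing import Optional


def simple_date_literal(
    year: int,
    month: Optional[int] = None,
    day: Optional[int] = None,
    calendar: str = "GREGORIAN",
    era: str = "CE",
) -> str:
    """Format a knora-api-simple:Date literal (hypothetical helper)."""
    if day is not None and month is None:
        raise ValueError("a day requires a month")
    # Month and day are zero-padded to two digits, as in "2016-12-24".
    parts = [str(year)]
    if month is not None:
        parts.append(f"{month:02d}")
    if day is not None:
        parts.append(f"{day:02d}")
    date = "-".join(parts)
    return f'"{calendar}:{date} {era}"^^knora-api-simple:Date'
```

With this helper, `simple_date_literal(2016, 12)` yields the literal used in the example above.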

#### Matching Ancestor Tags

Suppose we want to search for a standoff date in a paragraph, but we know
that the paragraph tag might not be the immediate parent of the date tag.
For example, the date tag might be in an italics tag, which is in a paragraph
tag. In that case, we can use the inferred property
`knora-api:standoffTagHasStartAncestor`. We can modify the previous example to
do this:

```
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
PREFIX standoff: <http://api.knora.org/ontology/standoff/v2#>
PREFIX anything: <http://0.0.0.0:3333/ontology/0001/anything/v2#>
PREFIX knora-api-simple: <http://api.knora.org/ontology/knora-api/simple/v2#>

CONSTRUCT {
    ?thing knora-api:isMainResource true .
    ?thing anything:hasText ?text .
} WHERE {
    ?thing a anything:Thing .
    ?thing anything:hasText ?text .
    ?text knora-api:textValueHasStandoff ?standoffDateTag .
    ?standoffDateTag a knora-api:StandoffDateTag .
    FILTER(knora-api:toSimpleDate(?standoffDateTag) = "GREGORIAN:2016-12-24 CE"^^knora-api-simple:Date)
    ?standoffDateTag knora-api:standoffTagHasStartAncestor ?standoffParagraphTag .
    ?standoffParagraphTag a standoff:StandoffParagraphTag .
}
```
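
To make the difference between a parent and an ancestor concrete, the following Python sketch models standoff tags as a parent map and walks upwards from one tag. It assumes that `knora-api:standoffTagHasStartAncestor` behaves like the transitive closure of `knora-api:standoffTagHasStartParent`; the documentation only states that the property is inferred, so treat this as an illustration rather than a description of the inference rules. The tag names are placeholders.

```python
from typing import Dict, List


def start_ancestors(tag: str, parent_of: Dict[str, str]) -> List[str]:
    """Return all ancestors of a tag, nearest first, by following parent links."""
    ancestors: List[str] = []
    current = parent_of.get(tag)
    while current is not None:
        if current in ancestors or current == tag:
            # Guard against malformed input; tag hierarchies should not contain cycles.
            raise ValueError("cycle in parent links")
        ancestors.append(current)
        current = parent_of.get(current)
    return ancestors


# A date tag inside an italic tag, which is inside a paragraph tag.
parent_of = {"dateTag": "italicTag", "italicTag": "paragraphTag"}
```

Here `start_ancestors("dateTag", parent_of)` contains `"paragraphTag"` even though the immediate parent is `"italicTag"`, which is the situation the ancestor property is meant to cover.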

### CONSTRUCT Clause

In the `CONSTRUCT` clause of a Gravsearch query, the variable representing the
Expand Down
Original file line number Diff line number Diff line change
@@ -119,8 +119,7 @@ character:
With each character added to the last term, the selection gets more
specific. The first term should contain at least four characters. To
make this kind of "search as you type" possible, a wildcard character is
automatically added to the last search
term.
automatically added to the last search term.

HTTP GET to http://host/v2/searchbylabel/searchValue[limitToResourceClass=resourceClassIRI]
[limitToProject=projectIRI][offset=Integer]
@@ -158,16 +157,16 @@ HTTP GET to http://host/v2/search/searchValue[limitToResourceClass=resourceClass
[limitToStandoffClass=standoffClassIri][limitToProject=projectIRI][offset=Integer]
```

Please note that the first parameter has to be preceded by a question
The first parameter has to be preceded by a question
mark `?`, any following parameter by an ampersand `&`.
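
A request URL of this shape can be assembled with Python's standard library, which takes care of the `?` and `&` separators and of percent-encoding. This is a minimal sketch: the function `fulltext_search_url` and the project IRI in the usage note are hypothetical, and the parameter names are the ones listed in the route above.

```python
from urllib.parse import quote, urlencode


def fulltext_search_url(host: str, search_value: str, **params) -> str:
    """Build a full-text search URL (hypothetical helper)."""
    # Percent-encode the search value so that characters with a special
    # meaning in URLs (for example "?") survive as part of the path.
    path = f"http://{host}/v2/search/{quote(search_value, safe='')}"
    # urlencode joins parameters with "&"; the leading "?" is added here.
    query = urlencode({name: value for name, value in params.items() if value is not None})
    return f"{path}?{query}" if query else path
```

For example, `fulltext_search_url("host", "Dinge", limitToProject="http://rdfh.ch/projects/0001", offset=1)` produces a URL with two encoded parameters, while a call without optional parameters produces no query string at all.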

A search value must have a minimum length of three characters (the default value), as defined in `app/v2` in `application.conf`.

A search term may contain wildcards. A `?` represents a single character. It has to be URL-encoded as `%3F` since it has a special meaning in the URL syntax. For example, the term `Uniform` can be searched for like this:
A search term may contain wildcards. A `?` represents a single character. It has to be URL-encoded as `%3F` since it has a special meaning in the URL syntax. For example, the term `Uniform` can be searched for like this:

```
HTTP GET to http://host/v2/search/Unif%3Frm
```
```

A `*` represents zero, one or multiple characters. For example, the term `Uniform` can be searched for like this:

@@ -190,6 +189,9 @@ do a count query:
HTTP GET to http://host/v2/search/count/searchValue[limitToResourceClass=resourceClassIRI][limitToStandoffClass=standoffClassIri][limitToProject=projectIRI][offset=Integer]
```

The first parameter has to be preceded by a question
mark `?`, any following parameter by an ampersand `&`.

The response to a count query request is an object with one predicate,
`http://schema.org/numberOfItems`, with an integer value.
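
A client might read the count from such a response as in the Python sketch below. The exact serialisation is an assumption: the sketch accepts either the full predicate IRI as a key or a compact form such as `schema:numberOfItems` resolved through a JSON-LD `@context`, and it expects the value to be a plain integer. The function name `number_of_items` is hypothetical.

```python
import json

NUMBER_OF_ITEMS = "http://schema.org/numberOfItems"


def number_of_items(response_body: str) -> int:
    """Extract the item count from a count query response (hypothetical helper)."""
    document = json.loads(response_body)
    context = document.get("@context", {})
    if not isinstance(context, dict):
        context = {}
    for key, value in document.items():
        if key == "@context":
            continue
        # Expand "prefix:localName" using the context, if the prefix is defined there.
        prefix, sep, local_name = key.partition(":")
        namespace = context.get(prefix) if sep else None
        expanded = namespace + local_name if isinstance(namespace, str) else key
        if expanded == NUMBER_OF_ITEMS:
            return int(value)
    raise KeyError(NUMBER_OF_ITEMS)
```

A full JSON-LD processor would be more robust than this manual expansion; the sketch only shows where the integer is expected to be found.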

@@ -200,4 +202,4 @@ called @ref:[Gravsearch: Virtual Graph Search](query-language.md)).

### Support of TEI/XML

To convert standoff markup to TEI/XML, see @ref:[TEI/XML](tei-xml.md).
To convert standoff markup to TEI/XML, see @ref:[TEI/XML](tei-xml.md).
7 changes: 4 additions & 3 deletions docs/src/paradox/05-internals/design/triplestore-updates.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ We can assume that each SPARQL update operation will run in its own
database transaction with an isolation level of 'read committed'. This
is what GraphDB does when it receives a SPARQL update over HTTP (see
[GraphDB SE
Transactions](http://graphdb.ontotext.com/documentation/free/storage.html#transaction-control)).
Transactions](http://graphdb.ontotext.com/documentation/standard/storage.html#transaction-control)).
We cannot assume that it is possible to run more than one SPARQL update
in a single database transaction. (The [SPARQL 1.1
Protocol](http://www.w3.org/TR/sparql11-protocol/) does not provide a
@@ -256,9 +256,10 @@ is OK to add the data, and that both updates would then succeed,
inserting redundant data and possibly violating ontology constraints.
Therefore, Knora uses short-lived, application-level write locks on
resources, to ensure that only one request at a time can update a given
resource. Before each update, the application acquires a resource lock.
resource. Before each update, the application acquires a lock on a resource.
To prevent deadlocks, Knora locks only one resource per API operation.
It then does the pre-update checks and the update, then releases the
lock. The lock implementation (in `ResourceLocker`) requires each API
lock. The lock implementation (in `IriLocker`) requires each API
request message to include a random UUID, which is generated in the
@ref:[API Routing](design-overview.md#api-routing) package. Using
application-level locks allows us to do pre-update checks in their own
11 changes: 11 additions & 0 deletions docs/src/paradox/faq-fig1.dot
@@ -0,0 +1,11 @@
digraph {
rankdir = LR

node [style = filled, fontcolor = white]

Foo1 [color = navy, fillcolor = slateblue4]
Foo2 [color = navy, fillcolor = slateblue4]

Foo1 -> Foo2 [label = "hasLinkToFoo", fontsize = 11, color = cyan4]

}
15 changes: 15 additions & 0 deletions docs/src/paradox/faq-fig2.dot
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
digraph {
rankdir = LR

node [style = filled, fontcolor = white]

Foo1 [color = navy, fillcolor = slateblue4]
Foo2 [color = navy, fillcolor = slateblue4]

Foo3 [color = navy, fillcolor = slateblue4]

Foo1 -> Foo2 [label = "hasLinkToFoo", fontsize = 11, color = cyan4]
Foo2 -> Foo3 [label = "hasLinkToFoo", fontsize = 11, color = cyan4]
Foo1 -> Foo3 [label = "hasLinkToFoo", fontsize = 11, color = cyan4]

}