Gravsearch enhancements and fixes #870

Merged
merged 27 commits into from
Jun 5, 2018
33c5927
fix (Gravsearch): Allow the use of the lang function in an AND or OR.
May 31, 2018
04c6806
Merge branch 'develop' into wip/gravsearch-standoff
May 31, 2018
9ecea9d
feature (webapi): Accept Gravsearch queries as POST requests.
May 31, 2018
0b24a2f
docs (webapi): Update README.md.
May 31, 2018
cdbbf55
Merge branch 'develop' into wip/gravsearch-standoff
Jun 1, 2018
64387ad
feature (webapi): Accept Gravsearch queries where the main resource i…
Jun 1, 2018
7986eb1
feature (webapi): Undo first attempt at accepting Gravsearch query wi…
Jun 4, 2018
25e22ec
tests (gravsearch): add tests involving dcterms properties
Jun 4, 2018
f6c3abb
fix (gravsearch): Don't use IRIs from type inspector to check for pre…
Jun 4, 2018
6673e81
tests (gravsearch): add tests for link objects
Jun 4, 2018
8c4c50a
tests (gravsearch): add a test for a variable representing a property…
Jun 4, 2018
40c1fab
feature (gravsearch): Allow BIND in Gravsearch.
Jun 4, 2018
43bc7d3
test (gravsearch): Add test for BIND.
Jun 4, 2018
a9b20c0
feature (gravsearch): Use DISTINCT in GROUP_CONCAT.
Jun 4, 2018
b5e1369
tests (gravsearch): search for a person using foaf entities
Jun 4, 2018
a40c9e2
feature (gravsearch): If object of statement is variable also used in…
Jun 4, 2018
a995cda
fix (incoming links): handle case of several instances of a property
Jun 4, 2018
0e71e90
tests (gravsearch): check for several incoming links
Jun 4, 2018
a9d9700
tests (gravserach): remove comments
Jun 4, 2018
e7d4396
tests (gravsearch): add test data
Jun 4, 2018
64a1479
tests (gravsearch): add a test for multi-level references
Jun 4, 2018
bdf5361
tests (gravsearch): add test for a letter that links to a person with…
Jun 5, 2018
c1b97ed
fix (webapi): Fix parsing of FILTER in OPTIONAL (#879).
Jun 5, 2018
d1fc094
tests (gravsearch): add test with filter in an Optional clause
Jun 5, 2018
f30267d
Merge branch 'develop' into wip/gravsearch-standoff
tobiasschweizer Jun 5, 2018
eae6d2a
docs (gravsearch): Update docs and release notes.
Jun 5, 2018
52d926c
docs (gravsearch): Fix typo.
Jun 5, 2018
35 changes: 17 additions & 18 deletions README.md
@@ -2,8 +2,8 @@

# Knora

[Knora](http://www.knora.org/) (Knowledge Organization, Representation, and Annotation) is a software
framework for storing, sharing, and working with primary sources and data in the humanities.
[Knora](http://www.knora.org/) (Knowledge Organization, Representation, and Annotation) is a server
application for storing, sharing, and working with primary sources and data in the humanities.

It is developed by the [Digital Humanities Lab](http://www.dhlab.unibas.ch/) at the [University of Basel](https://www.unibas.ch/en.html), and is supported by the [Swiss Academy of Humanities and Social Sciences](http://www.sagw.ch/en/sagw.html).

@@ -17,25 +17,25 @@ Knora is [free software](http://www.gnu.org/philosophy/free-sw.en.html), release
* Offers a generic HTTP-based API, implemented in [Scala](http://www.scala-lang.org/), for querying, annotating, and linking together heterogeneous data in a unified way.
* Handles authentication and authorization.
* Provides automatic versioning of data.
* Includes [Sipi](https://github.com/dhlab-basel/Sipi), a high-performance media server implemented in C++.
* Provides a general-purpose, browser-based Virtual Research Environment called SALSAH (to be released soon).
* Uses [Sipi](http://www.sipi.io/), a high-performance media server implemented in C++.
* Designed to be used with [SALSAH](https://dhlab-basel.github.io/Salsah/), a general-purpose, browser-based virtual research environment,
as well as with custom user interfaces.

## Status

### Beta stage
### Stable

* The OWL ontologies
* API operations for querying and updating data
* API operations dealing with binary files and Sipi
* The testing framework, which includes many tests
* Integration of the SALSAH GUI
* API operations for administering Knora
* Documentation
* [Knora Ontologies](https://docs.knora.org/paradox/02-knora-ontologies/index.html)
* [Knora API v1](https://docs.knora.org/paradox/03-apis/api-v1/index.html)

### Planned
### Beta stage

* [Knora Admin API](https://docs.knora.org/paradox/03-apis/api-admin/index.html)
* Distribution packaging using [Docker](https://www.docker.com/)
* A simple GUI for creating ontologies (for now you can use an application such as [Protégé](http://protege.stanford.edu/) or [TopBraid Composer](http://www.topquadrant.com/tools/modeling-topbraid-composer-standard-edition/))

### New features under development

* [Knora API v2](https://docs.knora.org/paradox/03-apis/api-v2/index.html)

## Requirements

@@ -45,7 +45,8 @@ Knora is [free software](http://www.gnu.org/philosophy/free-sw.en.html), release
* [Java Development Kit 8](http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)
* [SBT](http://www.scala-sbt.org/)

[Ontotext GraphDB](http://ontotext.com/products/graphdb/) is recommended.
[Ontotext GraphDB](http://ontotext.com/products/graphdb/) is recommended. Support for
other RDF triplestores is planned.

### For building the documentation

@@ -129,9 +130,7 @@ Whenever you add a new feature or fix a bug, you should add one or more tests fo

### Documentation

A pull request should include tests and documentation for the changes that were made. Design and user documentation go under `docs` and are written in [reStructuredText](http://docutils.sourceforge.net/rst.html) format using the [Sphinx](http://www.sphinx-doc.org/en/stable/) documentation generator.


A pull request should include tests and documentation for the changes that were made. See the [documentation README](https://github.com/dhlab-basel/Knora/blob/develop/docs/Readme.md) for information on writing Knora documentation.

## Contact information

2 changes: 1 addition & 1 deletion docs/src/jekyll/Gemfile.lock
@@ -9,7 +9,7 @@ GEM
eventmachine (>= 0.12.9)
http_parser.rb (~> 0.6.0)
eventmachine (1.2.7)
ffi (1.9.23)
ffi (1.9.25)
forwardable-extended (2.6.0)
http_parser.rb (0.6.0)
i18n (0.9.5)
11 changes: 11 additions & 0 deletions docs/src/paradox/00-release-notes/v1.6.0.md
@@ -11,7 +11,18 @@ Required changes to existing data:
New features:
-------------

Gravsearch enhancements:

- Accept queries in POST requests (@github[#650](#650)).
- Allow a Gravsearch query to specify the IRI of the main resource (@github[#871](#871)) (by allowing `BIND`).
- Allow `lang` to be used with `!=`.

Bugfixes:
---------

Gravsearch fixes:

- Allow the `lang` function to be used in a comparison inside AND/OR (@github[#846](#846)).
- Fix the processing of resources with multiple incoming links that use the same property (@github[#878](#878)).
- Fix the parsing of a FILTER inside an OPTIONAL (@github[#879](#879)).
- Require the `match` function to be the top-level expression in a `FILTER`.
25 changes: 11 additions & 14 deletions docs/src/paradox/03-apis/api-v2/introduction.md
@@ -100,24 +100,21 @@ API server automatically converts back and forth between these internal
and external representations. This approach encapsulates the internals
and adds a layer of abstraction to them.

IRIs representing ontologies and ontology entities are different in different
schemas; see @ref:[Knora IRIs](knora-iris.md).

Some API operations inherently require the client to accept responses in
the complex schema, while others can return data in either schema. In
the latter case, the complex schema is used by default in the response,
unless the request specifically asks for the simple schema. For example,
if an ontology is requested using an IRI indicating the simple schema,
the ontology will be returned in the simple schema (see
@ref:[Querying, Creating, and Updating Ontologies](ontology-information.md)). The
client can also specify the desired schema by using an HTTP header or a
URL parameter:
the complex schema. For example, if an ontology is requested using an IRI
indicating the simple schema, the ontology will be returned in the simple schema (see
@ref:[Querying, Creating, and Updating Ontologies](ontology-information.md)).

Other API operations can return data in either schema. In this case, the
complex schema is used by default in the response, unless the request specifically
asks for the simple schema. The client can specify the desired schema by using
an HTTP header or a URL parameter:

- the HTTP header `X-Knora-Accept-Schema`
- the URL parameter `schema`

Both the HTTP header and the URL parameter accept the values `simple` or
`complex`.
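
As an illustration of the two mechanisms above, the following Python sketch builds either the header or the URL parameter. The function names and the `http://host/v2/example` route are hypothetical and chosen only for this example; only the header name, the parameter name, and the two accepted values come from the text above.

```python
from typing import Dict
from urllib.parse import urlencode

# The two values accepted by both the header and the URL parameter.
VALID_SCHEMAS = ("simple", "complex")


def check_schema(schema: str) -> str:
    """Reject anything other than 'simple' or 'complex'."""
    if schema not in VALID_SCHEMAS:
        raise ValueError(f"schema must be one of {VALID_SCHEMAS}, got {schema!r}")
    return schema


def schema_header(schema: str) -> Dict[str, str]:
    """Return the HTTP header that asks for the given response schema."""
    return {"X-Knora-Accept-Schema": check_schema(schema)}


def with_schema_param(url: str, schema: str) -> str:
    """Append the `schema` URL parameter to a request URL."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({"schema": check_schema(schema)})


# Hypothetical route, used only to show the resulting URL.
print(schema_header("simple"))
print(with_schema_param("http://host/v2/example", "simple"))
```

Either form can be attached to a request by whatever HTTP client is in use; the sketch deliberately stops short of sending anything.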

Although the Gravsearch query language
(see @ref:[Gravsearch: Virtual Graph Search](query-language.md)) requires the simple
schema to be used in the request, search results are returned in the
complex schema by default, unless the client requests the simple schema
by using the HTTP header or the URL parameter.
121 changes: 97 additions & 24 deletions docs/src/paradox/03-apis/api-v2/query-language.md
@@ -53,22 +53,57 @@ It is certainly possible to write Gravsearch queries by hand, but we expect
that in general, they will be automatically generated by client
software, e.g. by a client user interface.

### Submitting Gravsearch Queries

The recommended way to submit a Gravsearch query is via HTTP POST:

```
HTTP POST to http://host/v2/searchextended
```

This works like [query via POST directly](https://www.w3.org/TR/sparql11-protocol/#query-via-post-direct)
in the [SPARQL 1.1 Protocol](https://www.w3.org/TR/sparql11-protocol/): the query
is sent unencoded as the HTTP request message body, in the UTF-8 charset.

It is also possible to submit a Gravsearch query using HTTP GET. The entire
query must be URL-encoded and included as the last element of the URL path:

```
HTTP GET to http://host/v2/searchextended/QUERY
```

The response to a Gravsearch query is an RDF graph, which can be requested in various
formats (see @ref:[Responses Describing Resources](reading-and-searching-resources.md#responses-describing-resources)).

To request the number of results rather than the results themselves, you can
do a count query:

```
HTTP POST to http://host/v2/searchextended/count
```

The response to a count query request is an object with one predicate,
`http://schema.org/numberOfItems`, with an integer value.
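
The three request shapes described above can be sketched in Python as follows. This is a minimal illustration that only describes the requests and does not send them; the `HttpRequest` tuple, the function names, and the sample query are assumptions made for the example, and `host` is the same placeholder used in the text. Any `Content-Type` header is left out because the text does not specify one.

```python
from typing import NamedTuple
from urllib.parse import quote

BASE_URL = "http://host/v2/searchextended"  # placeholder host, as in the text


class HttpRequest(NamedTuple):
    """A plain description of a request (hypothetical helper type)."""
    method: str
    url: str
    body: bytes


def post_request(query: str, count: bool = False) -> HttpRequest:
    """POST: the query is the unencoded message body, in UTF-8."""
    url = BASE_URL + ("/count" if count else "")
    return HttpRequest("POST", url, query.encode("utf-8"))


def get_request(query: str) -> HttpRequest:
    """GET: the whole query is URL-encoded as the last path element."""
    return HttpRequest("GET", BASE_URL + "/" + quote(query, safe=""), b"")


# Sample query for illustration only.
QUERY = """PREFIX knora-api: <http://api.knora.org/ontology/knora-api/simple/v2#>
CONSTRUCT { ?book knora-api:isMainResource true . }
WHERE { ?book a knora-api:Resource . }"""

print(post_request(QUERY).url)
print(post_request(QUERY, count=True).url)
print(get_request(QUERY).url)
```

Passing `safe=""` to `quote` ensures that `/` characters inside the query (for example in IRIs) are also percent-encoded, so the query stays a single path element.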

### Main and Dependent Resources

The main resource is the top-level resource in a search result. Other
resources that are in some way connected to the main resource are
referred to as dependent resources. If the client asks for a resource A
relating to a resource B, then all matches for A will be presented as
main resources and those for B as dependent resources.
main resources and those for B as dependent resources. The main resource
must be represented by a variable, marked with `knora-api:isMainResource`,
as explained under @ref:[CONSTRUCT Clause](#construct-clause).

### Graph Patterns and Result Graphs

The WHERE clause of a Gravsearch query specifies a graph pattern. Each query
result will match this graph pattern, and will have the form of a graph
whose starting point is a main resource. The query's graph pattern, and
hence each query result graph, can span zero or more levels of relations
between resources. For example, a query could request articles by
authors who were students of a particular professor. Or authors of texts
between resources. For example, a query could request regions
in images on pages of books written by a certain author, articles by
authors who were students of a particular professor, or authors of texts
that refer to events that took place within a certain date range.

### Permission Checking
@@ -83,19 +118,21 @@ instead replaced by a proxy resource called

### Inference

Gravsearch queries are understood to imply RDFS reasoning. Depending on the
Gravsearch queries are understood to imply
[RDFS reasoning](https://www.w3.org/TR/rdf11-mt/). Depending on the
triplestore being used, this may be implemented using the triplestore's
own reasoner or by query expansion in Knora (using SPARQL
property path syntax). This means that if a statement pattern specifies
a property, the pattern will also match subproperties of that property.
own reasoner or by query expansion in Knora.

This means that if a statement pattern specifies a property, the pattern will
also match subproperties of that property, and if a statement specifies that
a subject has a particular `rdf:type`, the statement will also match subjects
belonging to subclasses of that type.
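
The effect of this inference can be illustrated with a toy Python sketch. It is not how Knora or a triplestore implements reasoning; it only shows which classes (or properties) a pattern ends up matching once the subclass (or subproperty) hierarchy is taken into account. The `ex:` names are hypothetical.

```python
from typing import Dict, Set


def with_descendants(parent_of: Dict[str, str], root: str) -> Set[str]:
    """Return `root` plus everything that is transitively below it.

    `parent_of` maps each class (or property) to its direct superclass
    (or superproperty).
    """
    result = {root}
    changed = True
    while changed:  # repeat until no new descendant is found
        changed = False
        for child, parent in parent_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


# Hypothetical class hierarchy: Postcard < Letter < Document.
SUBCLASS_OF = {"ex:Letter": "ex:Document", "ex:Postcard": "ex:Letter"}

# A pattern `?x rdf:type ex:Document` also matches letters and postcards.
print(sorted(with_descendants(SUBCLASS_OF, "ex:Document")))
```

The same closure applies to properties: a statement pattern using a property matches statements that use any of its subproperties.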

### API Schema

A Gravsearch query must be written using the Knora API simple schema (see
@ref:[Querying, Creating, and Updating Ontologies](ontology-information.md)).
However, results can be returned in the simple or complex schema. (The
ability to choose the response schema is not yet implemented, so for
now, results are always returned in the complex schema.)
A Gravsearch query must be written using the Knora API simple schema, but
results can be returned in the simple or complex schema; see
@ref:[API Schema](introduction.md#api-schema).

## Gravsearch Syntax

@@ -132,10 +169,17 @@ clauses use the following patterns, with the specified restrictions:
unordered set of triples. However, a Gravsearch query returns an
ordered list of resources, which can be ordered by the values of
specified properties.
- `BIND`: The value assigned must be a Knora data IRI.

#### Resources

Resources can be represented by an IRI or a query variable.
Resources can be represented either by an IRI or by a variable, except for the
main resource, which must be represented by a variable.

It is possible to do a Gravsearch query in which the IRI of the main resource
is already known, e.g. to request specific information about that resource and
perhaps about linked resources. In this case, the IRI of the main resource must
be assigned to a variable using `BIND`.
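
A query of this kind might be generated as in the following Python sketch. The function name, the variable name `?resource`, and the sample IRI are hypothetical; the prefix IRI for the simple schema is an assumption that should be checked against the Knora API documentation.

```python
# Assumed prefix IRI for knora-api in the simple schema.
KNORA_API_SIMPLE = "http://api.knora.org/ontology/knora-api/simple/v2#"

# Characters that cannot appear inside <...> in a SPARQL query.
FORBIDDEN_IN_IRI = set('<>"{}|\\^` ')


def main_resource_query(resource_iri: str) -> str:
    """Build a Gravsearch query whose main resource has a known IRI."""
    if FORBIDDEN_IN_IRI & set(resource_iri):
        raise ValueError(f"not usable as an IRI in a query: {resource_iri!r}")
    return f"""PREFIX knora-api: <{KNORA_API_SIMPLE}>

CONSTRUCT {{
    ?resource knora-api:isMainResource true .
}} WHERE {{
    BIND(<{resource_iri}> AS ?resource)
    ?resource a knora-api:Resource .
}}
"""


# Hypothetical data IRI, for illustration only.
print(main_resource_query("http://rdfh.ch/0001/example"))
```

The main resource is still represented by a variable, as required; `BIND` merely fixes that variable to the known IRI. Further statement patterns about `?resource` and its linked resources can be added to both clauses.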

#### Properties

@@ -150,6 +194,31 @@ currently not supported as the objects of statement patterns in the
query. To restrict a value, a FILTER must be used. Without a FILTER, all
the instances of a value are returned.

#### Functions in FILTER Expressions

The function `knora-api:match` searches for matching words anywhere in a
text value, and is implemented using a full-text search index if available.
The words to be matched are separated by spaces in a string literal.
For example, to search for titles that contain the words 'Zeitglöcklein' and
'Lebens':

```
FILTER knora-api:match(?title, "Zeitglöcklein Lebens")
```

If `knora-api:match` is used in a `FILTER`, it must be the only expression in
the `FILTER`.

To filter a text value by language, use the SPARQL `lang` function,
e.g.:

```
FILTER(lang(?text) = "fr")
```

The [SPARQL `regex` function](https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#func-regex)
is also supported.
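
Since queries are expected to be generated by client software, a client might build these `FILTER` expressions with small helpers such as the ones sketched below in Python. The helper names are hypothetical, and the escaping covers only backslashes and double quotes.

```python
from typing import List


def sparql_string(text: str) -> str:
    """Quote text as a SPARQL string literal (escapes \\ and " only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def match_filter(variable: str, words: List[str]) -> str:
    """FILTER using knora-api:match; the words are separated by spaces."""
    literal = sparql_string(" ".join(words))
    return f"FILTER knora-api:match({variable}, {literal})"


def lang_filter(variable: str, language: str) -> str:
    """FILTER restricting a text value to one language."""
    return f"FILTER(lang({variable}) = {sparql_string(language)})"


print(match_filter("?title", ["Zeitglöcklein", "Lebens"]))
print(lang_filter("?text", "fr"))
```

Because `knora-api:match` must be the only expression in its `FILTER`, the two helpers produce separate `FILTER` statements rather than one combined expression.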

#### Required Type Annotations

Resources, properties, and values must be accompanied by explicit type
@@ -162,6 +231,12 @@ There are two type annotation properties:
that a property points to.
- `rdf:type`: indicates the type of a resource or value.

#### Resource Classes

Each variable representing a resource must be annotated with
`rdf:type knora-api:Resource`. To restrict the types of resources, additional
statements can be made using `rdf:type`.

#### Property Types

A property may point either to a value or to a resource. In the first
@@ -183,11 +258,7 @@ Supported value property types:

##### Linking Property Types

A linking property has to be annotated with the type
`knora-api:Resource`. Since inference is assumed, this matches any
resource. To restrict the types of resources, additional statements can
be made using `rdfs:type`. The linking property can also be restricted
using a FILTER in case a query variable is used.
A linking property must be annotated with `knora-api:objectType knora-api:Resource`.

#### Value Types

@@ -214,16 +285,18 @@ values. Supported value types in FILTERs:
### CONSTRUCT Clause

The `CONSTRUCT` clause specifies which information the response should
return. The CONSTRUCT clause must contain at least one statement,
specifying `knora-api:isMainResource`. Any other statements in the
`CONSTRUCT` clause must also be present in the WHERE clause.
return. The CONSTRUCT clause must contain at least one statement whose subject
is a variable, whose predicate is `knora-api:isMainResource`, and whose object
is `true`. Any other statements in the `CONSTRUCT` clause must also be present
in the WHERE clause.

The `rdfs:label` of each matching resource is always returned, so there is no
need to mention it in the query.

#### Marking the Main Resource

In the `CONSTRUCT` clause of a Gravsearch query, the variable representing the
main resource that the user is interested in must be indicated with
`knora-api:isMainResource true`. Exactly one variable representing a
resource must be marked in this way.
main resource must be indicated with `knora-api:isMainResource true`. Exactly
one variable representing a resource must be marked in this way.
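
A client that generates queries could check this rule before submitting them, for instance with a naive Python sketch like the one below. It does not parse SPARQL; it only counts textual occurrences of the marker, so it can be misled by comments or string literals. The function name and sample clauses are hypothetical.

```python
import re

# Matches the marker wherever it appears textually in the clause.
MAIN_RESOURCE_MARKER = re.compile(r"knora-api:isMainResource\s+true\b")


def has_single_main_resource(construct_clause: str) -> bool:
    """True if the marker occurs exactly once in the CONSTRUCT clause."""
    return len(MAIN_RESOURCE_MARKER.findall(construct_clause)) == 1


GOOD = "CONSTRUCT { ?book knora-api:isMainResource true . }"
BAD = (
    "CONSTRUCT { ?book knora-api:isMainResource true . "
    "?page knora-api:isMainResource true . }"
)

print(has_single_main_resource(GOOD))
print(has_single_main_resource(BAD))
```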

## Gravsearch by Example
