Implement smart IRIs (#662)
* refactor (webapi): Start refactoring StringFormatter to clarify use cases.

* fix (webapi): Fix compile errors.

* refactor (webapi): Add SmartIri class to simplify IRI manipulation.

* refactor (webapi): Separate IRI conversions from generation of JSON-LD in ontology classes.

- Use SmartIri in constant ontologies, too.
- Fix a lot of compile errors (still many left).

* refactor (webapi): Use SmartIri for ontology and entity IRIs in API v2.

- Separate ontology info schema conversion from JSON-LD generation.

* feature (webapi): Cache SmartIri instances for better performance.

* refactor (webapi): Simplify IRI parsing, removing regular expressions.

- Remove IRI caching, because performance is good anyway.

* fix (webapi): Fix bugs in IRI schema conversion.

- Put back IRI caching, but only for Knora definition IRIs.
- I think OntologyV2R2RSpec test data is wrong (has DatatypeProperty where it should have ObjectProperty).

* fix (webapi): Fix more bugs in API schema conversions and in test data.

* fix (webapi): Fix more parsing bugs and broken tests.

* feature (webapi): Fix error-checking of IRIs in SearchParserV2.

- Cache some known non-Knora definition IRIs.

* refactor (webapi): Use SmartIri for resource classes in search messages.

* feature (webapi): Require API v2 simple IRIs in KnarQL.

* fix (webapi): Allow non-Knora IRIs in KnarQL, but not ApiV2WithValueObjects.

* feature (webapi): Data IRIs don't get an ontology schema.

- Don't parse new smart IRIs when it's not necessary to do so.

* test (webapi): Start making tests more specific for StringFormatter.

* test (webapi): Improve SmartIri error checking and tests.

* feature (webapi): Return ontology IRI and labels.

* refactor (remove unnecessary code): remove dependent resource vars as order by criteria (deprecated)

* feature (webapi): Support pattern matching with SmartIri.

- Disable validation of ontology names until #667 is resolved.

* fix (webapi): Fix various bugs related to IRI conversions.

* docs (webapi): Add some design documentation about API v2, including SmartIri.

- Remove old API v2 plans doc, which has already been implemented and superseded.
- Add route for getting ontology metadata per project.
- Add R2R tests for getting ontology metadata.
- Rename "namedgraph" to "allentities" in ontology routes.
- Add KnoraContentV2 trait with toOntologySchema method.

* tests (KnarQL): get pages of a book ordered by their seqnum

* test (KnarQL R2R): add JSON-LD test data for test

* test (KnarQL R2R): add tests to get the next OFFSET

* tests (KnarQL R2R): add JSON-LD test data

* fix (KnarQL query): always insert statement for start JDN since it may be used for sorting

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* tests (KnarQL): add JSON-LD test data

* docs (webapi): Fix typos.
Benjamin Geer authored Nov 20, 2017
1 parent 607d225 commit cc53ef8
Showing 82 changed files with 6,604 additions and 2,875 deletions.
84 changes: 64 additions & 20 deletions docs/rst/knora-api-server/api_v2/knora-iris.rst
@@ -28,19 +28,28 @@ The IRIs used in Knora repositories and in the Knora API v2 follow certain conve
Project Codes
-------------

A project code is a hexadecimal number of at least four digits, assigned by the DaSCH_ to uniquely
identify a Knora project regardless of where it is hosted. Project codes are currently optional. It
is recommended that new projects request a project code and use it in their ontology IRIs, to avoid
possible future naming conflicts.

The range of project codes from ``0000`` to ``00FF`` inclusive is reserved for local testing. Thus,
the first project code available for real projects is ``0100``.
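
The rules above can be checked mechanically. The following is a minimal sketch in Python, not
Knora's own implementation. The function names ``is_valid_project_code`` and
``is_reserved_for_testing`` are hypothetical, and two points are assumptions that the text does not
settle: lowercase hexadecimal digits are accepted, and codes longer than four digits are compared
numerically against the reserved range.

```python
import re

# At least four hexadecimal digits, as described above.
# Assumption: lowercase digits are accepted as well as uppercase ones.
_PROJECT_CODE = re.compile(r"[0-9A-Fa-f]{4,}")


def is_valid_project_code(code: str) -> bool:
    """Return True if `code` consists of at least four hexadecimal digits."""
    return _PROJECT_CODE.fullmatch(code) is not None


def is_reserved_for_testing(code: str) -> bool:
    """Return True if `code` falls in the range 0000 to 00FF inclusive.

    Assumption: the comparison is numeric, so longer codes with the same
    value (for example 000000FF) are treated as reserved too.
    """
    if not is_valid_project_code(code):
        raise ValueError(f"Not a valid project code: {code!r}")
    return int(code, 16) <= 0x00FF
```

With these definitions, ``00FF`` is reported as reserved and ``0100`` is not.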

IRIs for Ontologies and Ontology Entities
-----------------------------------------

Internal Ontology IRIs
^^^^^^^^^^^^^^^^^^^^^^

Starting with Knora API v2, Knora makes a distinction between internal and external ontologies.
Internal ontologies are used in the triplestore, while external ontologies are used in the API. For
each internal ontology, there is a corresponding external ontology. Some internal ontologies are
built into Knora, while others are project-specific. The Knora API server automatically generates
external ontologies based on project-specific internal ontologies.

Each internal ontology has an IRI, which is also the IRI of the named graph that contains the
ontology in the triplestore. An internal project-specific ontology IRI has the form:

::

@@ -52,14 +61,17 @@ For example, the ontology IRI based on project code ``0001`` and ontology name `

    http://www.knora.org/ontology/0001/example

An ontology name must be a valid XML NCName_. The following names are reserved for built-in internal
Knora ontologies:

- ``knora-base``
- ``standoff``
- ``salsah-gui``
- ``dc``

Names starting with ``knora`` are reserved for future built-in Knora ontologies. A project-specific
ontology name may not start with the letter ``v`` followed by a digit, and may not contain these
reserved words:

- ``knora``
- ``ontology``
@@ -68,43 +80,66 @@ Names starting with ``knora`` are reserved for future built-in Knora ontologies.
External Ontology IRIs
^^^^^^^^^^^^^^^^^^^^^^

Unlike internal ontology IRIs, external ontology IRIs are meant to be dereferenced as URLs. When an
ontology IRI is dereferenced, the ontology itself can be served either in a machine-readable format
or as human-readable documentation.

The IRI of an external Knora ontology has the form:

::

    http://HOST[:PORT]/ontology/[PROJECT_CODE/]ONTOLOGY_NAME/API_VERSION

For built-in ontologies, the host is always ``api.knora.org``. Otherwise, the hostname and port
configured in ``application.conf`` under ``app.http.knora-api.host`` and
``app.http.knora-api.http-port`` are used (the port is omitted if it is 80).

This means that when a built-in external ontology IRI is dereferenced, the ontology can be served by
a Knora API server running at ``api.knora.org``. When a project-specific external ontology IRI is
dereferenced, the ontology can be served by the Knora API server that hosts the project. During
development and testing, this could be ``localhost``.

The name of an external ontology is the same as the name of the corresponding internal ontology,
with one exception: the external form of ``knora-base`` is called ``knora-api``.

The API version identifier indicates not only the version of the API, but also an API 'schema'. The
Knora API v2 is available in two schemas:

- A default schema, which is suitable both for reading and for editing data. The default schema
  represents values primarily as complex objects. Its version identifier is ``v2``.

- A simple schema, which is suitable for reading data but not for editing it. The simple schema
  facilitates interoperability between Knora ontologies and non-Knora ontologies, since it
  represents values primarily as literals. Its version identifier is ``simple/v2``.

Other schemas could be added in the future for more specific use cases.

When requesting an ontology, the client requests a particular schema. (This will also be true of
most Knora API v2 requests: the client will be able to specify which schema the response should be
provided in.)

For example, suppose a Knora API server is running at ``knora.example.org`` and hosts an ontology
whose internal IRI is ``http://www.knora.org/ontology/0001/example``. That ontology can then be
requested using either of these IRIs:

- ``http://knora.example.org/ontology/0001/example/v2`` (for the default schema)
- ``http://knora.example.org/ontology/0001/example/simple/v2`` (for the simple schema)

While the internal ``example`` ontology refers to definitions in ``knora-base``, the external
``example`` ontology that is served by the API refers instead to a ``knora-api`` ontology, whose IRI
depends on the schema being used:

- ``http://api.knora.org/ontology/knora-api/v2`` (for the default schema)
- ``http://api.knora.org/ontology/knora-api/simple/v2`` (for the simple schema)
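
The mapping from an internal ontology IRI to its external counterpart can be sketched as follows.
This is an illustration in Python, not the ``SmartIri`` implementation that this commit adds. The
function ``to_external_ontology_iri`` and the constants are hypothetical names, and the sketch
assumes that a built-in internal ontology IRI has no project code (for example
``http://www.knora.org/ontology/knora-base``).

```python
INTERNAL_PREFIX = "http://www.knora.org/ontology/"
BUILT_IN_HOST = "api.knora.org"

# Built-in internal ontology names listed earlier in this document.
BUILT_IN_NAMES = {"knora-base", "standoff", "salsah-gui", "dc"}

# API version identifiers for the two schemas described above.
SCHEMA_SUFFIX = {"default": "v2", "simple": "simple/v2"}


def to_external_ontology_iri(internal_iri: str, host: str,
                             port: int = 80, schema: str = "default") -> str:
    """Convert an internal ontology IRI to an external one (illustrative only)."""
    if not internal_iri.startswith(INTERNAL_PREFIX):
        raise ValueError(f"Not an internal ontology IRI: {internal_iri!r}")
    if schema not in SCHEMA_SUFFIX:
        raise ValueError(f"Unknown schema: {schema!r}")

    # What remains is "[PROJECT_CODE/]ONTOLOGY_NAME".
    parts = internal_iri[len(INTERNAL_PREFIX):].split("/")
    if len(parts) not in (1, 2) or not all(parts):
        raise ValueError(f"Unexpected ontology path in {internal_iri!r}")

    name = parts[-1]
    if len(parts) == 1 and name in BUILT_IN_NAMES:
        # Built-in ontologies are always served from api.knora.org.
        authority = BUILT_IN_HOST
        if name == "knora-base":
            parts[-1] = "knora-api"  # the one name that differs externally
    else:
        # Project-specific ontologies use the configured host; port 80 is omitted.
        authority = host if port == 80 else f"{host}:{port}"

    return f"http://{authority}/ontology/{'/'.join(parts)}/{SCHEMA_SUFFIX[schema]}"
```

Called with ``http://www.knora.org/ontology/0001/example`` and the host ``knora.example.org``, the
function produces the two ``example`` IRIs listed above, depending on the ``schema`` argument.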

Ontology Entity IRIs
^^^^^^^^^^^^^^^^^^^^

Knora ontologies use 'hash namespaces' (see `URI Namespaces`_). This means that the IRI of an
ontology entity (a class or property definition) is constructed by adding a hash character (``#``)
to the ontology IRI, followed by the name of the entity. In Knora, an entity name must be a valid
XML NCName_. Thus, if there is a class called ``ExampleThing`` in an ontology whose internal IRI is
``http://www.knora.org/ontology/0001/example``, that class has the following IRIs:

- ``http://www.knora.org/ontology/0001/example#ExampleThing`` (in the internal ontology)
- ``http://HOST[:PORT]/ontology/0001/example/v2#ExampleThing`` (in the API v2 default schema)
@@ -113,14 +148,23 @@ Knora ontologies use 'hash namespaces' (see `URI Namespaces`_). This means that
IRIs for Data
-------------

Knora generates IRIs for data that it creates in the triplestore. Each generated data IRI contains
one or more UUID_ identifiers to make it unique. To keep data IRIs relatively short, each UUID is
Base64_ encoded, using the 'URL and Filename safe Base64 Alphabet' specified in Table 2 of RFC 4648,
without padding; thus each UUID is a 22-character string.
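
To see why the result is 22 characters long: a UUID is 16 bytes (128 bits), and Base64 encodes 6
bits per character, so 22 characters are needed once the two padding characters are dropped. The
following Python sketch uses only the standard library; the function names ``encode_uuid`` and
``decode_uuid`` are hypothetical and are not taken from the Knora code base.

```python
import base64
import uuid


def encode_uuid(u: uuid.UUID) -> str:
    """Encode a UUID with the URL-safe Base64 alphabet, without padding."""
    # 16 bytes encode to 24 characters, the last two of which are '=' padding.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def decode_uuid(encoded: str) -> uuid.UUID:
    """Reverse encode_uuid by restoring the two padding characters."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(encoded + "=="))


if __name__ == "__main__":
    identifier = uuid.uuid4()
    encoded = encode_uuid(identifier)
    print(encoded, len(encoded))
```

Because the URL-safe alphabet replaces ``+`` and ``/`` with ``-`` and ``_``, the encoded string can
be placed in an IRI path segment without further escaping.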

Data IRIs are not currently intended to be dereferenced as URLs. Instead, each Knora resource will
have a corresponding ARK_ URL, which will be handled by a server that redirects requests to the
relevant Knora API server (see :ref:`permalinks`). However, every generated data IRI begins with
``http://rdfh.ch``. This domain is not currently used, but it is owned by the DaSCH_, so it would be
possible to make resource IRIs directly dereferenceable in the future.

The formats of generated data IRIs for different types of objects are as follows:

- Project: ``http://rdfh.ch/projects/PROJECT_UUID``
- Resource: ``http://rdfh.ch/PROJECT_CODE/RESOURCE_UUID``. The current implementation actually uses
  the project shortname, but it will be changed to use the project code
  (`issue #654 <https://github.com/dhlab-basel/Knora/issues/654>`_).
- Value: ``http://rdfh.ch/PROJECT_CODE/RESOURCE_UUID/values/VALUE_UUID``
- Standoff tag: ``http://rdfh.ch/PROJECT_CODE/RESOURCE_UUID/values/VALUE_UUID/STANDOFF_UUID``
- Group: ``http://rdfh.ch/groups/GROUP_UUID``
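
The nesting of resource, value and standoff tag IRIs can be illustrated with a short Python sketch.
It is only an illustration of the formats listed here, not Knora's generator: the function names
are hypothetical, and the sketch uses the project code in resource IRIs, as the documentation
describes, rather than the project shortname that the current implementation uses.

```python
import base64
import uuid

DATA_IRI_BASE = "http://rdfh.ch"


def new_id() -> str:
    """Return a random UUID, Base64 encoded (URL-safe alphabet, no padding)."""
    return base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=").decode("ascii")


def resource_iri(project_code: str) -> str:
    # http://rdfh.ch/PROJECT_CODE/RESOURCE_UUID
    return f"{DATA_IRI_BASE}/{project_code}/{new_id()}"


def value_iri(resource: str) -> str:
    # A value IRI extends the IRI of the resource it belongs to.
    return f"{resource}/values/{new_id()}"


def standoff_tag_iri(value: str) -> str:
    # A standoff tag IRI extends the IRI of the value it belongs to.
    return f"{value}/{new_id()}"
```

Each IRI is a prefix of the IRIs of the objects nested inside it, so the resource that owns a value
or standoff tag can be read directly from the IRI.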