Add ranged properties #452

Closed
Changes from 17 commits
35 commits
c18cf0c
Intermediate state adding ranged properties.
JPBergsma Dec 31, 2022
7dc50f2
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Jan 3, 2023
6afcf9c
first draft for ranged properties.
JPBergsma Jan 6, 2023
1a55c2a
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Jan 6, 2023
b597a67
Removed average, set, min and max fields for now as these become quit…
JPBergsma Jan 9, 2023
37db878
Small corrections.
JPBergsma Jan 9, 2023
a832751
Added how to treat missing values for requested range.
JPBergsma Jan 11, 2023
16aba07
changed description field
JPBergsma Jan 12, 2023
b1d69a8
Apply suggestions from code review
JPBergsma Jan 12, 2023
edbfc25
Merge branch 'Materials-Consortia:develop' into JPBergsma/Add_ranged_…
JPBergsma Jan 12, 2023
14de45d
Apply suggestions from code review Vaitkus
JPBergsma Jan 17, 2023
906db81
intermediate state from implementing code review.
JPBergsma Jan 17, 2023
1feb4a9
Merge branch 'JPBergsma/Add_ranged_properties' of https://github.com/…
JPBergsma Jan 17, 2023
a96dffe
Processed comments rartino.
JPBergsma Jan 18, 2023
15f599c
Small corrections.
JPBergsma Feb 15, 2023
73905dc
Added returned range property.
JPBergsma Feb 15, 2023
c6834f3
Added extra explanation values field.
JPBergsma Feb 16, 2023
b0cc94c
Apply suggestions from code review
JPBergsma Feb 17, 2023
0cee1e6
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Mar 6, 2023
d7c8a9c
Processed comments rickard and a few more small improvements.
JPBergsma Mar 9, 2023
d1e8d74
Further changes after proof reading.
JPBergsma Mar 9, 2023
139c70e
further refinements.
JPBergsma Mar 9, 2023
916d6f2
Changed 'n_' to 'n' for ranged metadata properties tio be consistent …
JPBergsma Mar 13, 2023
ee3651e
Changed wording of range_id field after suggestion Rickard.
JPBergsma Mar 13, 2023
1f794b6
placed subsequent sentences on seperate lines.
JPBergsma Mar 16, 2023
169a1f4
Processed points discussed with Rickard.
JPBergsma Mar 24, 2023
f513596
Added per entry next field + small corrections.
JPBergsma Mar 28, 2023
65b8ad1
Apply suggestions from code review
JPBergsma May 2, 2023
94db38c
Adjusted the description next fields for ranged properties.
JPBergsma May 2, 2023
033ea11
Corrected range name for _exmpl_ranged_thermostat.
JPBergsma May 25, 2023
3762d30
Added that querying on properties in the range dictionary is optional.
JPBergsma May 30, 2023
8332567
Specifically mention that support for queries directly on the values …
JPBergsma May 31, 2023
a87b301
Updated example to latest version metadata proposal and added more ex…
JPBergsma Jun 2, 2023
7e9b4f4
Improved explanation returned_range field.
JPBergsma Jun 2, 2023
4b298c5
Small corrections.
JPBergsma Jun 2, 2023
174 changes: 174 additions & 0 deletions optimade.rst
Expand Up @@ -21,6 +21,7 @@ OPTIMADE API specification v1.2.0~develop

entry : names of type of resources, served via OPTIMADE, pertaining to data in a database.
property : data item that belongs to an entry.
ranged_property : A property that can be returned in pieces and that supports slicing.
val : value examples that properties can be.
:val: is ONLY used when referencing values of actual properties, i.e., information that belongs to the database.
type : data type of values.
Expand Down Expand Up @@ -67,6 +68,8 @@ OPTIMADE API specification v1.2.0~develop

.. role:: property(literal)

.. role:: ranged-property(literal)

.. role:: val(literal)

.. role:: type(literal)
Expand Down Expand Up @@ -442,6 +445,106 @@ For example, the following query can be sent to API implementations `exmpl1` and

:filter:`filter=_exmpl1_band_gap<2.0 OR _exmpl2_band_gap<2.5`


Ranged Properties
-----------------

Ranged properties support slicing, so a client can request that only some of the values be returned.
Likewise, the server can use slicing to reduce the size of a response and return the property in multiple parts.
This is useful for entries or properties that are too large to return conveniently in a single response.
If an entry is too large to be returned in a single response, a link is provided, as described in `JSON Response Schema: Common Fields`_ under the `links.next` field, from which the remainder of the requested data can be retrieved.
Ranged properties in the same entry can have separate, independent ranges, or they can share a range so that they are "correlated".
For example, if an energy and a set of particle positions have the same index, this energy belongs to those particle positions.
By default, only the metadata is returned; the data is returned only when specifically requested via the :query-param:`property_ranges` query parameter, as described under `Entry Listing URL Query Parameters`_.

- **Requirements/Conventions**:

- **Support**: OPTIONAL support in implementations.
- Ranged properties can be identified by the prefix "_ranged_". If it is a database-specific field, the database prefix comes first.
- If the part of the property name after the "_ranged_" prefix matches the name of an OPTIMADE field for the entry point, the values in the :property:`values` list MUST follow the rules of that field.
- By default, only the metadata SHOULD be returned (i.e. all the fields except :property:`values` and :property:`indexes`).
The :property:`values` and :property:`indexes` fields SHOULD only be returned when requested via the :query-param:`property_ranges` query parameter, as described under `Entry Listing URL Query Parameters`_.

- **Query**: Queries on the dictionary fields SHOULD be supported, except for the :property:`values` and :property:`indexes` fields, for which querying is OPTIONAL.

The dictionary MUST include these fields:

- :field:`serialization_format`: string.
To keep the data compact, there are several ways to indicate which index a value belongs to.
The string MUST take one of the following values:

- `linear`: The value is a linear function of the indexes.
This function is defined by :property:`offset_linear` and :property:`step_size_linear`.
- `regular`: A value is defined for one out of every :property:`step_size_regular` indexes, with :property:`offset_regular` indicating the index of the first value.
- `custom`: A separate list with indexes is defined in the field :property:`indexes` to indicate to which index each value belongs.

- :field:`n_dim`: integer.
The number of dimensions this property has.

- :field:`dim_size`: list of integers.
The size of the range in each dimension.


Depending on the value of the :property:`serialization_format`, the following fields MUST be present or SHOULD NOT be present in the dictionary:

- :field:`nvalues`: integer.
The number of values in the field :property:`values`.
This field MUST be present when :property:`serialization_format` is not set to :val:`"linear"`; otherwise, it SHOULD NOT be present.

- :field:`step_size_linear`: list of floats.
If :property:`serialization_format` is set to :val:`"linear"`, this value gives the change in the value of the property per step along each of the dimensions of the range.
For example, if :property:`offset_linear` = 0.5 and :property:`step_size_linear` = [0.2, 0.3], then at index [3, 4] the value of the property is 0.5 + 0.2 * (3 - 1) + 0.3 * (4 - 1) = 1.8.
The value MUST be present when :property:`serialization_format` is set to :val:`"linear"`.
Otherwise, it SHOULD NOT be present.

- :field:`step_size_regular`: list of integers.
If :property:`serialization_format` is set to :val:`"regular"`, this value indicates that a value is defined for one out of every :property:`step_size_regular` steps in each dimension.
The value MUST be present when :property:`serialization_format` is set to :val:`"regular"`.
Otherwise, it SHOULD NOT be present.


Depending on the value of the :property:`serialization_format`, the following fields MAY be present or SHOULD NOT be present in the dictionary:

- :field:`offset_linear`: float.
If :property:`serialization_format` is set to :val:`"linear"`, this property gives the value at the origin, i.e. where the index in all dimensions is 1.
The value MAY be present when :property:`serialization_format` is set to :val:`"linear"`, otherwise the value SHOULD NOT be present.
The default value is 0.

- :field:`offset_regular`: list of integers.
If :property:`serialization_format` is set to :val:`"regular"`, this property gives the indexes of the first value.
The value MAY be present when :property:`serialization_format` is set to :val:`"regular"`, otherwise the value SHOULD NOT be present.
The default value is 1 in every dimension.
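
The `linear` serialization format described above can be illustrated with a minimal Python sketch.
It assumes that the origin is the index 1 in every dimension, as stated for :field:`offset_linear`; the function name `linear_value` is hypothetical and not part of the specification.

```python
from typing import Sequence


def linear_value(index: Sequence[int],
                 step_size_linear: Sequence[float],
                 offset_linear: float = 0.0) -> float:
    """Value of a `linear` ranged property at a 1-based index (illustrative only)."""
    if len(index) != len(step_size_linear):
        raise ValueError("index and step_size_linear must have the same length")
    # The origin is index 1 in every dimension, so each dimension
    # contributes (index - 1) steps on top of the offset.
    return offset_linear + sum(
        step * (i - 1) for i, step in zip(index, step_size_linear)
    )


# Worked example from the step_size_linear description:
# offset 0.5, step sizes [0.2, 0.3], index [3, 4].
value = linear_value([3, 4], [0.2, 0.3], offset_linear=0.5)
print(round(value, 10))  # approximately 1.8, up to floating-point rounding
```

Under this reading, a client never needs the :property:`values` field for a `linear` property, since every value can be computed from the metadata alone.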


The dictionary MAY include these fields:

- :field:`range_ids`: list of strings.
A list with an identifier for each dimension of the range.
Dimensions of different ranged properties that share an identifier are correlated.
For example, when data of an MD trajectory is shared, it can be used to indicate which energy belongs to which set of cartesian_site_positions, i.e. that the energies and the cartesian_site_positions at the same index in a certain dimension are correlated.


If a ranged property has been included in the query parameter :query-param:`property_ranges`, the following properties MUST be present or SHOULD NOT be present, depending on the value of the :property:`serialization_format`.

- :field:`values`: List of Any.
The values belonging to this property.
The range for which these values are returned is specified by the :query-param:`property_ranges` as described in `Entry Listing URL Query Parameters`_.
The format of this field depends on the property for which data is stored.
The property :property:`values` MUST be present when :property:`serialization_format` is not set to :val:`"linear"`.

- :field:`indexes`: List of lists of integers.
If :property:`serialization_format` is set to :val:`"custom"`, this field holds the indexes to which the values in the :property:`values` field belong.
The value MUST be present when :property:`serialization_format` is set to :val:`"custom"`.
Otherwise, it SHOULD NOT be present.
The order of the indexes must be the same as the order of the corresponding values in :property:`values`.

- :field:`returned_range`: List of lists of integers.
The range belonging to the returned data.
It uses the same format as the :query-param:`property_ranges` query parameter.
It consists of a list which, for each dimension, contains a list of three values.
The first value indicates the index, in that dimension, of the first value that has been returned.
The second value indicates the index of the last returned value.
The third value is the step size.
It is only returned when :property:`serialization_format` is not :val:`"linear"`.
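
As a rough illustration of the :field:`returned_range` format, the following Python sketch expands each `[first, last, step]` triple into the explicit list of 1-based indexes it covers.
The helper name `expand_range` is hypothetical, and the sketch assumes inclusive end indexes as described for the :query-param:`property_ranges` query parameter.

```python
from typing import List, Sequence


def expand_range(returned_range: Sequence[Sequence[int]]) -> List[List[int]]:
    """Expand [first, last, step] triples into explicit 1-based index lists."""
    expanded = []
    for first, last, step in returned_range:
        # The range is inclusive, so the end of Python's half-open range is last + 1.
        expanded.append(list(range(first, last + 1, step)))
    return expanded


indexes_per_dimension = expand_range([[1, 100, 2], [1, 3, 1], [1, 3, 1]])
print(len(indexes_per_dimension[0]))  # 50 indexes: 1, 3, ..., 99
print(indexes_per_dimension[1])       # [1, 2, 3]
```

A client could use such an expansion to work out which indexes a partial response covers before following a `next` link for the remainder.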


Responses
=========

Expand Down Expand Up @@ -697,6 +800,57 @@ An example of a full response:
]
}


- Several examples of how ranged properties can be returned in the JSON format:

.. code:: jsonc

{
"_ranged_cartesian_site_positions": {
"n_dim": 3,
"dim_size": [100, 3, 3],
"range_ids": ["mdsteps","particles","xyz"],
"serialization_format": "regular",
"offset_regular": [1, 1, 1],
"step_size_regular": [1, 1, 1],
"nvalues": 900,
"values": [[[2.36, 5.36, 9.56],[7.24, 3.58, 0.56],[8.12, 6.95, 4.56]],
[[2.38, 5.37, 9.56],[7.24, 3.57, 0.58],[8.11, 6.93, 4.58]],
[[2.39, 5.38, 9.55],[7.23, 3.57, 0.59],[8.10, 6.93, 4.57]]
// ...
],
"returned_range": [[1,100,2],[1,3,1],[1,3,1]]
},
"_ranged_species_at_sites": {
"n_dim": 1,
"dim_size": [3],
"range_ids": ["particles"],
"serialization_format": "regular",
"offset_regular": [1],
"step_size_regular": [1],
"nvalues": 3,
"values": ["He", "Ne", "Ar"],
"returned_range": [[1,3,1]]
},
"_exmpl_ranged_time":{
"n_dim": 1,
"dim_size": [100],
"range_ids": ["mdsteps"],
"serialization_format": "linear",
"step_size_linear": [0.2]
},
"_exmpl_ranged_thermostat": {
"n_dim": 1,
"dim_size": [100],
"range_ids": ["mdsteps"],
"serialization_format": "custom",
"nvalues": 3,
"values": [20, 40, 60],
"indexes": [[0], [20], [80]]
}
}


HTTP Response Status Codes
--------------------------

Expand Down Expand Up @@ -880,6 +1034,26 @@ Standard OPTIONAL URL query parameters not in the JSON API specification:
If provided, these fields MUST be returned along with the REQUIRED fields.
Other OPTIONAL fields MUST NOT be returned when this parameter is present.
Example: :query-url:`http://example.com/optimade/v1/structures?response_fields=last_modified,nsites`
- **property\_ranges**: specifies which data ranges should be returned for ranged properties.
It MUST be supported by databases having ranged properties.
It consists of a property name directly followed by the range that should be returned.
A range is a list containing a list for each dimension.
Each dimension's list has three integer values.
The first value of the range specifies the first index in that dimension for which values should be returned.
The second value specifies the last index for which values should be returned.
The third value specifies the step size.
A list consists of a pair of square brackets ("[", ASCII 91 (0x5B), and "]", ASCII 93 (0x5D)) enclosing a number of values separated by commas (",", ASCII 44 (0x2C)).
Ranges can be specified for multiple properties by separating them with a comma.
Databases MUST return the :property:`values` and :property:`indexes` field belonging to properties listed and SHOULD use the ranges in this query parameter.
Contributor

Some reformulation for clarity:

Suggested change
Databases MUST return the :property:`values` and :property:`indexes` field belonging to properties listed and SHOULD use the ranges in this query parameter.
When a client includes `property_ranges` in a request, the response MUST include the :property:`values` and :property:`indexes` field belonging to properties listed and SHOULD use the ranges in this query parameter.

but the SHOULD here makes me nervous. I think supporting this parameter should be optional, but if you support it, don't you have to follow the indexing request?

For properties with :property:`serialization_format` :val:`custom`, indexes that fall in the requested range but for which no value is defined should not be returned.
For properties with :property:`serialization_format` :val:`regular`, indexes that fall in the requested range but for which no value is defined should have the value :val:`null`.
The ranges are 1-based, i.e. the first value has index 1, and inclusive, i.e. for the range :val:`[10,20,1]` the last value returned belongs to index 20.
Example:

If there is a structure with id: id_12345 and a property :ranged-property:`_ranged_test_field` with the values :val:`[[9.64, 7.52, 0.69, 5.69], [4.82, 8.35, 3.26, 3.25], [4.82, 2.78, 7.87, 7.42], [5.49, 3.48, 1.65, 0.75]]`, the query :query-url:`http://example.com/optimade/v1/structures/id_12345?property_ranges=_ranged_test_field[[1, 3, 2], [2, 3, 1]]`
returns the value :val:`[[7.52, 0.69], [2.78, 7.87]]`.
Multiple ranges can be requested in one query, e.g. :query-param:`property_ranges=_ranged_test_field[[1, 3, 2], [2, 3, 1]], _ranged_other_field[[1,100,1]]`
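
The slicing semantics in the example above can be sketched in Python as follows.
This is only an illustration of the 1-based, inclusive ranges on nested lists, not a prescribed server implementation; the function name `slice_ranged_values` is hypothetical.

```python
from typing import Any, Sequence


def slice_ranged_values(values: Any, ranges: Sequence[Sequence[int]]) -> Any:
    """Apply 1-based, inclusive [first, last, step] ranges, one per dimension."""
    if not ranges:
        # No dimensions left to slice: return the element unchanged.
        return values
    first, last, step = ranges[0]
    # Convert the 1-based inclusive range to Python's 0-based half-open slice.
    selected = values[first - 1:last:step]
    # Apply the remaining ranges to each selected element.
    return [slice_ranged_values(item, ranges[1:]) for item in selected]


test_field = [
    [9.64, 7.52, 0.69, 5.69],
    [4.82, 8.35, 3.26, 3.25],
    [4.82, 2.78, 7.87, 7.42],
    [5.49, 3.48, 1.65, 0.75],
]
print(slice_ranged_values(test_field, [[1, 3, 2], [2, 3, 1]]))
# [[7.52, 0.69], [2.78, 7.87]]
```

The first range selects rows 1 and 3, and the second range selects columns 2 to 3 of each selected row, which reproduces the value given in the example.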


Additional OPTIONAL URL query parameters not described above are not considered to be part of this standard, and are instead considered to be "custom URL query parameters".
These custom URL query parameters MUST be of the format "<database-provider-specific prefix><url\_query\_parameter\_name>".
Expand Down