Migrate to general data cube definition #382

Merged 15 commits on Jan 30, 2023
Changes from 5 commits
16 changes: 14 additions & 2 deletions CHANGELOG.md
@@ -4,11 +4,22 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Changes for vector cubes

- Updated the processes based on `raster-cube` to work with `datacube` instead
- Renamed `create_raster_cube` to `create_data_cube`
- `add_dimension`: Added new dimension type `vector`
- New definition for `aggregate_spatial`:
    - Allows more than 3 input dimensions
    - Allows not exporting statistics by changing the parameter `target_dimension`
    - Clarified what the resulting vector cube looks like
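
The subtype migration listed in these changelog entries can be sketched as a small script. This is only an illustration and not part of the PR: the helper name `migrate_subtypes` and the in-memory process definition are hypothetical, and mapping `vector-cube` to `datacube` is an assumption inferred from the `aggregate_spatial` return schema in this diff.

```python
# Legacy subtypes that the general `datacube` subtype replaces
# (assumption: inferred from the schemas changed in this diff).
LEGACY_SUBTYPES = ("raster-cube", "vector-cube")


def migrate_subtypes(node):
    """Return a copy of a JSON-like structure with legacy cube subtypes renamed."""
    if isinstance(node, dict):
        migrated = {key: migrate_subtypes(value) for key, value in node.items()}
        if migrated.get("subtype") in LEGACY_SUBTYPES:
            migrated["subtype"] = "datacube"
        return migrated
    if isinstance(node, list):
        return [migrate_subtypes(item) for item in node]
    return node


# Hypothetical, heavily shortened process definition used only for the demo.
process = {
    "id": "apply",
    "parameters": [
        {"name": "data", "schema": {"type": "object", "subtype": "raster-cube"}}
    ],
    "returns": {"schema": {"type": "object", "subtype": "raster-cube"}},
}

migrated = migrate_subtypes(process)
print(migrated["returns"]["schema"]["subtype"])
```

The function returns a new structure and leaves the input untouched, so the old and new definitions can be compared side by side.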

## Unreleased / Draft

### Added

- New processes in proposal state:
    - `filter_vector`
    - `fit_class_random_forest`
    - `fit_regr_random_forest`
    - `flatten_dimensions`
@@ -29,8 +40,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations.
- `apply_neighborhood`: Allow `null` as default value for units.
- `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376)
- `load_collection` and `load_result`: Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372)
- `load_collection`: Added a `NoDataAvailable` exception
- `load_collection` and `load_result`:
    - Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372)
    - Added a `NoDataAvailable` exception
- `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369)
- `save_result`: Added a more concrete `DataCubeEmpty` exception.

9 changes: 5 additions & 4 deletions add_dimension.json
@@ -11,7 +11,7 @@
"description": "A data cube to add the dimension to.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube"
}
},
{
@@ -39,9 +39,10 @@
"schema": {
"type": "string",
"enum": [
"bands",
"spatial",
"temporal",
"bands",
"vector",
"other"
]
},
@@ -53,12 +54,12 @@
"description": "The data cube with a newly added dimension. The new dimension has exactly one dimension label. All other dimensions remain unchanged.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube"
}
},
"exceptions": {
"DimensionExists": {
"message": "A dimension with the specified name already exists."
}
}
}
}
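
The behaviour described in the `add_dimension.json` diff (one new dimension with exactly one label, the extended type enum, and the `DimensionExists` exception) can be sketched as follows. The dictionary-based cube model and the Python function signature are assumptions made for illustration; they are not how any openEO back-end is known to implement the process.

```python
# Dimension types from the updated enum, including the new `vector` type.
DIMENSION_TYPES = ("bands", "spatial", "temporal", "vector", "other")


class DimensionExists(Exception):
    """Raised when a dimension with the requested name is already present."""


def add_dimension(cube, name, label, dim_type):
    """Return a new cube with one extra dimension holding a single label.

    `cube` is a hypothetical model: {"dimensions": [{"name", "type", "labels"}]}.
    """
    if dim_type not in DIMENSION_TYPES:
        raise ValueError(f"Unsupported dimension type: {dim_type!r}")
    if any(dim["name"] == name for dim in cube["dimensions"]):
        raise DimensionExists("A dimension with the specified name already exists.")
    new_dim = {"name": name, "type": dim_type, "labels": [label]}
    # All existing dimensions are carried over unchanged.
    return {**cube, "dimensions": [*cube["dimensions"], new_dim]}


cube = {"dimensions": [{"name": "t", "type": "temporal", "labels": ["2020-01-01"]}]}
extended = add_dimension(cube, "geometry", "field-1", "vector")
print([dim["name"] for dim in extended["dimensions"]])
```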
57 changes: 41 additions & 16 deletions aggregate_spatial.json
@@ -1,27 +1,47 @@
{
"id": "aggregate_spatial",
"summary": "Zonal statistics for geometries",
"description": "Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions. The number of total and valid pixels is returned together with the calculated values.\n\nAn 'unbounded' aggregation over the full extent of the horizontal spatial dimensions can be computed with the process ``reduce_spatial()``.\n\nThis process passes a list of values to the reducer. The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.",
"description": "Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions. The given data cube can have multiple additional dimension and for all these dimensions results will be computed individually.\n\nAn 'unbounded' aggregation over the full extent of the horizontal spatial dimensions can be computed with the process ``reduce_spatial()``.\n\nThis process passes a list of values to the reducer. The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.",
"categories": [
"cubes",
"aggregate & resample"
],
"parameters": [
{
"name": "data",
"description": "A raster data cube.\n\nThe data cube must have been reduced to only contain two spatial dimensions and a third dimension the values are aggregated for, for example the temporal dimension to get a time series. Otherwise, this process fails with the `TooManyDimensions` exception.\n\nThe data cube implicitly gets restricted to the bounds of the geometries as if ``filter_spatial()`` would have been used with the same values for the corresponding parameters immediately before this process.",
"description": "A raster data cube with at least two spatial dimensions.\n\nThe data cube implicitly gets restricted to the bounds of the geometries as if ``filter_spatial()`` would have been used with the same values for the corresponding parameters immediately before this process.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "spatial",
"axis": [
"x",
"y"
]
}
]
}
},
{
"name": "geometries",
"description": "Geometries as GeoJSON on which the aggregation will be based. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per GeoJSON `Feature`, `Geometry` or `GeometryCollection`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).",
"schema": {
"type": "object",
"subtype": "geojson"
}
"description": "Geometries for which the aggregation will be computed. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per label in the dimension of type `vector`, GeoJSON `Feature`, `Geometry` or `GeometryCollection`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).",
"schema": [
{
"type": "object",
"subtype": "geojson"
},
{
"type": "object",
"subtype": "datacube",
"dimensions": [
{
"type": "vector"
}
]
}
]
},
{
"name": "reducer",
@@ -60,11 +80,14 @@
},
{
"name": "target_dimension",
"description": "The name of a new dimensions that is used to store the results. A new dimension will be created with the given name and type `other` (see ``add_dimension()``). Defaults to the dimension name `result`. Fails with a `TargetDimensionExists` exception if a dimension with the specified name exists.",
"description": "By default (which is `null`), the process only computes the results and doesn't add a new dimension. If this parameter contains a new dimension name, the computation also stores information about the total count of pixels (valid + invalid pixels) and the number of valid pixels (see ``is_valid()``) for each computed value. These values are added as a new dimension. The new dimension of type `other` has the dimension labels `value`, `total_count` and `valid_count`.",
"schema": {
"type": "string"
"type": [
"string",
"null"
]
},
"default": "result",
"default": null,
"optional": true
},
{
@@ -78,16 +101,18 @@
}
],
"returns": {
"description": "A vector data cube with the computed results and restricted to the bounds of the geometries.\n\nThe computed value is used for the dimension with the name that was specified in the parameter `target_dimension`.\n\nThe computation also stores information about the total count of pixels (valid + invalid pixels) and the number of valid pixels (see ``is_valid()``) for each geometry. These values are added as a new dimension with a dimension name derived from `target_dimension` by adding the suffix `_meta`. The new dimension has the dimension labels `total_count` and `valid_count`.",
"description": "A vector data cube with the computed results and restricted to the bounds of the geometries. The spatial dimensions is replaced by a vector dimension and if `target_dimension` is not `null`, a new dimension is added.",
"schema": {
"type": "object",
"subtype": "vector-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "vector"
}
]
}
},
"exceptions": {
"TooManyDimensions": {
"message": "The number of dimensions must be reduced to three for `aggregate_spatial`."
},
"TargetDimensionExists": {
"message": "A dimension with the specified target dimension name already exists."
}
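
The new `target_dimension` behaviour in the `aggregate_spatial.json` diff can be illustrated for a single geometry. This is a rough sketch under stated assumptions: the function `aggregate_geometry`, the simplified `is_valid` stand-in, and the choice to pass only valid pixels to the reducer are all hypothetical simplifications, not the specified behaviour of the process.

```python
from statistics import mean


def is_valid(value):
    # Simplified stand-in for `is_valid()`: treats None and NaN as invalid.
    return value is not None and value == value


def aggregate_geometry(pixels, reducer, target_dimension=None):
    """Aggregate the pixel values selected by one geometry.

    With `target_dimension=None` only the computed value is returned.
    Otherwise the result carries the labels `value`, `total_count` and
    `valid_count` under the given dimension name.
    """
    valid = [p for p in pixels if is_valid(p)]
    # Assumption: the reducer only sees valid values in this sketch.
    value = reducer(valid) if valid else None
    if target_dimension is None:
        return value
    return {
        target_dimension: {
            "value": value,
            "total_count": len(pixels),
            "valid_count": len(valid),
        }
    }


pixels = [1.0, None, 3.0]
print(aggregate_geometry(pixels, mean))
print(aggregate_geometry(pixels, mean, target_dimension="result"))
```

With the default of `null` (here `None`) the caller gets just the statistic, which matches the changelog note that exporting the pixel counts is now optional.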
16 changes: 13 additions & 3 deletions aggregate_temporal.json
@@ -12,7 +12,12 @@
"description": "A data cube.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
{
@@ -162,7 +167,12 @@
"description": "A new data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the given temporal dimension.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
"examples": [
@@ -234,4 +244,4 @@
"title": "Aggregation explained in the openEO documentation"
}
]
}
}
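
The `dimensions` constraint added to the `datacube` schemas (for example `{"type": "temporal"}` here, or `{"type": "spatial", "axis": ["x", "y"]}` in `aggregate_spatial`) could be checked roughly as shown below. The function name, the cube dimension model, and the interpretation that every listed axis must be present are assumptions; the diff does not define the matching rules.

```python
def satisfies_dimensions(cube_dimensions, required):
    """Check a cube's dimensions against a schema `dimensions` constraint.

    `cube_dimensions` is a hypothetical list of dicts with a `type` key and,
    for spatial dimensions, an `axis` key.
    """
    for constraint in required:
        candidates = [d for d in cube_dimensions if d["type"] == constraint["type"]]
        axes = constraint.get("axis")
        if axes is None:
            # Without an axis list, one dimension of the type is enough.
            if not candidates:
                return False
        else:
            # Assumption: every listed axis needs a matching dimension.
            present = {d.get("axis") for d in candidates}
            if not set(axes) <= present:
                return False
    return True


cube_dimensions = [
    {"name": "x", "type": "spatial", "axis": "x"},
    {"name": "y", "type": "spatial", "axis": "y"},
    {"name": "t", "type": "temporal"},
]
print(satisfies_dimensions(cube_dimensions, [{"type": "temporal"}]))
print(satisfies_dimensions(cube_dimensions, [{"type": "vector"}]))
```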
16 changes: 13 additions & 3 deletions aggregate_temporal_period.json
@@ -13,7 +13,12 @@
"description": "The source data cube.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
{
@@ -97,7 +102,12 @@
"description": "A new data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the given temporal dimension. The specified temporal dimension has the following dimension labels (`YYYY` = four-digit year, `MM` = two-digit month, `DD` two-digit day of month):\n\n* `hour`: `YYYY-MM-DD-00` - `YYYY-MM-DD-23`\n* `day`: `YYYY-001` - `YYYY-365`\n* `week`: `YYYY-01` - `YYYY-52`\n* `dekad`: `YYYY-00` - `YYYY-36`\n* `month`: `YYYY-01` - `YYYY-12`\n* `season`: `YYYY-djf` (December - February), `YYYY-mam` (March - May), `YYYY-jja` (June - August), `YYYY-son` (September - November).\n* `tropical-season`: `YYYY-ndjfma` (November - April), `YYYY-mjjaso` (May - October).\n* `year`: `YYYY`\n* `decade`: `YYY0`\n* `decade-ad`: `YYY1`\n\nThe dimension labels in the new data cube are complete for the whole extent of the source data cube. For example, if `period` is set to `day` and the source data cube has two dimension labels at the beginning of the year (`2020-01-01`) and the end of a year (`2020-12-31`), the process returns a data cube with 365 dimension labels (`2020-001`, `2020-002`, ..., `2020-365`). In contrast, if `period` is set to `day` and the source data cube has just one dimension label `2020-01-05`, the process returns a data cube with just a single dimension label (`2020-005`).",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
"exceptions": {
@@ -118,4 +128,4 @@
"title": "Aggregation explained in the openEO documentation"
}
]
}
}
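
The dimension label formats listed in the `aggregate_temporal_period` return description can be sketched for the periods whose boundaries follow directly from the calendar. The helper `period_label` is hypothetical; `week`, `dekad` and the season periods are left out because the label ranges in the description alone do not fix their boundaries.

```python
from datetime import datetime


def period_label(timestamp, period):
    """Build the dimension label for a timestamp and a period name."""
    year = f"{timestamp.year:04d}"
    if period == "hour":
        return f"{year}-{timestamp.month:02d}-{timestamp.day:02d}-{timestamp.hour:02d}"
    if period == "day":
        # Day of year, zero-padded to three digits (YYYY-001 ...).
        return f"{year}-{timestamp.timetuple().tm_yday:03d}"
    if period == "month":
        return f"{year}-{timestamp.month:02d}"
    if period == "year":
        return year
    if period == "decade":
        # Decades labelled by their first year ending in 0 (YYY0).
        return f"{timestamp.year - timestamp.year % 10:04d}"
    raise ValueError(f"Period not covered by this sketch: {period!r}")


print(period_label(datetime(2020, 1, 5), "day"))
```

For the single-label example in the description, `2020-01-05` with `period` set to `day` yields the label `2020-005`.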
21 changes: 18 additions & 3 deletions anomaly.json
@@ -13,15 +13,25 @@
"description": "A data cube with exactly one temporal dimension and the following dimension labels for the given period (`YYYY` = four-digit year, `MM` = two-digit month, `DD` two-digit day of month):\n\n* `hour`: `YYYY-MM-DD-00` - `YYYY-MM-DD-23`\n* `day`: `YYYY-001` - `YYYY-365`\n* `week`: `YYYY-01` - `YYYY-52`\n* `dekad`: `YYYY-00` - `YYYY-36`\n* `month`: `YYYY-01` - `YYYY-12`\n* `season`: `YYYY-djf` (December - February), `YYYY-mam` (March - May), `YYYY-jja` (June - August), `YYYY-son` (September - November).\n* `tropical-season`: `YYYY-ndjfma` (November - April), `YYYY-mjjaso` (May - October).\n* `year`: `YYYY`\n* `decade`: `YYY0`\n* `decade-ad`: `YYY1`\n* `single-period` / `climatology-period`: Any\n\n``aggregate_temporal_period()`` can compute such a data cube.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
{
"name": "normals",
"description": "A data cube with normals, e.g. daily, monthly or yearly values computed from a process such as ``climatological_normal()``. Must contain exactly one temporal dimension with the following dimension labels for the given period:\n\n* `hour`: `00` - `23`\n* `day`: `001` - `365`\n* `week`: `01` - `52`\n* `dekad`: `00` - `36`\n* `month`: `01` - `12`\n* `season`: `djf` (December - February), `mam` (March - May), `jja` (June - August), `son` (September - November)\n* `tropical-season`: `ndjfma` (November - April), `mjjaso` (May - October)\n* `year`: Four-digit year numbers\n* `decade`: Four-digit year numbers, the last digit being a `0`\n* `decade-ad`: Four-digit year numbers, the last digit being a `1`\n* `single-period` / `climatology-period`: A single dimension label with any name is expected.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
},
{
@@ -50,7 +60,12 @@
"description": "A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube",
"dimensions": [
{
"type": "temporal"
}
]
}
}
}
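
The label pairing between the `data` and `normals` cubes in `anomaly.json` (for example `2020-01` in the data against `01` in the normals) can be sketched as below. Both helpers are hypothetical. The sketch also assumes that an anomaly is the data value minus the matching normal; this diff documents only the label formats and does not spell out the arithmetic.

```python
def normals_label(data_label):
    """Map a data label to its normals label by keeping the last segment.

    `2020-01` -> `01`, `2020-01-05-13` -> `13`, `2020-djf` -> `djf`,
    and plain year labels such as `2020` map to themselves.
    """
    return data_label.rsplit("-", 1)[-1]


def anomaly(data, normals):
    """Subtract the matching normal from each value (assumed definition)."""
    return {
        label: value - normals[normals_label(label)]
        for label, value in data.items()
    }


monthly = {"2020-01": 5.0, "2021-01": 3.0, "2020-02": 4.0}
monthly_normals = {"01": 4.0, "02": 4.5}
print(anomaly(monthly, monthly_normals))
```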
8 changes: 4 additions & 4 deletions apply.json
@@ -1,7 +1,7 @@
{
"id": "apply",
"summary": "Apply a process to each pixel",
"description": "Applies a process to each pixel value in the data cube (i.e. a local operation). In contrast, the process ``apply_dimension()`` applies a process to all pixel values along a particular dimension.",
"summary": "Apply a process to each value",
"description": "Applies a process to each value in the data cube (i.e. a local operation). In contrast, the process ``apply_dimension()`` applies a process to all values along a particular dimension.",
"categories": [
"cubes"
],
@@ -11,7 +11,7 @@
"description": "A data cube.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube"
}
},
{
@@ -60,7 +60,7 @@
"description": "A data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.",
"schema": {
"type": "object",
"subtype": "raster-cube"
"subtype": "datacube"
}
},
"links": [
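
The distinction drawn in the updated `apply` description, a process applied to each value versus a process applied to all values along a dimension, can be illustrated with nested lists. Both functions are simplified stand-ins written for this example. The real processes operate on data cubes with named dimensions, which this sketch does not model.

```python
def apply(values, process):
    """Apply `process` to every single value, keeping the nested shape."""
    if isinstance(values, list):
        return [apply(item, process) for item in values]
    return process(values)


def apply_dimension(rows, process):
    """Apply `process` to each full series along the innermost dimension."""
    return [process(row) for row in rows]


cube = [[1, 2], [3, 4]]

# Local operation: each value is transformed on its own.
doubled = apply(cube, lambda value: value * 2)

# Dimension-wise operation: the process sees a whole series at once.
shifted = apply_dimension(cube, lambda row: [v - min(row) for v in row])

print(doubled)
print(shifted)
```

Because `apply` never looks at neighbouring values, the wording change from "pixel" to "value" carries over naturally to cubes without spatial dimensions.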