Dataset improvements #262

Merged
merged 4 commits into from
Oct 7, 2018
19 changes: 4 additions & 15 deletions catalog-spec/catalog-spec.md
Original file line number Diff line number Diff line change
@@ -24,17 +24,6 @@ incorporated.

## Catalog Definitions

There are two required element types of a Catalog: Catalog and Item. A STAC Catalog
points to [STAC Items](../item-spec/), or to other STAC catalogs. The top-most parent catalog is
called the "root" catalog. The root catalog generally defines information about the catalog as a
whole, such as name, description, licensing, contact information and so forth. However, it is
strongly recommended that a "root" catalog define metadata fields that apply to the entire `catalog`
(such that child catalogs and items simply inherit these field values). Catalogs below the root
generally have less information and serve to create a directory structure for categorizing and
grouping item data. The contents of a catalog are flexible and STAC makes no assumptions for where
or how catalog metadata is defined within a catalog. For example, a non-root catalog could redefine
or add different licensing or copyright terms.

STAC makes no formal distinction between a "root" catalog and the "child" catalogs. A root catalog
is simply a top-most `catalog` (which has no parent). A nested `catalog` structure is useful (and
recommended) for breaking up massive numbers of catalog items into logical groupings. For example,
@@ -84,7 +73,8 @@ type files. In order to support multiple "root" catalogs, the recommended practi

| Element | Type | Description |
| ----------- | ------------- | ------------------------------------------------------------ |
| name | string | **REQUIRED.** Name for the catalog. |
| id | string | **REQUIRED.** Identifier for the catalog. |
| title | string | A short descriptive one-line title for the catalog. |
| description | string | **REQUIRED.** Detailed multi-line description to fully explain the catalog. [CommonMark 0.28](http://commonmark.org/) syntax MAY be used for rich text representation. |
| links | [Link Object] | **REQUIRED.** A list of references to other documents. |

@@ -96,7 +86,7 @@ might look something like this:

```json
{
"name": "NAIP",
"id": "NAIP",
"description": "Catalog of NAIP Imagery",
"links": [
{ "rel": "self", "href": "https://www.fsa.usda.gov/naip/catalog.json" },
@@ -113,7 +103,7 @@ A typical '_child_' catalog could look similar:

```json
{
"name": "NAIP",
"id": "NAIP",
"description": "Catalog of NAIP Imagery - 30087",
"links": [
{ "rel": "self", "href": "https://www.fsa.usda.gov/naip/30087/catalog.json" },
@@ -149,6 +139,5 @@ The following types are commonly used as `rel` types in the Link Object of a Dat
| parent | URL to the parent [STAC Catalog](../catalog-spec/). Non-root catalogs should include a link to their parent. |
| child | URL to a child [STAC Catalog](../catalog-spec/). |
| item | URL to a [STAC Item](../item-spec/). |
| license | The license URL for the catalog SHOULD be specified if the `license` field is set to `proprietary`. If there is no public license URL available, it is RECOMMENDED to supplement the STAC catalog with the license text in a separate file and link to this file. |

**Note:** A link to at least one `item` or `child` catalog is _required_.
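For readers updating existing catalogs to this change, the following is a minimal Python sketch, not part of the PR. It assumes catalogs are already loaded as plain dictionaries. The helper names `migrate_catalog` and `catalog_problems` are hypothetical, and so is the rule for choosing the new `id`: the old `name` is reused as `id` unless a hand-picked `new_id` is given, in which case the old `name` is kept as `title` (mirroring what the sample catalog in this PR does).

```python
# Illustrative sketch only; function names and the migration rule are assumptions.
REQUIRED_CATALOG_FIELDS = ("id", "description", "links")


def migrate_catalog(old, new_id=None):
    """Return a copy of a catalog dict that uses `id` (and optionally `title`) instead of `name`."""
    rest = {key: value for key, value in old.items() if key != "name"}
    name = old.get("name")
    # Reuse the old name as the identifier unless the caller picked one by hand.
    migrated = {"id": new_id if new_id is not None else name}
    if new_id is not None and name is not None:
        # The old human-readable name survives as the optional title.
        migrated["title"] = name
    migrated.update(rest)
    return migrated


def catalog_problems(catalog):
    """List violations of the required fields and of the item/child link rule."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_CATALOG_FIELDS
        if catalog.get(field) is None
    ]
    rels = {link.get("rel") for link in catalog.get("links") or []}
    if not rels & {"item", "child"}:
        problems.append("no link with rel 'item' or 'child'")
    return problems
```

Calling `migrate_catalog(old, new_id="sample")` on a catalog whose `name` is `"Sample catalog"` yields `id` `"sample"` and `title` `"Sample catalog"`; `catalog_problems` then reports anything still missing.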
3 changes: 2 additions & 1 deletion catalog-spec/examples/catalog.json
@@ -1,5 +1,6 @@
{
"name": "Sample catalog",
"id": "sample",
"title": "Sample catalog",
"description": "This is a very basic sample catalog.",
"links": [
{
10 changes: 7 additions & 3 deletions catalog-spec/json-schema/catalog.json
@@ -24,13 +24,17 @@
"title": "Catalog",
"type": "object",
"required": [
"name",
"id",
"description",
"links"
],
"properties": {
"name": {
"title": "Name",
"id": {
"title": "Identifier",
"type": "string"
},
"title": {
"title": "Title",
"type": "string"
},
"description": {
32 changes: 11 additions & 21 deletions dataset-spec/dataset-spec.md
@@ -17,14 +17,13 @@ Implementations are encouraged, however, as good effort will be made to not chan

| Element | Type | Description |
| ----------- | ----------------- | ------------------------------------------------------------ |
| name | string | **REQUIRED.** Identifier for the dataset that is unique across the provider. |
| id | string | **REQUIRED.** Identifier for the dataset that is unique across the provider. |
| title | string | A short descriptive one-line title for the dataset. |
| description | string | **REQUIRED.** Detailed multi-line description to fully explain the entity. [CommonMark 0.28](http://commonmark.org/) syntax MAY be used for rich text representation. |
| keywords | [string] | List of keywords describing the dataset. |
| version | string | Version of the dataset. [Semantic Versioning (SemVer)](https://semver.org/) SHOULD be followed. |
| version | string | Version of the dataset. |
m-mohr marked this conversation as resolved.
| license | string | **REQUIRED.** Dataset's license(s) as a SPDX [License identifier](https://spdx.org/licenses/) or [expression](https://spdx.org/spdx-specification-21-web-version#h.jxpfx0ykyb60) or `proprietary` if the license is not on the SPDX license list. Proprietary licensed data SHOULD add a link to the license text, see the `license` relation type. |
| provider | [Provider Object] | A list of data providers, the organizations which influenced the content of the dataset. Providers should be listed in chronological order with the most recent provider being the last element of the list. |
| host | Host Object | Storage provider, the organization that hosts the dataset. |
| provider | [Provider Object] | A list of providers, which may include all organizations capturing or processing the data or the hosting provider. Providers should be listed in chronological order with the most recent provider being the last element of the list. |
| extent | [Extent Object] | **REQUIRED.** Spatial and temporal extents. |
| links | [Link Object] | **REQUIRED.** A list of references to other documents. |

@@ -52,27 +51,19 @@ The coordinate reference system of the values is WGS84 longitude/latitude.

### Provider Object

The object provides information about a provider. A provider is any of the organizations that created or processed the content of the dataset and therefore influenced the data offered by this dataset.
The object provides information about a provider. A provider is any of the organizations that captured or processed the content of the dataset and therefore influenced the data offered by this dataset. May also include information about the final storage provider hosting the data.

| Field Name | Type | Description |
| ---------- | ------ | ------------------------------------------------------------ |
| name | string | **REQUIRED.** The name of the organization or the individual. |
| type | string | The type of provider. Any of `producer`, `processor` or `host`. |
Contributor

We should also probably explain a bit more on what 'producer', 'processor' and 'host' mean. Doesn't need to be in this table, could be just below or above. But just a bit of explanation to help people make a decision about which to use.

Collaborator Author

Okay, I'll add explanation about it next week.

Collaborator Author

Added the explanation. Please check whether it needs more explanation or not.

| url | string | Homepage of the provider. |
Contributor

Perhaps we make a recommendation here that the homepage should ideally include a way to find a contact about the dataset? To help explain a bit why we dropped 'contact' info.

Collaborator Author

Okay, will add that next week.


### Host Object
**type**: The type of the provider can be one of the following elements:

The objects provides information about the storage provider hosting the data.

**Note:** The idea of storage profiles is currently [discussed](https://github.com/radiantearth/stac-spec/issues/148). Therefore, scheme, id and region may be removed from the final spec once this concept is introduced to STAC.

| Field Name | Type | Description |
| -------------- | ------- | ------------------------------------------------------------ |
| name | string | **REQUIRED.** The name of the organization or the individual hosting the data. |
| description | string | Detailed description to explain the hosting details. [CommonMark 0.28](http://commonmark.org/) syntax MAY be used for rich text representation. |
| scheme | string | **REQUIRED.** The protocol/scheme used to access the data. Any of: `S3`, `GCS`, `URL`, `OTHER` |
| id | string | **REQUIRED.** Host-specific identifier such as an URL or asset id. |
| region | string | Provider specific region where the data is stored. |
| requester_pays | boolean | `true` if requester pays, `false` if host pays. Defaults to `false`. |
* *producer*: The producer of the data is the provider that initially captured and processed the source data, e.g. ESA for Sentinel-2 data.
* *processor*: A processor is any provider who processed data to a derived product.
* *host*: The host is the actual provider offering the data on their storage. There should be no more than one host, specified as last element of the list.
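As an illustration of the provider rules described here, the following Python sketch checks a `provider` list against them. It is not part of the PR: the function name `provider_problems` is hypothetical, and because the text phrases the host rules as "should", the returned messages are best read as warnings rather than hard validation errors.

```python
# Illustrative sketch only; the function name and message wording are assumptions.
PROVIDER_TYPES = {"producer", "processor", "host"}


def provider_problems(providers):
    """Return warnings for a dataset's `provider` list (a list of Provider Objects)."""
    problems = []
    for index, provider in enumerate(providers):
        if "name" not in provider:
            problems.append(f"provider {index}: missing required 'name'")
        provider_type = provider.get("type")
        # `type` is optional, but when present it must be one of the three values.
        if provider_type is not None and provider_type not in PROVIDER_TYPES:
            problems.append(f"provider {index}: unknown type {provider_type!r}")

    host_positions = [
        index for index, provider in enumerate(providers)
        if provider.get("type") == "host"
    ]
    if len(host_positions) > 1:
        problems.append("more than one host")
    elif host_positions and host_positions[0] != len(providers) - 1:
        problems.append("host is not the last element")
    return problems
```

For the Sentinel-2 example in this PR, which lists a single `producer`, the function returns an empty list.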

### Link Object

@@ -94,12 +85,11 @@ The following types are commonly used as `rel` types in the Link Object of a Dat
| root | URL to the root [STAC Catalog](../catalog-spec/) or Dataset. |
| parent | URL to the parent [STAC Catalog](../catalog-spec/) or Dataset. |
| child | URL to a child [STAC Catalog](../catalog-spec/) or Dataset. |
| item | URL to a [STAC Item](../item-spec/). |
| item | URL to a [STAC Item](../item-spec/). All items linked from a dataset MUST refer back to its dataset with the `dataset` relation type. |
| license | The license URL for the dataset SHOULD be specified if the `license` field is set to `proprietary`. If there is no public license URL available, it is RECOMMENDED to supplement the STAC catalog with the license text in a separate file and link to this file. |
| derived_from | URL to a STAC `Dataset` that was used as input data in the creation of this `Dataset`. See the note in [STAC Item](../item-spec/item-spec.md) for more info. |


**Note:** The [catalog specification](../catalog-spec/catalog-spec.md) requires a link to at least one `item` or `child` catalog. This is _not_ a requirement for datasets, but _recommended_.
**Note:** The [catalog specification](../catalog-spec/catalog-spec.md) requires a link to at least one `item` or `child` catalog. This is _not_ a requirement for datasets, but _recommended_. In contrast to catalogs, it is **required** that items linked from a dataset MUST refer back to its dataset with the `dataset` relation type.
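The back-reference requirement can be checked mechanically. The Python sketch below is an illustration, not part of the PR, and rests on two assumptions: items are already loaded into a mapping from item URL to item dictionary, and "refers back" is interpreted as a `dataset` link whose `href` equals the dataset's own URL as a plain string (no relative-URL resolution). The function name `items_missing_dataset_link` is hypothetical.

```python
# Illustrative sketch only; the name and the string-equality comparison are assumptions.
def items_missing_dataset_link(dataset_href, items):
    """Return the hrefs of items that have no `dataset` link pointing at `dataset_href`.

    `items` maps each item's href to its already-parsed item dictionary.
    """
    missing = []
    for item_href, item in items.items():
        back_references = [
            link.get("href")
            for link in item.get("links", [])
            if link.get("rel") == "dataset"
        ]
        if dataset_href not in back_references:
            missing.append(item_href)
    return missing
```

A dataset publisher could run this over every `item` link it emits and treat a non-empty result as a violation of the rule in the note above.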

## Extensions

3 changes: 2 additions & 1 deletion dataset-spec/examples/sentinel2.json
@@ -1,5 +1,5 @@
{
"name": "COPERNICUS/S2",
"id": "COPERNICUS/S2",
"title": "Sentinel-2 MSI: MultiSpectral Instrument, Level-1C",
"description": "Sentinel-2 is a wide-swath, high-resolution, multi-spectral\nimaging mission supporting Copernicus Land Monitoring studies,\nincluding the monitoring of vegetation, soil and water cover,\nas well as observation of inland waterways and coastal areas.\n\nThe Sentinel-2 data contain 13 UINT16 spectral bands representing\nTOA reflectance scaled by 10000. See the [Sentinel-2 User Handbook](https://sentinel.esa.int/documents/247904/685211/Sentinel-2_User_Handbook)\nfor details. In addition, three QA bands are present where one\n(QA60) is a bitmask band with cloud mask information. For more\ndetails, [see the full explanation of how cloud masks are computed.](https://sentinel.esa.int/web/sentinel/technical-guides/sentinel-2-msi/level-1c/cloud-masks)\n\nEach Sentinel-2 product (zip archive) may contain multiple\ngranules. Each granule becomes a separate Earth Engine asset.\nEE asset ids for Sentinel-2 assets have the following format:\nCOPERNICUS/S2/20151128T002653_20151128T102149_T56MNN. Here the\nfirst numeric part represents the sensing date and time, the\nsecond numeric part represents the product generation date and\ntime, and the final 6-character string is a unique granule identifier\nindicating its UTM grid reference (see [MGRS](https://en.wikipedia.org/wiki/Military_Grid_Reference_System)).\n\nFor more details on Sentinel-2 radiometric resoltuon, [see this page](https://earth.esa.int/web/sentinel/user-guides/sentinel-2-msi/resolutions/radiometric).\n",
"license": "proprietary",
@@ -14,6 +14,7 @@
"provider": [
{
"name": "European Union/ESA/Copernicus",
"type": "producer",
"url": "https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi"
}
],
56 changes: 12 additions & 44 deletions dataset-spec/json-schema/dataset.json
@@ -5,15 +5,15 @@
"description": "This object represents the dataset in a SpatioTemporal Asset Catalog.",
"type": "object",
"required": [
"name",
"id",
"description",
"license",
"extent",
"links"
],
"additionalProperties": true,
"properties": {
"name": {
"id": {
"title": "Identifier",
"type": "string"
},
@@ -45,9 +45,18 @@
"items": {
"properties": {
"name": {
"title": "Organization Name",
"title": "Organization name",
"type": "string"
},
"type": {
"title": "Organization type",
"type": "string",
"enum": [
"producer",
"processor",
"host"
]
},
"url": {
"title": "Organization homepage",
"type": "string",
@@ -56,47 +65,6 @@
}
}
},
"host": {
"required": [
"name",
"scheme",
"id"
],
"properties": {
"name": {
"title": "Organization name",
"type": "string"
},
"description": {
"title": "Description",
"type": "string"
},
"scheme": {
"title": "Scheme",
"type": "string",
"enum": [
"S3",
"GCS",
"URL",
"OTHER"
]
},
"id": {
"title": "Identifirer",
"type": "string"
},
"region": {
"title": "Region",
"type": "string"
},
"requester_pays": {
"title": "Requester Pays",
"type": "boolean",
"default": false
}
},
"additionalProperties": true
},
"extent": {
"title": "Extents",
"type": "object",
3 changes: 1 addition & 2 deletions extensions/scientific/example-merraclim.json
@@ -1,13 +1,12 @@
{
"name": "MERRAclim",
"id": "MERRAclim",
"description": "MERRAclim, a high-resolution global dataset of remotely sensed bioclimatic variables for ecological modelling.",
"keywords": [
"bioclimatic",
"MERRAclim",
"macroecology",
"biogeography"
],
"homepage": "https://datadryad.org/resource/doi:10.5061/dryad.s2v81",
"links": [
{
"rel": "self",
6 changes: 2 additions & 4 deletions item-spec/examples/CBERS_4_MUX_20170528_090_084_L2.json
@@ -38,8 +38,6 @@
},
"properties": {
"datetime": "2017-05-28T09:01:17Z",
"provider": "INPE",
"eo:collection": "default",
"eo:sun_azimuth": 66.2923,
"eo:sun_elevation": 70.3079,
"eo:off_nadir": -0.00744884,
@@ -54,11 +52,11 @@
"href": "https://cbers-stac.s3.amazonaws.com/CBERS4/MUX/090/084/CBERS_4_MUX_20170528_090_084_L2.json"
},
{
"rel": "catalog",
"rel": "parent",
"href": "https://cbers-stac.s3.amazonaws.com/CBERS4/MUX/090/catalog.json"
},
{
"rel": "collection",
"rel": "dataset",
"href": "https://cbers-stac.s3.amazonaws.com/collections/CBERS_4_MUX_L2_collection.json"
}
],
6 changes: 2 additions & 4 deletions item-spec/examples/digitalglobe-sample.json
@@ -80,17 +80,15 @@
"dg:platform": "WORLDVIEW02",
"dg:product_level": "LV1B",
"dg:product": "WORLDVIEW02_LV1B",
"datetime": "2015-11-09T18:04:46.000Z",
"provider": "DigitalGlobe",
"license": "(C) COPYRIGHT 2016 DigitalGlobe, Inc., Longmont CO USA 80503"
"datetime": "2015-11-09T18:04:46.000Z"
},
"links": [
{
"rel": "self",
"href": "https://s3.amazonaws.com/digitalglobe-catalog-spec/collections/dg_worldview02_LV1B/103001004B316600_P002_MUL"
},
{
"rel": "collection",
"rel": "dataset",
"href": "https://s3.amazonaws.com/digitalglobe-catalog-spec/collections/dg_worldview02_LV1B.json"
}
]
5 changes: 1 addition & 4 deletions item-spec/examples/landsat8-sample.json
@@ -34,8 +34,6 @@

"properties": {
"datetime": "2014-06-02T09:22:02Z",
"provider": "USGS",
"license": "PDDL-1.0",
"c:id": "L1T",
"c:name": "Landsat L1T",
"c:description": "Landat 8 imagery that is radiometrically calibrated and orthorectified using ground points and Digital Elevation Model (DEM) data to correct relief displacement.",
@@ -61,8 +59,7 @@
"links": [
{ "rel":"self", "href": "http://landsat-pds.s3.amazonaws.com/L8/153/025/LC81530252014153LGN00/LC81530252014153LGN00.json"},
{ "rel":"alternate", "href": "https://landsatonaws.com/L8/153/025/LC81530252014153LGN00", "type": "html"},
{ "rel":"catalog", "href": "http://landsat-pds.s3.amazonaws.com/L8/catalog.json"},
{ "rel":"collection", "href": "http://landsat-pds.s3.amazonaws.com/L8/L1T-collection.json"}
{ "rel":"dataset", "href": "http://landsat-pds.s3.amazonaws.com/L8/L1T-collection.json"}
],

"assets" :{
1 change: 0 additions & 1 deletion item-spec/examples/planet-sample.json
@@ -55,7 +55,6 @@
"id": "20171110_121030_1013",
"properties": {
"datetime": "2017-11-10T12:10:30.535417Z",
"provider": "Planet",
"eo:cloud_cover": 23,
"eo:gsd": 4,
"eo:sun_azimuth": 101.8,
4 changes: 1 addition & 3 deletions item-spec/examples/sample-full.json
@@ -18,8 +18,6 @@
},
"properties": {
"datetime": "2016-05-03T13:22:30.040Z",
"provider": "http://www.cool-sat.com",
"license": "CC-BY-4.0",
"eo:sun_azimuth": 168.7,
"eo:cloud_cover": 0.12,
"eo:off_nadir": 1.4,
@@ -35,7 +33,7 @@
"links": [
{"rel": "self", "href": "http://cool-sat.com/catalog/CS3-20160503_132130_04/CS3-20160503_132130_04.json"},
{"rel": "thumbnail", "href":"thumbnail.png"},
{"rel": "catalog", "href": "http://cool-sat.com/catalog/"},
{"rel": "dataset", "href": "http://cool-sat.com/catalog/"},
{"rel": "acquisition", "href": "http://cool-sat.com/catalog/acquisitions/20160503_56"}
],
"assets": {
7 changes: 3 additions & 4 deletions item-spec/examples/sample.json
@@ -17,12 +17,11 @@
]
},
"properties": {
"datetime": "2016-05-03T13:21:30.040Z",
"provider": "http://www.cool-sat.com",
"license": "CC-BY-4.0"
"datetime": "2016-05-03T13:21:30.040Z"
},
"links": [
{ "rel": "self", "href": "http://cool-sat.com/catalog/CS3-20160503_132130_04/CS3-20160503_132130_04.json"}
{ "rel": "self", "href": "http://cool-sat.com/catalog/CS3-20160503_132130_04/CS3-20160503_132130_04.json"},
{ "rel": "dataset", "href": "http://cool-sat.com/catalog.json"}
],
"assets": {
"analytic": {
6 changes: 5 additions & 1 deletion item-spec/examples/sentinel2-sample.json
@@ -4,7 +4,11 @@
"links": [
{
"rel": "self",
"href": "s3://sentinel-s2-l2a-catalog/tiles/35/V/MK/2018/6/5/0/catalog.json"
"href": "s3://sentinel-s2-l2a-catalog/tiles/35/V/MK/2018/6/5/0/sentinel2-sample.json"
},
{
"rel": "dataset",
"href": "s3://sentinel-s2-l2a-catalog/catalog.json"
}
],
"bbox": [