Coverage and Features from the same /collections/{collectionid|coverageid}? #65

Closed
dblodgett-usgs opened this issue May 28, 2020 · 18 comments

Comments

@dblodgett-usgs

In 7.4.1 Collections there are a few things that could use some clarification. These questions are intentionally leading some discussion regarding #8 and opengeospatial/ogcapi-common#140.

First, there is the following text:

Each collection on a Coverages API will be a coverage.

Is this precluding a spatial data resource that is both a coverage AND a feature-collection?


Second, in 7.4.2.2 response there is:

itemType:
    description: indicator about the type of the items in the collection (the default value is 'unknown').

What is the expected itemType for a coverage that can be accessed as both a coverage or features?


Spatial data servers typically list datasets available from the server in some kind of catalog. This catalog would typically list "access types" for each dataset. In the case of a dataset with both features and coverages access, what would the server link to in both cases?

In the case of a feature collection, it's clear: it would link to the collection.

e.g. https://info.geoconnex.us/collections/hu02

But what if that collection is both accessible as a feature collection and a coverage? Would the server link to the same collection resource? Would it link to /collections/{coverageid}/coverage/ ?


The last question is what are we gaining by trying to overload /collections and have one link relation data ? As the questions above illustrate, the basic details of cataloging collections of just two types -- especially if both types are desired at the same time -- are not well established. What would be lost if we used /coverages and a link relation specific to coverage access instead of /collections at the root of the coverages API?

@pomakis

pomakis commented May 29, 2020

I see no problem with a collection providing both vector features and a coverage. The vector features would be accessed via /collections/{collectionId}/items and the coverage would be accessed via /collections/{collectionId}/coverage. The question still stands as to what "itemType" the collection should declare itself as. Perhaps "itemType" should be allowed to be an array, e.g., [ "featureSet", "coverage" ].

CubeWerx is finding the need to allow collections that provide neither vector features nor coverages, but merely provide maps and tiles via /collections/{collectionId}/map, etc. Such can be the case when it's cascading map-only services such as WMS, or when the user doesn't have authorization to access the source data. Should an "itemType" enumeration value for this type of collection be defined too? Perhaps "map" could be an "itemType", so a collection that provides both coverages and maps could declare its "itemType" to be [ "coverage", "map" ], while a map-only collection could declare its "itemType" as simply "map".

I don't like the idea of having separate /collections and /coverages endpoints. That splits the endpoint hierarchy up way too early. E.g., where would map-only collections go? And where would catalogues (and other future collection types) go?

@pomakis

pomakis commented May 29, 2020

I see one hiccup in my suggestion already. I assume "itemType" is meant to indicate the type of items to expect from the /collections/{collectionId}/items endpoint. That's problematic right there, because not all collections will have a working /items endpoint (coverages and map-only collections being prime examples). I don't want to rock the boat too much, but perhaps this field should be called "collectionType" instead, and be an array. So a "collectionType" value of [ "featureSet", "coverage", "map" ] would clearly indicate the collection can provide vector features, a coverage, and a map. The "OGC API - Features" specification would specify that vector features of "featureSet" collections can be accessed at /collections/{collectionId}/items, the "OGC API - Coverages" specification would specify that the coverage of "coverage" collections can be accessed at /collections/{collectionId}/coverage, and the "OGC API - Maps" specification would specify that map layers of "map" collections can be accessed at /collections/{collectionId}/map.

@pomakis

pomakis commented May 29, 2020

In addition, all three of these endpoints (.../items, .../coverage and .../map) would be listed in the "links" section of the /collections/{collectionId} endpoint resource, with the appropriate "rel" types.
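
To make the proposal above concrete, here is a minimal Python sketch of what such a collection description might look like and how a client could read it. The "collectionType" property is only the suggestion made in this thread, not part of any published specification. The "rel" values, the use of `hu02` as a collection id, and the helper `access_paths` are hypothetical placeholders.

```python
# Hypothetical collection description following the proposal in this thread.
# "collectionType" is NOT defined by any OGC API specification; the "rel"
# values below are placeholders chosen for illustration only.
collection = {
    "id": "hu02",
    "collectionType": ["featureSet", "coverage", "map"],
    "links": [
        {"rel": "items", "href": "/collections/hu02/items"},
        {"rel": "coverage", "href": "/collections/hu02/coverage"},
        {"rel": "map", "href": "/collections/hu02/map"},
    ],
}

# Sub-path that each specification would define for its collection type,
# as described in the comments above.
SUBPATH_BY_TYPE = {
    "featureSet": "items",
    "coverage": "coverage",
    "map": "map",
}


def access_paths(doc):
    """Return {collection type: path} for every type the collection declares."""
    base = "/collections/" + doc["id"]
    return {
        ctype: base + "/" + SUBPATH_BY_TYPE[ctype]
        for ctype in doc.get("collectionType", [])
        if ctype in SUBPATH_BY_TYPE
    }


paths = access_paths(collection)
```

A map-only collection would simply declare `["map"]` and yield a single path, which is the case the single-valued "itemType" cannot express.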

@dblodgett-usgs
Author

I just want to make the point that re-using /collections without having solid answers to the basic mechanics of re-use indicates that we have focused on the trees (details of coverage access) and lost track of the forest (broader context of use).

@pomakis said:

I don't like the idea of having separate /collections and /coverages endpoints. That splits the endpoint hierarchy up way too early. E.g., where would map-only collections go? And where would catalogues (and other future collection types) go?

E.g., where would map-only collections go?

At /maps.

And where would catalogues (and other future collection types) go?

Since catalog items are functionally features, they work with /collections as feature-collections.

By saying "map-only collections" and "and other future collection types" you are presuming re-use of collections as a flexibly typed container for geospatial resources. Do you see how this is circular reasoning? Is there a reason to not split the endpoint hierarchy earlier that doesn't presume the definition of /collections is a container for flexibly typed geospatial resources?


To try and focus here, let me reiterate my questions and see if we can start to progress this.

Is this precluding a spatial data resource that is both a coverage AND a feature-collection?

I think the answer here is no, so an action should be taken to change that wording if this collections debate ends up with us using /collections for both feature collections and coverages.

What is the expected itemType for a coverage that can be accessed as both a coverage or features?

It seems this hasn't been fully thought through. Is it an array? Do we need a collectionType in addition?

Would the server link to the same collection resource? Would it link to /collections/{coverageid}/coverage/ ?

I don't think this has been answered.

The last question is what are we gaining by trying to overload /collections and have one link relation data ?

I don't think this has been answered.

@pomakis

pomakis commented May 29, 2020

What is the expected itemType for a coverage that can be accessed as both a coverage or features?
[...]
is this precluding a spatial data resource that is both a coverage AND a feature-collection?

I think part of the confusion here is what the "itemType" property means. It certainly needs some clarification. I think the intent of it is to declare the type of items served by the /collections/{collectionId}/items endpoint. For vector-feature collections the itemType is "feature" and for catalogue collections the item type is "record". For other types of collections (such as coverages and map-only collections) there is no /collections/{collectionId}/items endpoint, so perhaps it stands to reason that such collections shouldn't declare an item type.

I think it can be argued that a collection can be both a coverage and a feature collection if it has both /collections/{collectionId}/coverage and /collections/{collectionId}/items endpoints. In this case it would declare an "itemType" of "feature", because that's what the /collections/{collectionId}/items endpoint would serve.

However, I'm not even sure if it makes conceptual sense for a spatial data resource to be both a coverage and a feature collection.

@pomakis

pomakis commented May 29, 2020

you are presuming re-use of collections as a flexibly typed container for geospatial resources.

Yes, I am. I thought that was one of the main points of the OGC API.

Let's explore what things would look like if we have separate /collections, /coverages and /maps endpoints. If a collection is mappable, would the map be accessed with /collections/{collectionId}/map or with /maps/{mapId}?

  • If the former, then what happens if a user doesn't have access rights to the source data? Would that user have to access it through /maps/{mapId}?

  • Also, if the former, if a user wants to see what maps are available, would they need to look under all three of the /collections/{collectionId}/map, /coverages/{coverageId}/map and /maps/{mapId} endpoints?

  • If the latter, then presumably mappable collections and mappable coverages would need to provide links to the appropriate /maps/{mapId} resources. And conversely, the /maps/{mapId} resources would provide links to the /collections/{collectionId} and /coverages/{coverageId} resources that the maps are rendered from. With all of this excessive cross-linking, the /collections/{collectionId}, /coverages/{coverageId} and /maps/{mapId} endpoints are going to look awfully similar.

I honestly think part of the problem here is the term "collection". It (in my opinion) unnecessarily begs the question of "collection of what?". It understandably makes the "OGC API - Coverages" and "OGC API - Maps" people uneasy, because what is a coverage a collection of? And what is a map a collection of? One could say that a coverage is a collection of samples and a map is a collection of pixels, but these aren't typically the units of interaction with these types of resources. So coverages and maps fit a bit uneasily under the /collections endpoint. But I don't think the solution is to therefore split things up into different endpoints. I think we need to provide a clearer definition of the term "collection" that doesn't beg the question of "collection of what?", or else replace it with a different term. "Spatial data resource" is unfortunately a bit too verbose to use as a path element in a URL.

@dblodgett-usgs
Author

Thanks for the thoughtful responses. And apologies for being a bit blunt earlier.

I was totally conflating "itemType" and (the missing) "collectionType". Realizing now that a collection with no items would not have an "itemType" at all actually makes this issue even stranger.

Does anyone see a need to declare a "collectionType" so a client can tell that a given collection is either a feature collection or something else? Would we expect Features to add a "collectionType" of "featureCollection"?

I suppose my first question stands though -- if a collection is to be accessed as either features or a coverage, how do we reconcile the statement that the collection is a coverage? And yes, there is such a thing as a feature-coverage that's actually very common.

This all comes back to as you say:

Yes, I am. I thought that was one of the main points of the OGC API.

A lot of people have been going on that assumption, IMHO, without really having thought it through.

API-Features restricts an API to distribution of one dataset. This means that each collection is some grouping of the features in the dataset. Using this logic, an OGC API is actually just a web API that gives access to a dataset (or some functionality).

Re:

With all of this excessive cross-linking,

I don't think putting the dataset access down under collections helps with that. There will always be the case that the service-in-question doesn't contain the data that needs to be included or things need to be handled by reference for whatever reason. Whether the information is off the root of the API or under a collections path is kind of irrelevant.

Finally -- you are right re the collections term. Jamming all this stuff that is kinda sorta not really a collection under that end point name is problematic. IMHO, the solution is to use end points off the root that are better suited -- /coverages here.

As an initial core across OGC-API, again IMHO, we should restrict these APIs to distribute one and only one dataset. As a near-term additional specification, we should build standard methods to integrate multiple OGC-API dataset distributions under some kind of hierarchical structure as has been discussed over in opengeospatial/ogcapi-common#11. There are a ton of ways that this could be done -- none of which have been adequately explored.

@joanma747

I was one of the original supporters of the idea of having /collections/{CollectionId}/coverage

After seeing the endless discussions about it, I deeply, deeply regret having been one of the first to support the idea. It was a dream to see all services in one comprehensive API but it has become a nightmare.

I believe we need a way out. And one way out is the one proposed by @dblodgett-usgs:

As a near-term additional specification, we should build standard methods to integrate multiple OGC-API dataset distributions under some kind of hierarchical structure. There are a ton of ways that this could be done -- none of which have been adequately explored.

I support this idea and we should start exploring the options.

Back to coverages: I support the idea of going back to /coverages and giving the coverage group full flexibility in what they need for a powerful OGC API - Coverages. I support having an OWS Common that allows a single dataset to be served as collections (only for features) or coverages (only for... can you guess?)

Possible hierarchical structure:
www.myservice.com/datasets/weather/collections/stations/items/NewYork
www.myservice.com/datasets/weather/coverages/temperature/rangeset
www.myservice.com/datasets/weather/api
www.myservice.com/datasets/weather/conformance
www.myservice.com/datasets/airquality/collections/stations/items/NewYork
www.myservice.com/datasets/airquality/coverages/no2/rangeset
www.myservice.com/datasets/airquality/api
www.myservice.com/datasets/airquality/conformance
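
As a rough illustration of how a client could interpret paths laid out this way, the Python sketch below splits one of the example URLs into dataset, access type, and resource. The `datasets` layout is only the suggestion above, not a specified structure, and the function name `parse_dataset_path` and the returned field names are hypothetical.

```python
from urllib.parse import urlsplit


def parse_dataset_path(url):
    """Split a URL of the suggested form into its parts.

    Assumed shape, based on the examples above:
    host/datasets/{dataset}/{accessType}/{resourceId}/...
    """
    # The examples carry no scheme, so add one for urlsplit to find the host.
    if "://" not in url:
        url = "https://" + url
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 3 or segments[0] != "datasets":
        raise ValueError("not a /datasets/{dataset}/... path")
    return {
        "dataset": segments[1],
        "access": segments[2],  # "collections", "coverages", "api", ...
        "resource": segments[3] if len(segments) > 3 else None,
        "rest": segments[4:],
    }


parsed = parse_dataset_path(
    "www.myservice.com/datasets/weather/coverages/temperature/rangeset"
)
```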

Who has specified "datasets"? nobody yet. I do not care: it is just links; HTTP allows that. In the future we could define "datasets" or "stac" or "records"... who knows.

I insist: we need a way out. I'm sorry. I was wrong.

@ghobona
Contributor

ghobona commented May 29, 2020

Cc: @jerstlouis @Schpidi

@jerstlouis
Member

jerstlouis commented May 29, 2020

@joanma747 I back the dream, I am not giving up on it, I am sad to see you regret it and to say you were wrong :) But I understand feeling this way after some of the pushback it has triggered lately, and I agree we need to find a way to settle these discussions. However in this consortium of 520+ member organizations, pushback from a few does not necessarily mean the idea is fundamentally wrong, and if we can better understand and address the concerns of those raising issues, perhaps even they might come to find it adequate.

At our Coverage meeting this last Wednesday both @Schpidi and @tomkralidis were very supportive of the current approach, where Coverages re-uses and extends this common approach to geospatial data. @tomkralidis in particular agreed on how convenient it is for a GIS client to access data sources the same way regardless of their types.

If both Features and Coverages are already agreeing about a shared manner to distribute data, why is there so much controversy?
As I have iterated before, I very much agree with @pomakis that the only controversy is about the term "collection" itself, not the concept of a shared base geospatial resource. This is why I suggest that we temporarily put the term "Collections" and the /collections literal aside and rename Part 2 to OGC API - Common "Geospatial data" and see if we can make progress this way.

Keep in mind that a single dataset may contain both feature collections and coverages.

@jerstlouis
Member

jerstlouis commented May 29, 2020

@dblodgett-usgs

The last question is what are we gaining by trying to overload /collections and have one link relation data ?

  • We are trying to represent a specific "leaf" (most granular) data entity, regardless of its data type, at a single end-point.

    • If it is accessible as a coverage, /coverage can be appended to it.
    • If it is accessible as vector features, /items can be appended to it.
    • If it is accessible as 3D Tiles, /3DTiles can be appended to it.
  • We are also making it possible to list all such leaf data entities at the same level, e.g. to list all of them as part of a single dataset.

  • We are providing a generic manner by which to query the description of such an entity, e.g. its spatio-temporal extent, or to retrieve a list of these entity descriptions.

  • We are providing a mechanism by which a processing workflow can refer to such a data resource, regardless of data type, by pointing to resources of the same type (e.g. the parser can expect the same base schema).

Is this precluding a spatial data resource that is both a coverage AND a feature-collection?

No, as you could have both /coverage and /items for the two ways to access it.

What is the expected itemType for a coverage that can be accessed as both a coverage or features?

itemType was introduced specifically for Features and Records, and as such would be part of the "extended" base geospatial resource description, not the Common part. It does add to the confusion, unfortunately.

It seems this hasn't been fully thought through. Is it an array? Do we need a collectionType in addition?

A collectionType would not help, as a single geospatial resource might have multiple ways to access it. Currently, the "type" property of links is the only way to determine this. I believe we need something more, e.g. an array of supported access modes, "accessAPIs" : [ "Features", "Coverage" ], and/or additional link target attributes.

Would the server link to the same collection resource? Would it link to /collections/{coverageid}/coverage/ ?

It depends whether one wants to link to the spatial resource while being agnostic of access API, or link specifically to a Features or Coverage access API.
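
The distinction between an API-agnostic link and an access-specific link can be sketched in Python as follows. The "accessAPIs" array is only the idea floated above, the mapping from its values to sub-paths is an assumption made for this sketch, and the host `example.com` and the function `link_targets` are hypothetical.

```python
# Sub-resources named in the comment above; the mapping from "accessAPIs"
# values to sub-paths is an assumption made for this sketch.
SUBPATH_BY_API = {
    "Features": "items",
    "Coverage": "coverage",
}


def link_targets(collection_url, access_apis):
    """Return the API-agnostic link plus one link per supported access API."""
    base = collection_url.rstrip("/")
    # Link to the spatial resource itself, agnostic of access API.
    targets = {"any": base}
    for api in access_apis:
        subpath = SUBPATH_BY_API.get(api)
        if subpath is not None:
            targets[api] = base + "/" + subpath
    return targets


targets = link_targets(
    "https://example.com/collections/hu02", ["Features", "Coverage"]
)
```

A catalog that only wants to point at the dataset would use `targets["any"]`, while one that lists "access types" would use the per-API entries.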

@pvretano

@joanma747 I don't think you were wrong; actually quite the opposite!

⚠️ I may be way off in what I am about to say because of my limited understanding of coverages so I a priori apologize for my stupidity.

⚠️ Please take everything that I am about to say with a grain of salt. I am heavily biased by OGC API Features. I am simply trying to explain how I see the world to try and move the conversation forward. As I said, everything I am about to say may be stupid! ... but this is an important conversation so I want to jump in feet first!

In my simple-minded view of coverages, I see them as a collection of measurements (samples) taken with reference to some subdivision/tessellation of some object space that is somehow geo-located. You also need to know what measurements are being taken in each subdivision/tessellation of the object space. I believe that in the WCS specification these are called the range set, the domain set and the range type.

To me, the domain set feels more like metadata/description about the coverage since it is basically telling you how the object space/world was cut up. I believe the object space/world can be cut up in many different ways -- it can be a regular grid, or some other type of tessellation (e.g. hexagons) or even something irregular like county boundaries (?); not sure about that last one but it seems to make sense to me. For lack of a better term, I am going to refer to these little sub-spaces of the object space/world as "cells".

Similarly, the range type also feels like metadata/description about what measurements are being made in each cell of the domain set.

Finally, the range set feels like the "data" of a coverage since those are the actual measurements or observations that are taken inside each "cell" of the domain set.

So, if I am understanding everything correctly, a coverage is a collection of measurements each made inside of some defined subdivision (i.e. "cell") of an object space. Associated with each coverage is some description about the "cells" and some description about what measurements/observations were taken in each cell.

Assuming this is correct then I don't see why the /collections/{collectionId} path does not work! I see at least several endpoints that emerge...

  1. A data retrieval endpoint from which you could get the measurements of the coverage or a subset of measurements in a subset of cells by using various query parameters appended to the endpoint. This endpoint is analogous to the /items endpoint for features BUT I would not call it "items" (more about that in a minute). I would call it something like "coverage" and I would call the itemType for the collection something like "sample" or "measurement". I believe that the coverages SWG is calling this endpoint /collections/{collectionId}/coverage/rangeset so I am agreeing with their current approach.

  2. An endpoint to retrieve the domain set definition. I am not sure what operations make sense here since this is necessary descriptive information but if there are any they would be specified here. Again, I believe the coverages SWG is calling this endpoint /collections/{collectionId}/coverage/domainset which works for me.

  3. An endpoint to retrieve the range type. This one feels like the /collections/{collectionId}/schema endpoint that some feature APIs implement. The schema endpoint is not formally defined in the features specification but I have a feeling that something more formal would be needed for coverages and sure enough, I believe that the coverages SWG is calling this endpoint /collections/{collectionId}/coverage/rangeType.
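
The three endpoints above can be pictured with a toy coverage in Python. This is a sketch under stated assumptions: the grid, the values, the collection id `temperature`, and the function `coverage_resource` are all made up for illustration, and the endpoint names are written as they appear in this comment, which may not match the final specification.

```python
# Toy coverage: a 2x2 grid with one measured quantity. All values are made up.
coverage = {
    # How the space is cut up into cells (descriptive information).
    "domainset": {"type": "grid", "cells": [[0, 0], [0, 1], [1, 0], [1, 1]]},
    # What is measured in each cell (descriptive information).
    "rangeType": [{"name": "temperature", "unit": "degC"}],
    # The actual data: one measurement per cell, in cell order.
    "rangeset": [11.5, 12.0, 10.8, 11.1],
}


def coverage_resource(collection_id, part):
    """Return (path, content) for one of the three coverage sub-resources."""
    if part not in coverage:
        raise KeyError(part)
    path = "/collections/" + collection_id + "/coverage/" + part
    return path, coverage[part]


path, values = coverage_resource("temperature", "rangeset")
```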

Now, back to /collections/{collectionId}/items. Someone earlier mentioned that it is possible that a coverage can be accessed as a feature. I am not sure what exactly that means. Does it mean that the cells (and the measurements therein) of a coverage are retrieved as individual features, or does it mean that there is some other set of features associated with the coverage that can be retrieved? Dunno, but in either case, if these feature thingies are retrieved as discrete items, then the /collections/{collectionId}/items endpoint seems the natural access path. This approach is completely analogous to the map subtree that is being defined off of the /collections/{collectionId} path for getting maps and ties into what both @pomakis and @jerstlouis seem to be saying.

So, @joanma747, I don't think you made a mistake at all ... with my limited understanding of coverages it all seems to hang together. It is not perfect but it does seem to resolve an awful lot of problems we had with the old Web APIs that seemed to be silos unto themselves!

@tomkralidis
Contributor

Hi all: simplistic thoughts out loud:

  • having a uniform /collections approach in the workflow of opening a single API (with all sorts of datasets) helps in the client having slimmer software in parsing the bare bone things of collections (title, description, extents, etc.)
  • this would also help servers provide "on board" catalogues of the data they serve pretty easily
  • do we want the root of an OGC API to help determine the "data types" or a property within a unified content model of a "resource" (which we are calling a collection)?

Suggestions:

  • in a collection's schema, replace itemType with a collectionType enum, and model/extend after something like ISO's MD_SpatialRepresentationTypeCode
  • whether we go with /collections, or /coverages, /tiles etc., have the respective collectionInfo.yaml inherit from a generic collection content model (from what would become Common Part 2). Or maybe even an OGC API - Records record model? There should be no restriction to additionalProperties as needed by the downstream specification's "resource" model

@jerstlouis
Member

jerstlouis commented May 30, 2020

@tomkralidis The problem with "collectionType" enum, is that a collection could be available as both a coverage and a feature collection, or as a coverage and 3D Tiles. This is why I was suggesting earlier an array of such an enum instead, like "accessAPIs" or "availableTypes" or even "collectionTypes".

@cmheazel
Contributor

Recommendation:

  1. We retain the /collections/{collectionId} construct.
  2. CIS resources live at /collections/{collectionId}/coverage
  3. /collections/{collectionId}/coverage is further specialized by the top-level CIS constructs (domainSet, rangeType, rangeSet, and metadata)
  4. A collection can be polymorphic. Consider a point cloud where each pixel has an associated location. The pixels can be accessed as:
    a) A chip of the rangeSet (/collections/{collectionId}/coverage/rangeSet)
    b) A Feature Collection (/collections/{collectionId}/items)
    Note that the collectionId is the same in both cases
  5. We need a way to identify the representations (types) available for a specific collection (currently itemType).
  6. A single collection may have multiple representations (feature, coverage, map, tile, etc.).
  7. We need to agree on a solution and update API-Common before the Sprint next week.

Definitions:

  1. collection: a bucket into which you throw things
  2. item: something you throw into the bucket

@jerstlouis
Member

@cmheazel Agreeing with everything except 5, and the Collection definition which goes against the one agreed in Part 2: Geospatial data.

As previously discussed in common, itemType is singular.
But specific media types for target resources avoid the need for this altogether.

@jerstlouis
Member

jerstlouis commented Oct 23, 2020

I believe this issue can be closed as Overcome By Events, and also a possible duplicate of #25

As mentioned there and in my comment just above:

itemType is strictly for OGC APIs making use of /collections/{collectionId}/items (e.g. Features and Records).
What determines that a collection is accessible as a coverage is the presence of a link with the http://www.opengis.net/def/rel/ogc/1.0/coverage relation type (normally pointing to a /coverage resource).
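
A client-side check along these lines might look like the following Python sketch. Only the relation type URI comes from the comment above; the sample collection descriptions, the `items` link, and the function name `coverage_href` are hypothetical.

```python
# Relation type quoted in the comment above.
COVERAGE_REL = "http://www.opengis.net/def/rel/ogc/1.0/coverage"


def coverage_href(collection):
    """Return the href of the coverage link, or None if there is none."""
    for link in collection.get("links", []):
        if link.get("rel") == COVERAGE_REL:
            return link.get("href")
    return None


# Hypothetical collection descriptions for illustration only.
both = {
    "id": "hu02",
    "itemType": "feature",  # describes what /items serves, nothing more
    "links": [
        {"rel": "items", "href": "/collections/hu02/items"},
        {"rel": COVERAGE_REL, "href": "/collections/hu02/coverage"},
    ],
}
features_only = {
    "id": "stations",
    "itemType": "feature",
    "links": [{"rel": "items", "href": "/collections/stations/items"}],
}

href_both = coverage_href(both)
href_features_only = coverage_href(features_only)
```

Because the check looks only at links, the same collection can declare an "itemType" for its /items endpoint and still be discoverable as a coverage.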

@Schpidi
Member

Schpidi commented Oct 28, 2020

Coverages SWG call: Agreed to close.

@Schpidi Schpidi closed this as completed Oct 28, 2020

9 participants