Return links and assets to lists #151

Closed
wants to merge 2 commits into from


mojodna
Collaborator

@mojodna mojodna commented Aug 14, 2018

I'd like to drop the use of lists-as-objects in favor of using vanilla JSON arrays for Item links and assets as well as catalog-level links. My feeling is that the former introduces application-specific "optimizations" into the core spec (in the form of facilitating lookups) while dirtying the semantics associated with those lists and actually adding complexity:

  • JSON objects have no inherent ordering (in practice, they do in most implementations, but that seems like a bad assumption to make); this means that the resulting lists won't necessarily be consistent across viewers
  • a non-semantic element is introduced (the object key) that introduces ambiguity; some implementations will apply meaning to this (a cog key in an asset collection) in ways that potentially conflict with "better" resolution rules (ref / rel / type)
  • a collection of things is inherently a list; to illustrate this, actions like "give me the first asset" (jq .assets[0]) become more difficult (jq ".assets | to_entries | first | .value"; I had a hard time figuring this out)
  • since object keys are now required, STAC generators will need to produce them; most keys will be opaque strings (UUIDs, etc.), and duplication becomes a consideration: merging links collections from multiple catalogs becomes a non-trivial operation if keys overlap, especially if it's deemed important to retain the object keys
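
To make the third and fourth points concrete, here is a minimal Python sketch. The asset and link contents (`thumb.png`, `a/catalog.json`, and so on) are made up for illustration and are not taken from any real catalog.

```python
# Hypothetical assets, shaped per the two layouts under discussion.
assets_as_object = {
    "thumbnail": {"href": "thumb.png", "type": "image/png"},
    "B1": {"href": "B1.tif", "type": "image/geotiff-cog"},
}
assets_as_list = [
    {"href": "thumb.png", "type": "image/png"},
    {"href": "B1.tif", "type": "image/geotiff-cog"},
]

# "Give me the first asset": direct for a list, indirect for an object.
# Python dicts keep insertion order, but JSON objects carry no such
# guarantee, so "first" is only well defined for the list form.
first_from_list = assets_as_list[0]
first_from_object = next(iter(assets_as_object.values()))

# Merging keyed link collections from two (made-up) catalogs.
links_a = {"self": {"href": "a/catalog.json"}, "child": {"href": "a/child.json"}}
links_b = {"self": {"href": "b/catalog.json"}}

colliding_keys = sorted(links_a.keys() & links_b.keys())
merged_object = {**links_a, **links_b}  # links_a["self"] is silently overwritten

# The same merge with plain lists is a concatenation; nothing is lost.
merged_list = list(links_a.values()) + list(links_b.values())
```

With keyed objects the merge has to decide what to do about `colliding_keys`; with lists there is nothing to reconcile.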

(This is an incomplete PR; additional references to lists-as-objects should be found in prose, JSON schema, and examples.)

Refs #127
Refs #142

@matthewhanson
Collaborator

@mojodna I've been the main one pushing for these assets and links to be objects rather than lists, and since I won't be there for this sprint I suspect I'll lose this one.
However @mojodna makes a good case here, especially for auto-generated items. I've been working on some code that generates derived STAC items and have run into this problem of what to name the keys.

The original motivation here was for users to be able to easily specify which assets they are interested in. Being lists doesn't prevent this; it just changes the logic in the client for determining it. My main interest is that users must absolutely be able to specify assets by a consistent key across collections, or by a band common_name in the case of the EO extension. If this is still possible after switching to lists, then I'm on board.

cc @cholmes @scisco

@mojodna
Collaborator Author

mojodna commented Aug 15, 2018

In the PR, I included an id attribute. I think that works better as a key and can be used to achieve what @matthewhanson wants.

Also, a role attribute (e.g. thumbnail or primary) may serve some of the purpose of the object keys.
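
As a rough sketch of how a client could select assets from a list using these two attributes (the `id` and `role` field names follow the suggestion above; the helper functions and sample values are hypothetical and not part of any STAC tooling):

```python
from typing import Optional

# Hypothetical asset list using the proposed "id" and "role" attributes.
assets = [
    {"id": "thumb", "role": "thumbnail", "href": "thumb.png"},
    {"id": "B1", "role": "primary", "href": "B1.tif"},
    {"id": "B2", "role": "primary", "href": "B2.tif"},
]


def asset_by_id(assets: list, asset_id: str) -> Optional[dict]:
    """Return the first asset whose id matches, or None if there is none."""
    return next((a for a in assets if a.get("id") == asset_id), None)


def assets_by_role(assets: list, role: str) -> list:
    """Return every asset carrying the given role, in list order."""
    return [a for a in assets if a.get("role") == role]
```

Lookup by `id` plays the part of the object key, while `role` allows several assets to share a purpose without competing for one key.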

@matthewhanson
Collaborator

Since the group meeting Wed morning, I've thought about this more and have an additional argument for why assets should remain as dictionaries. First, as I said before, dictionary keys really only become important when using Collections (or whatever they are to be called now), which define a set of common metadata across Items.

When you do have a Collection of Items in this manner, though, it is important that the assets have keys, and that they are unique, so that an Item can be merged with the Collection-level metadata to get a complete record.

For example, a Landsat 8 collection can define the assets:

"assets": {
   "B1": {
      "type": "image/geotiff-cog",
      "eo:bands": ["B1"],
      "description": "This is an asset for Band 1 (the coastal band)"
   }
}

An individual item can then define just the href to the asset for that item:

"assets": {
   "B1": {
      "href": "path/to/datafile"
   }
}

The Item can then be merged with the Collection record to get a complete record. This is how sat-search works. The merging may occur on the server side instead, but either way the use of dictionaries provides an unambiguous way to merge these, whereas using lists requires a lot of logic of iterating and finding matching records. Worse, if a key is duplicated the behavior is undefined. Maybe you do want two thumbnails, but different formats for them.
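
A minimal Python sketch of the two merge strategies, for comparison. This is not how sat-search implements it; the function names are hypothetical, the list variant assumes each asset carries an `id` field, and raising on a duplicate `id` is just one possible choice, since the behavior in that case is undefined.

```python
def merge_asset_dicts(collection_assets: dict, item_assets: dict) -> dict:
    """Merge keyed assets; Item fields override Collection fields."""
    merged = {}
    for key in collection_assets.keys() | item_assets.keys():
        merged[key] = {**collection_assets.get(key, {}), **item_assets.get(key, {})}
    return merged


def merge_asset_lists(collection_assets: list, item_assets: list) -> list:
    """Merge asset lists by an assumed "id" field.

    Duplicate ids within one list are rejected (an assumption made here).
    """
    merged = {}
    for source in (collection_assets, item_assets):
        seen = set()
        for asset in source:
            asset_id = asset["id"]
            if asset_id in seen:
                raise ValueError(f"duplicate asset id: {asset_id!r}")
            seen.add(asset_id)
            # Later sources (the Item) override earlier ones (the Collection).
            merged.setdefault(asset_id, {}).update(asset)
    return list(merged.values())


collection = {"B1": {"type": "image/geotiff-cog", "eo:bands": ["B1"]}}
item = {"B1": {"href": "path/to/datafile"}}
complete = merge_asset_dicts(collection, item)
```

The dictionary version needs no uniqueness check because the keys enforce it; the list version has to build the same index itself and pick a policy for duplicates.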

For lists to work we need to ensure that an id field is unique across all items.

Related to #153

cc @mojodna @cholmes @jeffnaus

@cholmes cholmes added this to the 0.6.0 milestone Aug 21, 2018
@cholmes
Contributor

cholmes commented Aug 22, 2018

Wanted to summarize where we got during the sprint:

Everyone agreed that links (both in catalogs and in items) should be lists. Making them objects was done later, to 'align' with assets, but there wasn't any really good reasoning behind it and it had some drawbacks.

In general people also felt that it was 'ok' to have links and assets not share the exact structure. Though obviously there's a slight preference to keep things aligned.

So perhaps make this PR about just the links. We can keep discussing assets here, or move it to its own issue.

@cholmes
Contributor

cholmes commented Oct 12, 2018

Closing this - decision was to change links to an array (which has happened), and to leave assets as a dictionary. We are not opposed to changing it at some point (and indeed those in favor of assets as a dictionary mostly just want to be sure there is an id that is required and referenceable), but aren't going to do it for the 0.6.0 release.

@cholmes cholmes closed this Oct 12, 2018
4 participants