Label Extension: Use asset role "labels" to #936

Closed
philvarner opened this issue Dec 21, 2020 · 7 comments

Comments

@philvarner
Collaborator

philvarner commented Dec 21, 2020

Propose that we change the Label Extension in two ways:

  1. Change the requirement that the label asset be named "labels" to a recommendation that the names "raster_labels" and "vector_labels" be used.
  2. Add an Asset Role labels to indicate which assets are label data.

The recommended names are only recommendations, since names should not matter at all. A user of an Item should locate the label asset by finding the asset whose roles contain "labels" and whose type == "geojson" (for vector, or tif for raster).
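
A rough sketch of that lookup in Python, for illustration only. The Item fragment, asset names, and hrefs are hypothetical, and the media type strings are an assumption about how "geojson" and "tif" would appear in an asset's `type` field:

```python
# Hypothetical Item fragment; asset names and hrefs are illustrative only.
item = {
    "assets": {
        "vector_labels": {
            "href": "./labels.geojson",
            "type": "application/geo+json",
            "roles": ["labels"],
        },
        "raster_labels": {
            "href": "./labels.tif",
            "type": "image/tiff; application=geotiff",
            "roles": ["labels"],
        },
        "thumbnail": {
            "href": "./thumb.png",
            "type": "image/png",
            "roles": ["thumbnail"],
        },
    }
}

# Assumed media-type prefixes for each kind of label data.
MEDIA_TYPE_PREFIX = {"vector": "application/geo+json", "raster": "image/tiff"}


def find_label_assets(item, kind):
    """Return {asset_name: asset} for assets with the "labels" role and a matching type."""
    prefix = MEDIA_TYPE_PREFIX[kind]
    return {
        name: asset
        for name, asset in item.get("assets", {}).items()
        if "labels" in asset.get("roles", [])
        and asset.get("type", "").startswith(prefix)
    }


print(sorted(find_label_assets(item, "vector")))  # ['vector_labels']
```

The asset key never enters the match, so renaming "vector_labels" to anything else would not change the result.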

See short discussion in #935

@vpipkt

vpipkt commented Dec 21, 2020

It also seems that it would be useful to indicate which Class object(s) and tasks an asset applies to. I am thinking it may be possible that a single asset may not be able to describe all the label:tasks. One possibility seems to be using the label:* properties on assets with labels role in some way.

A more grounded example: suppose an Item is labeled by creation of a segmentation raster into 3 classes. This may result in a single raster with label classes coded in raster values (the spec is not actually clear on how raster values translate to the label classes' semantic labels, e.g. 10 -> alfalfa). Or it may result in 3 rasters with binary pixel types. Carrying the example further, the provider may also prepare vector labels for segmentation and/or object detection based on the raster label, and could further create regression labels. The regression labels could be an additional Feature in an asset or could be a separate asset.
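
To make the two raster layouts concrete, here is a minimal Python sketch that turns one class-coded raster into one binary mask per class. Only the 10 -> alfalfa mapping comes from the example above; the other codes, the class names, and the tiny nested-list "raster" are made up for illustration:

```python
# Only 10 -> "alfalfa" comes from the discussion; the other entries are made up.
VALUE_TO_CLASS = {10: "alfalfa", 20: "corn", 30: "wheat"}

# A tiny 2x3 "raster" of class codes, as nested lists.
coded = [
    [10, 20, 20],
    [30, 10, 30],
]


def split_into_binary_masks(raster, value_to_class):
    """Return {class_name: mask}, where mask is 1 wherever the pixel has that class."""
    return {
        name: [[1 if pixel == value else 0 for pixel in row] for row in raster]
        for value, name in value_to_class.items()
    }


masks = split_into_binary_masks(coded, VALUE_TO_CLASS)
print(masks["alfalfa"])  # [[1, 0, 0], [0, 1, 0]]
```

Either layout needs the value-to-class mapping to be recorded somewhere, which is the part the spec leaves unclear.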

Just some of what I am thinking, as we are trying to implement the standard in specifications with some of this complexity. Looking forward to seeing how this progresses.

@jisantuc
Contributor

jisantuc commented Jan 5, 2021

@vpipkt I think for different labeling tasks, it would be easier to have distinct label items. To me this is a good tradeoff between data duplication and extension parseability. What do you think?

I also think maybe it makes sense to have different extensions for vector and raster labels, instead of rasters being a special case with accommodations. There are also already a few kind of awkward "if it's a raster label item, ignore this field" cases. Separating the extensions would allow stricter checks when parsing label items and easier specification of raster-specific concerns without polluting the label: namespace.

@vpipkt

vpipkt commented Jan 6, 2021

@jisantuc In the case of distinct label items, I am a bit torn. In the example I outlined above, creating distinct items would be appropriate, with links among related items. I'm having trouble articulating the other side of my argument. I do like the original proposal of this issue to use an asset role for labels, thus potentially allowing multiple label assets on an item.

I would certainly welcome some additional clarity for raster labels in particular.

@philvarner
Collaborator Author

@jisantuc I concur on separating these -- possibly into 3 extensions -- common, raster, vector

@jisantuc
Contributor

jisantuc commented Jan 8, 2021

I'm also really on board for the labels asset role. We've similarly run into problems around lots of things technically being labels, but "labels" not telling you much, e.g., ML training steps consume labels and prediction steps produce labels, but to STAC, everything is just "labels". Roles would offer a sensible extension point, e.g., labels and training-data both being present in an asset's roles.
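
As a sketch of how combined roles could be queried (the asset names and role assignments below are hypothetical, and "training-data" is just the example role name from above, not an established one):

```python
# Hypothetical assets; names and role combinations are illustrative only.
assets = {
    "train_labels": {"roles": ["labels", "training-data"]},
    "predicted_labels": {"roles": ["labels"]},
    "imagery": {"roles": ["data"]},
}


def assets_with_roles(assets, required):
    """Return names of assets whose roles include every role in `required`."""
    required = set(required)
    return [
        name
        for name, asset in assets.items()
        if required <= set(asset.get("roles", []))
    ]


print(assets_with_roles(assets, ["labels"]))  # ['train_labels', 'predicted_labels']
print(assets_with_roles(assets, ["labels", "training-data"]))  # ['train_labels']
```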

I'm going to open another issue for the splitting discussion. I'd prefer to avoid thinking about inheritance because of the difficulty of enforcing it.

@philvarner
Collaborator Author

I created a draft PR to make these changes. One problem I had was that some of the existing language doesn't align well with the intention that you can have raster or vector labels here. I tried to avoid rewriting too much of that, but we should revisit it.

@cholmes cholmes modified the milestones: 1.0.0-RC.1, 1.0.1 Feb 25, 2021
@cholmes cholmes linked a pull request Feb 25, 2021 that will close this issue
4 tasks
@cholmes cholmes modified the milestones: 1.0.1, 1.0.0-RC.1 Feb 25, 2021
@cholmes cholmes modified the milestones: 1.0.0-RC.1, 1.0.0 Mar 3, 2021
@m-mohr
Collaborator

m-mohr commented Mar 4, 2021

Tackled by stac-extensions/label#1

@m-mohr m-mohr closed this as completed Mar 4, 2021
5 participants