Appease remark
duckontheweb committed Apr 29, 2021
1 parent 2fad079 commit 8dcd685
Showing 5 changed files with 25 additions and 29 deletions.
13 changes: 8 additions & 5 deletions README.md
@@ -31,16 +31,19 @@ This field is optional. If not provided, its expected that the split property wi

#### bbox and geometry

* `ml-aoi` Multiple items may reference the same label and image item by scoping the `bbox` and `geometry` fields. TODO: Better describe scoping of overlap between raster and label items?
* `ml-aoi` Items' `bbox` fields may overlap when they belong to different `ml-aoi:split` sets.
* `ml-aoi` Items in the same Collection should never have overlapping `geometry` fields.
- `ml-aoi` Multiple items may reference the same label and image item by scoping the `bbox` and `geometry` fields. TODO: Better describe scoping
of overlap between raster and label items?
- `ml-aoi` Items' `bbox` fields may overlap when they belong to different `ml-aoi:split` sets.
- `ml-aoi` Items in the same Collection should never have overlapping `geometry` fields.

## Links

`ml-aoi` Item must link to both label and raster STAC items valid for its area of interest.
These Link objects should set `rel` field to `derived_from` for both label and feature items.

`ml-aoi` Item should contain enough metadata to make it consumable without the need for following the label and feature item links. In reality this may not be practical because the use-case may not be fully known at the time the Item is generated. Therefore it is critical that source label and feature items are linked to provide the future consumer the option to collect additional metadata from them.
`ml-aoi` Item should contain enough metadata to make it consumable without the need for following the label and feature item links. In
reality this may not be practical because the use-case may not be fully known at the time the Item is generated. Therefore it is critical that
source label and feature items are linked to provide the future consumer the option to collect additional metadata from them.

| Field Name | Type | Name | Description |
| ------------- | ------ | ---- | --------------------------- |
@@ -127,7 +130,7 @@ If the tests reveal formatting problems with the examples, you can fix them with
npm run format-examples
```

# Design Decisions
## Design Decisions

Central choices and the rationale behind them are outlined in the ADR format:

6 changes: 2 additions & 4 deletions docs/0001-record-architecture-decisions.md
@@ -1,7 +1,5 @@
---
id: 0001-record-architecture-decisions
title: 1 - Recording Architecture Decisions
---
# 1. Record architecture decisions

Date: 2020-08-08

## Status
11 changes: 5 additions & 6 deletions docs/0002-use-case-definition.md
@@ -1,7 +1,5 @@
---
id: 0002-use-case-definition
title: 2 - Use Case
---
# 2. Use case definition

Date: 2020-08-10

## Status
@@ -31,7 +29,8 @@ For instance it is possible to apply a single source of ground-truth building la
`ml-aoi` Item links to both raster STAC item and label STAC item.
In this relationship the source raster and label items are static and long lived, being used by several `ml-aoi` catalogs.
By contrast `ml-aoi` catalog is somewhat ephemeral, it captures the training set in order to provide model reproducibility and provenance.
There can be any number of `ml-aoi` catalogs linking to the same raster and label items, while varying selection, training/testing/validation split and class configuration.
There can be any number of `ml-aoi` catalogs linking to the same raster and label items, while varying selection, training/testing/validation split
and class configuration.

## Decision

@@ -40,4 +39,4 @@ We will adopt the use and development of `ml-aoi` extension in future machine-le
## Consequences

We will no longer attempt to use `label` extension as a sole source of training data for ML models.
We will continue development of tools to both produce and consume `ml-aoi` extension catalogs.
We will continue development of tools to both produce and consume `ml-aoi` extension catalogs.
12 changes: 5 additions & 7 deletions docs/0003-test-train-validation-split.md
@@ -1,7 +1,5 @@
---
id: 0003-test-train-validation-split
title: 3 - Test Train Validation Split
---
# 3. Test-train-validation split

Date: 2020-08-10

## Status
@@ -22,7 +20,7 @@ The which items are selected for this split will effect model performance and sh
In context of a STAC catalog there are multiple ways to express the data split.
This ADR explores available options and their consequences.

##### Split by Collection
### Split by Collection

Split could be generated by generating a separate collection for each set. This is a flexible approach.
However, the grouping of these collections into one cohesive training set would have to be done by convention, for instance by prefix on collection `id`.
@@ -33,7 +31,7 @@ Additionally the convention of how to associate training with testing with valid
Further it would be easy to include a single item in both training and testing set without realizing it.
This is not a good choice for these reasons.

##### Split by Link property
### Split by Link property

The top-most `ml-aoi` collection has to link to each item or child catalogs.
These links could have additional property that designates the split.
@@ -43,7 +41,7 @@ However, when ingested into STAC API this link property is often lost and is not
Thus the split set membership would not be visible through the STAC API, which is bad.
This is not a good choice for that reason.

##### Split by Item property
### Split by Item property

Each item could have an extension specific property (ex: `ml-aoi:split`) that designates set membership.
This approach addresses the short-comings of the previous methods.
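
For illustration only, here is a minimal sketch of how a consumer might group items by an item-level split property. The property name `ml-aoi:split` comes from the ADR text; the split values (`train`, `test`), the item ids, and the `"unspecified"` bucket for items without the property are assumptions made for this example, not part of the commit.

```python
from collections import defaultdict


def group_by_split(items):
    """Group STAC Item dicts by their `ml-aoi:split` property.

    Items without the property are collected under "unspecified"
    (a hypothetical bucket name chosen for this sketch).
    """
    groups = defaultdict(list)
    for item in items:
        split = item.get("properties", {}).get("ml-aoi:split", "unspecified")
        groups[split].append(item["id"])
    return dict(groups)


# Hypothetical items; ids and split values are illustrative assumptions.
items = [
    {"id": "aoi-1", "properties": {"ml-aoi:split": "train"}},
    {"id": "aoi-2", "properties": {"ml-aoi:split": "test"}},
    {"id": "aoi-3", "properties": {"ml-aoi:split": "train"}},
    {"id": "aoi-4", "properties": {}},
]

print(group_by_split(items))
```

Because membership is a property of the item itself, each item can belong to only one set, which avoids the accidental train/test duplication described for the collection-based approach.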
12 changes: 5 additions & 7 deletions docs/0004-multiple-label-items.md
@@ -1,7 +1,5 @@
---
id: 0003-multiple-label-items
title: 1 - Multiple Label Items
---
# 4. Multiple label items

Date: 2020-08-11

## Status
@@ -15,7 +13,7 @@ This would be a useful feature for training multi-class classifiers.
One can imagine having a label STAC item for buildings and separate STAC item for fields.
STAC Items Links object is an array, so many label items could be linked to from a single `ml-aoi` STAC Item.

#### Limiting to single label link
### Limiting to single label link

Limiting to single label link however is appealing because the label item metadata could be copied over to `ml-aoi` Item.
This would remove the need to follow the link for the label item during processing.
@@ -25,7 +23,7 @@ If multi-class label dataset would be required there would have to be a mechanic
existing labels into a single STAC `label` item. This could mean either union of GeoJSON FeatureCollections per item or
a configuration of a more complex STAC `label` Item that links to multiple label assets.

#### Allowing multiple labels
### Allowing multiple labels

The main appeal of consuming multi-label `ml-aoi` items is that it would allow referencing multiple label sources,
some of which could be external, without the need for pre-processing and thus minimizing data duplication.
@@ -51,4 +49,4 @@ The resulting label catalog can capture that design and iteration required for t

`ml-aoi` Items can copy all `label` extension properties from the `label` Item.
In effect `ml-aoi` Items extends `label` item by adding links to feature imagery.
This formulation lines up with original problem statement for `ml-aoi` extension.
This formulation lines up with original problem statement for `ml-aoi` extension.
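
As a rough sketch of the linking described in the changed docs, the snippet below builds an `ml-aoi` Item as a plain dictionary and checks that it carries `derived_from` links to its source items. The `rel` value `derived_from` and the `ml-aoi:split` property name come from the README text; the `href` paths, the item id, the split value, and the helper names are hypothetical and only serve the example.

```python
def derived_from_links(item):
    """Return the Link objects of a STAC Item dict whose rel is `derived_from`."""
    return [link for link in item.get("links", []) if link.get("rel") == "derived_from"]


def has_source_links(item):
    """Check that the item links to at least two sources (one label, one raster).

    This only counts links; it does not verify which link is the label item
    and which is the raster item.
    """
    return len(derived_from_links(item)) >= 2


# Hypothetical ml-aoi Item; all hrefs and the id are illustrative assumptions.
aoi_item = {
    "type": "Feature",
    "id": "aoi-1",
    "properties": {"ml-aoi:split": "train"},
    "links": [
        {"rel": "derived_from", "href": "./labels/buildings.json"},
        {"rel": "derived_from", "href": "./imagery/scene-1.json"},
        {"rel": "self", "href": "./aoi-1.json"},
    ],
}

print(has_source_links(aoi_item))
```

A stricter check would need a way to tell label links from raster links, which depends on the Link fields defined in the README table not shown in full in this diff.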
