Add vector cubes #59

Merged
merged 15 commits into from
Jan 30, 2023

m-mohr
Member

@m-mohr m-mohr commented Feb 24, 2022

Adding high-level explanations about vector data cubes.

@m-mohr m-mohr marked this pull request as draft February 24, 2022 14:27
@m-mohr m-mohr mentioned this pull request Feb 24, 2022
16 tasks
@m-mohr m-mohr requested a review from edzer February 25, 2022 11:45
@m-mohr m-mohr force-pushed the vector-cubes branch 2 times, most recently from 07f32fb to 10035f6 Compare March 1, 2022 14:16
documentation/1.0/datacubes.md (outdated, resolved)
@m-mohr m-mohr linked an issue Mar 1, 2022 that may be closed by this pull request
16 tasks
Member

@soxofaan soxofaan left a comment

added some notes

documentation/1.0/datacubes.md (outdated, resolved)
documentation/1.0/datacubes.md (resolved)
documentation/1.0/datacubes.md (outdated, resolved)
documentation/1.0/glossary.md (outdated, resolved)
documentation/1.0/datacubes.md (outdated, resolved)
documentation/1.0/datacubes.md (outdated, resolved)
@m-mohr
Member Author

m-mohr commented Sep 7, 2022

I've updated the PR to reflect recent discussions. Please re-review!

@m-mohr
Member Author

m-mohr commented Nov 7, 2022

Ready for final review.

documentation/1.0/datacubes.md (outdated, resolved)
documentation/1.0/datacubes.md (outdated, resolved)
documentation/1.0/datacubes.md (resolved)
documentation/1.0/glossary.md (outdated, resolved)
documentation/1.0/glossary.md (outdated, resolved)
documentation/1.0/datacubes.md (outdated, resolved)

Dimension labels are either numerical or text (also known as "strings"), which also includes textual representations of timestamps for example. Dimensions with a natural/inherent order are always sorted. These are usually all spatial and temporal dimensions. Dimensions without inherent order, `bands` in openEO for example, retain the order in which they have been defined in metadata or processes (e.g. through [`filter_bands`](https://processes.openeo.org/#filter_bands)), with new labels simply being appended to the existing labels.
Dimension labels are usually either numerical or text (also known as "strings"), which also includes textual representations of timestamps or vectors for example.
Usually, vector labels (geometries) are encoded as [Well-known Text (WKT)](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) and temporal labels are encoded as [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) compatible dates and/or times.
Member

We discussed this elsewhere I think, but are WKT strings a good choice for labels?

  • the WKT string can become very large (kilobytes or worse), which does not make it a handy label to work with in the different phases of a user workflow
  • there is quite some room for variation in WKT encoding of a geometry due to float representation/precision or vertex ordering.
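
The second point can be illustrated with a small sketch. It uses only the Python standard library, and the helper names `ring_vertices` and `canonical_ring` are hypothetical, written for this example only. It assumes a simple WKT `POLYGON` with a single outer ring, and it only normalizes the start vertex and float formatting, not the winding direction.

```python
import re

# Two WKT strings describing the same unit square:
# different start vertex and different float formatting.
wkt_a = "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"
wkt_b = "POLYGON ((1.0 0.0, 1.0 1.0, 0.0 1.0, 0.0 0.0, 1.0 0.0))"


def ring_vertices(wkt):
    """Parse the single outer ring of a simple WKT POLYGON (illustration only)."""
    body = re.search(r"\(\((.*)\)\)", wkt).group(1)
    points = [tuple(float(v) for v in pair.split()) for pair in body.split(",")]
    return points[:-1]  # drop the closing vertex, which repeats the first one


def canonical_ring(wkt):
    """Rotate the ring so that it starts at its smallest vertex."""
    pts = ring_vertices(wkt)
    start = pts.index(min(pts))
    return pts[start:] + pts[:start]


# Compared as plain strings, the two labels differ ...
labels_equal = wkt_a == wkt_b
# ... although the rings are identical once parsed and normalized.
geometries_equal = canonical_ring(wkt_a) == canonical_ring(wkt_b)
```

Any workflow step that matches labels by string equality would treat these two as different geometries, which is the concern raised above.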

Member

Side question: what does the "usually" refer to here: "usually in the remote sensing/GIS community", or "usually in openEO implementations"?

Member Author

Suggested change
Usually, vector labels (geometries) are encoded as [Well-known Text (WKT)](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) and temporal labels are encoded as [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) compatible dates and/or times.
For example, geometries (i.e. the labels of a geometry dimension) can be encoded as [Well-known Text (WKT)](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) or GeoJSON, just as temporal labels are usually encoded as [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) compatible dates and/or times.

Member Author

The labels can be chosen by the user: they could be an ID, WKT2, or any other attribute. They don't need to be unique; back-ends can keep an internal UID (e.g. an index).

Member

Do you mean that in the following sense?

  • When supported by the back-end implementation: the labels can be chosen by the user: could be ID, WKT2, any other attribute. Doesn't need to be unique.
  • As fallback, back-ends can use an internal (unique) id (e.g. auto-increment index, UUID, hash function of geometry, ....)
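
A minimal sketch of the three fallback options mentioned in the second bullet, assuming the geometries are available as WKT strings. The variable names and sample data are hypothetical and not part of any openEO API.

```python
import hashlib
import uuid

# Hypothetical geometry dimension: three features, two of them identical.
geometries = ["POINT (1 2)", "POINT (3 4)", "POINT (1 2)"]
# User-chosen labels, which do not need to be unique.
user_labels = ["station_a", "station_b", "station_a"]

# Option 1: auto-increment index, unique by construction.
index_ids = list(range(len(geometries)))

# Option 2: random UUIDs, unique with overwhelming probability.
uuid_ids = [str(uuid.uuid4()) for _ in geometries]

# Option 3: hash of the geometry encoding. Identical encodings produce
# identical hashes, so this is only unique if the geometries are.
hash_ids = [hashlib.sha256(g.encode("utf-8")).hexdigest()[:12] for g in geometries]
```

Note the trade-off: the index and UUID variants stay unique even for duplicate geometries, while the hash variant is reproducible across runs but inherits the encoding variability of the geometry text it hashes.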

Member Author

I'm not sure whether back-ends should be able to not support it, but we also don't know yet at which point/how users can choose this. So this is to be discussed in openeo-processes.
I guess back-ends always need an internal unique id in addition to the representation.

@m-mohr
Member Author

m-mohr commented Jan 17, 2023

Do not allow mixing geometry types (only allow one geometry type + the corresponding multi type).
So we need processes to convert between the types, and the load processes would also need a parameter to convert everything to a single type.
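
As a rough sketch of what such a conversion could look like on GeoJSON-like geometry dictionaries: the helpers `to_multi` and `is_homogeneous` below are hypothetical names for illustration and are not existing openEO processes.

```python
# Mapping from each single geometry type to its corresponding multi type.
MULTI_OF = {
    "Point": "MultiPoint",
    "LineString": "MultiLineString",
    "Polygon": "MultiPolygon",
}


def to_multi(geometry):
    """Promote a single geometry to its multi type; leave multi types unchanged."""
    gtype = geometry["type"]
    if gtype in MULTI_OF:
        # A multi geometry wraps the single geometry's coordinates in a list.
        return {"type": MULTI_OF[gtype], "coordinates": [geometry["coordinates"]]}
    if gtype in MULTI_OF.values():
        return geometry
    raise ValueError(f"Unsupported geometry type: {gtype}")


def is_homogeneous(geometries):
    """True if all geometries share one type or its corresponding multi type."""
    return len({to_multi(g)["type"] for g in geometries}) <= 1


point = {"type": "Point", "coordinates": [1.0, 2.0]}
multipoint = {"type": "MultiPoint", "coordinates": [[3.0, 4.0], [5.0, 6.0]]}
polygon = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
```

Under the rule above, `[point, multipoint]` would be an allowed combination for one geometry dimension, whereas `[point, polygon]` would not.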

@m-mohr
Member Author

m-mohr commented Jan 17, 2023

I've updated the PR to reflect (I hope) our discussions today. I'd appreciate a final review! Thanks.

Changes since the meeting: 10cdca0

@soxofaan @mkadunc @aljacob @clausmichele @jdries @dthiex @LukeWeidenwalker @pierocampa

@m-mohr m-mohr requested review from edzer, soxofaan, LukeWeidenwalker and pierocampa and removed request for mattia6690 January 17, 2023 17:46
Member

@pierocampa pierocampa left a comment

All good!

Member

@aljacob aljacob left a comment

Looks good, thanks for the great discussion yesterday and for already working it all in consistently!

@LukeWeidenwalker
Contributor

The substance of what's written looks all good to me - just a few minor suggestions on the prose, apply whichever you agree with ;)

Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com>
Member

@soxofaan soxofaan left a comment

Looks good to merge.
Thanks for the productive discussions

@m-mohr m-mohr merged commit 72d2eba into master Jan 30, 2023
Successfully merging this pull request may close these issues.

Vector data cubes (overview)
8 participants