Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep GeoJSON (vector) properties #270

Closed
m-mohr opened this issue Jul 9, 2021 · 7 comments
Closed

Keep GeoJSON (vector) properties #270

m-mohr opened this issue Jul 9, 2021 · 7 comments
Assignees
Milestone

Comments

@m-mohr
Copy link
Member

m-mohr commented Jul 9, 2021

Re 3: In general, it would be very useful to keep the GeoJSON properties so that people can actually identify the Features they've provided in the results. I'm not sure if the processes describe this well enough, but we can probably clarify this in general for vector file formats that handle properties.

Originally posted by @m-mohr in Open-EO/openeo-geopyspark-driver#84 (comment)

@clausmichele
Copy link
Member

After some discussion, we agreed that this feature is crucial in the following scenario:

  1. We need to train a ML model
  2. We create a geoJSON containing the polygons with associated classes (for classification) or target values (for regression)
    2.1 (Optional) We use the (still to be defined) process polygon_to_points process to extract several points from the geometries.
  3. We use aggregate_spatial to create the input vector-cube for the ML model training process.

The ML model training process will need to distinguish which one is the class/target from the output of aggregate_spatial (back-end specific?).

For other scenarios where we do not define the labels/classes/target by ourselves inside the geoJSON but use instead two different rasters the problem does not persist.

@m-mohr
Copy link
Member Author

m-mohr commented Dec 16, 2021

For aggregate_spatial I'm wondering whether it would be the easiest solution to add a new parameter keep_properties (default: false) that can be enabled to add this behavior? Or do we assume that this should be possible by default and just "clarify" that keeping the properties is the expected (but badly documented) behavior of the existing process?

@m-mohr m-mohr added question Further information is requested help wanted Extra attention is needed and removed question Further information is requested labels Dec 21, 2021
@jdries
Copy link
Contributor

jdries commented Jan 12, 2022

I'm considering to start keeping them by default in our processes, mainly to avoid that users would suddenly have to specify this extra argument. Keeping those properties should also not break existing code.
Backends that do not keep them yet, but also do not support training ML models, would not be hugely impacted.

@jdries
Copy link
Contributor

jdries commented Jan 12, 2022

Dev telco: avoid adding a parameter, try to implement in the backends.
Process documentation would need to be updated.

@m-mohr m-mohr removed the help wanted Extra attention is needed label Jan 12, 2022
@m-mohr
Copy link
Member Author

m-mohr commented Jan 12, 2022

Would a simple addition such as "Vector properties are preserved for vector data cubes and all GeoJSON Features." be good enough?

cc @soxofaan

@soxofaan
Copy link
Member

yes, I would not overthink it for now. We can finetune it if actual implementation details become more clear

m-mohr added a commit that referenced this issue Jan 12, 2022
m-mohr added a commit that referenced this issue Jan 12, 2022
@m-mohr
Copy link
Member Author

m-mohr commented Jan 12, 2022

Done, see commits above (and potentially below).

@m-mohr m-mohr closed this as completed Jan 12, 2022
m-mohr added a commit that referenced this issue Jan 12, 2022
@m-mohr m-mohr modified the milestones: 1.3.0, 2.0.0 Feb 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants