Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit in vector processes #330

Closed
m-mohr opened this issue Feb 25, 2022 · 9 comments · Fixed by #436
Closed

Unit in vector processes #330

m-mohr opened this issue Feb 25, 2022 · 9 comments · Fixed by #436
Assignees
Milestone

Comments

@m-mohr
Copy link
Member

m-mohr commented Feb 25, 2022

With mostly all vector processes we hit the point where you need to specify a length/distance/... and that needs to be specified in a specific unit. So we have seen different proposals across the PRs:

  1. Always meters
  2. Always degree
  3. Unit of the SRS (usually either meter or degree)

There are several issues with the options:

  1. What does "meters" mean specifically? e.g. Add vector_buffer #314 (comment) This is probably the most user-friendly option?
  2. What does "degree" mean specifically? e.g. Add vector_to_points #313 #315 (comment)
  3. Users need to explicitly understand the SRS, which is the least user-friendly option.

I think it's not a good idea to discuss this for every new process. So what's the preferred option that we adopt in general?

PS: In existing (raster) processes we have:

  • apply_neighborhood: unit of srs or m or px
  • resample_spatial: unit of srs
@soxofaan
Copy link
Member

soxofaan commented Mar 9, 2022

Given the tight timing and the fact that this issue is a blocker for other issues, a simple proposal to have an initial solution that allows to be improved in later stages:

  • user specifies distance parameter as a numeric value, interpreted in the unit of srs
  • there is no unit parameter

It's not ideal because it implies lat-lon degrees for now, but at least it works, unblocks other issues and is future proof (I think)

In a later stage we can add support for a user-chosen unit in some way, e.g.:

  • add a unit parameter
  • distance parameter also accepts an object like {"value": 10, "unit": "m"}
  • reprojection process to change the srs

@m-mohr m-mohr self-assigned this Mar 9, 2022
@mkadunc
Copy link
Member

mkadunc commented Mar 10, 2022

From my experience the "default distance is the unit of CRS" just leads to a lot of errors made by users. I suggest making the unit parameter mandatory from the start, which will ensure that the users understand what the number they're inputting means.

If backends don't want to convert or if there's ambiguity in how the conversion happens, an error can be thrown in case the specified unit doesn't match the CRS unit.

@edzer
Copy link
Member

edzer commented Mar 14, 2022

I agree. And also: degree can never be a distance unit.

@soxofaan
Copy link
Member

So if I understand correctly, the "unit = unit of CRS" approach is not a favorable option, even when it's just an initial solution?

I suggest making the unit parameter mandatory

Instead of a separate unit parameter, I would be more in favor of defining the distance as an object containing both the value and unit, e.g. like

    "distance":  {"value": 10, "unit": "m"},

We already use that approach in apply_neighborhood. I think it's schema-wise easier and more self-contained.

@m-mohr
Copy link
Member Author

m-mohr commented Mar 15, 2022

I'm not sure whether it's the not favorable option or whether @mkadunc just meant that the user should explicitly specify that he wants to specify the value in the unit of the CRS. So should users always explicitly specify a specific unit or is an option for "unit of srs" that the users explicitly selects fine, too?


On the other hand, in date_shift we have a value and a unit parameter separately ;-)
The schemas are simpler with two parameters, but the simplicity of the schemas should not drive our decision too much.

@mkadunc
Copy link
Member

mkadunc commented Mar 15, 2022

"unit_of_crs" would also be acceptable, though not needed IMO (almost all CRSs will have either degree or meter as units, and it shouldn't be too difficult for the user to pick the correct one).

I prefer the syntax with an object — distance: {"value": 10, "unit": "m"}.

Re Edzer's opposition to having 'degree' as a distance unit — not sure what to do when CRS is geographic... Users will treat 'degrees' as just another unit, so it will be very hard to dissuade them, unless we come up with a name for this quantity that isn't "distance" (but e.g. "quasi_euclidean_2_norm_of_coordinate_difference" 😉 ). This is probably not a discussion for this issue, but something that should be addressed elsewhere.

@m-mohr m-mohr added this to the 2.0.0 milestone Nov 28, 2022
@m-mohr
Copy link
Member Author

m-mohr commented Mar 10, 2023

Quick survey:

  • apply_neighborhood: value/unit object
  • date_difference + date_shift: value + unit parameters
  • vector_buffer + vector_to_regular_points: no unit, required in unit of srs

@m-mohr
Copy link
Member Author

m-mohr commented Mar 10, 2023

Vector processes only allow distances in meters.

  • Vector data has CRS with unit meters -> use vector process
  • Vector data has CRS with unit other than meters: -> throw error (or parameter later?) -> use vector_reproject process (to be added) to convert to a CRS with meters -> use vector process then

e.g. in vector_buffer would change from "The distance of the buffer in the unit of the spatial reference system." to "The distance of the buffer in meters." and keep the parameter numerical.

Vote:

For GeoJSON this may need explicit or implicit reprojection, e.g. through load_geojson or vector_reproject -> #412

++
oooo
--

@m-mohr
Copy link
Member Author

m-mohr commented Mar 31, 2023

The decision from #414 (comment) (which we also based on this issue) also applies here. We are strict with the errors and throw an error if not meters. We add a chapter to the implementation guideline that back-ends can diverge from this and still implement more.

@m-mohr m-mohr removed the help wanted Extra attention is needed label Mar 31, 2023
m-mohr added a commit that referenced this issue May 2, 2023
@m-mohr m-mohr linked a pull request May 2, 2023 that will close this issue
m-mohr added a commit that referenced this issue May 4, 2023
* Handle units in vector processes #330

Co-authored-by: Daniel Thiex <60705209+dthiex@users.noreply.github.com>
@m-mohr m-mohr closed this as completed May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants