-
Notifications
You must be signed in to change notification settings - Fork 472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add h3Distance, and internal h3ToIjk and ijkDistance #83
Conversation
src/apps/applib/lib/utility.c
Outdated
H3Index bc; | ||
setH3Index(&bc, 0, i, 0); | ||
int childrenSz = H3_EXPORT(maxUncompactSize)(&bc, 1, res); | ||
H3Index* children = calloc(childrenSz, sizeof(H3Index)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this should be stack allocated
Looks like the build may have timed out due to iterating several resolutions of indexes. |
What is the purpose of exposing this when its stated that the values produced should not be compared between different versions of H3? If this is for hexagon distance calculation, why not simply implement a function that does that but will return |
h3ToIjk has other uses, for example when using H3 bucketed data in systems which expect a matrix of data. I think some good examples would be using H3 bucketed data in a GPU or in a system like Tensorflow. Having non-hierarchical coordinates (which IJK+ fulfills) is needed for that case. |
While I do understand the convenience of the IJK coordinates, I fear that most users won’t understand the issues involved and this might lead to confusion and frustration. Maybe better to keep those issues internal where they can be cleanly resolved; @dfellis solution gives a path to achieving that. Frankly, if I hadn’t been so over-cautious (cowardly?) there would be barely any I wonder if many of the use cases that the IJK coordinates support might be served by a mapping of the H3 indexes of a single resolution, in their implicit integer order, to a compact 1-dimensional sequence. Function |
@sahrk aren't the compact 64-bit integer forms of H3 already a type of sequence number, though? I do see @isaacbrodsky 's point about wanting a coordinate system and not just a 1-d sequence for matrix-type operations, but just exposing the CoordIJK values without being clear about their limitations seems like a not-great idea. What about a function that takes an origin hexagon and a "k-ring" size and creates a simple 2D IJ coordinate system with The first can be used to define boundaries (can return That one would take longer to figure out and probably wouldn't have to be in the first release, but this allows for an implementation that won't depend on internals that may or may not change in the future. I can implement that if you guys want. |
The 64-bit integer H3 indexes form a sequence, just not a compact one. So they can't be used directly for @isaacbrodsky 's data buffer case. For the physical layout of a data buffer it's not clear to me how a 3D buffer (with a redundant axis) is more tractable than a 1D buffer that retains the same structural information (e.g., you could define the neighbor function, metric distance, etc., using sequence numbers, even if it was just done by translating back-and-forth to H3 indexes). Perhaps there are two use cases here that don't necessarily have the same solution: an H3-organized data buffer, and a planar hex grid coordinate system that allows users to perform arbitrary grid coordinate manipulations that aren't yet supported by H3 directly. For the latter @dfellis I like your idea of generating the planar coordinate system as needed for a particular problem. And if it's going to be used for the physical layout of data buffers then 2D makes more sense than 3D. I'm just afraid that the pentagons are going to make this approach very messy to scale to large geographical regions, whereas most of those complications go away for the 1D approach (the one complication I still see is that, since pentagons have a different sized descendent footprint than the hexagon base cells you can't use a constant stride length to step through the buffer by base cell). |
I think keeping the function internal and only exposing a This approach is focused on smaller geographical regions, on the assumption that if you're working with data near the base cell level, performing the Haversine distance calculation is reasonable enough, and the distances could even be precomputed. It certainly gets very messy to deal with pentagons. :) It was suggested to me that the API could provide the hexagons directly in a 2D planar coordinate format (perhaps named CoordIJ), but that would necessarily need to call a function like I'm not sure that @dfellis' solution will work. It seems like it would require the base cell grid to be an IJK coordinate system, and I'm not sure that works due to pentagon distortion. (I'd also hope for a solution which doesn't have time or memory complexity relative to the distance between the indexes, unless it can be precomputed i.e. if we could precompute something for all base cells). I think the sequence number idea that @sahrk brought up is interesting, but as you mentioned above I think there are a few different use cases, so I think there may be cases where generating a problem-specific coordinate space would be advantageous. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think this looks good, but I support the suggestion that we keep h3ToIjk
internal for the moment.
* Rotate an H3Index 60 degrees clockwise about a pentagonal center. | ||
* @param h The H3Index. | ||
*/ | ||
H3Index _h3RotatePent60cw(H3Index h) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Is 60cw
right here, or are we really rotating 72 degrees? Is there a more semantic way to express this, like _h3RotatePentSideCW
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parallels the existing _h3RotatePent60ccw
, which is a special case of _h3Rotate60ccw
/_h3Rotate60cw
. I think it could be argued we're rotating 60 degrees, but also skipping over part of the coordinate space.
*/ | ||
int h3ToIjk(H3Index origin, H3Index h3, CoordIJK* out) { | ||
if (H3_GET_MODE(origin) != H3_HEXAGON_MODE || | ||
H3_GET_MODE(h3) != H3_HEXAGON_MODE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'm not sure if this is worthwhile. We technically ought to be doing this check on every H3 index algo, and then there's a slippery slope about what we ought to be checking - e.g. would h3IsValid
be better here? I have mixed feelings about this, since it's basically a service to the developer to guard against stuff we can't type-check, but because it has perf issues in some cases, we can only do it in some functions and not others. It would be nice if we had a consistent approach to this across all algo functions.
The ideal option would be to make this a compile-time check. I don't see any way to do that with externally-provided indexes though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using assertions makes sense if you have people using debug builds versus release builds. Other languages tend to default to release builds so that is still an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some of the other algos don't do this because they do not have a way to return error cases to the caller. Other functions like geoToH3 do check for out of bounds inputs and return 0, and geoToH3 is certainly performance critical.
I would rather not use assertions, especially in release builds because instead of propagating an error it would abort the entire process, which would probably be bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions are not compiled into release builds. The definition of assert
looks something like this:
#ifdef NDEBUG /* Release mode */
#define assert(cond) /* Empty definition */
#else /* Debug mode */
#define assert(cond) ... /* < Define assert macro */
#endif
|
||
int res = H3_GET_RESOLUTION(origin); | ||
|
||
if (res != H3_GET_RESOLUTION(h3)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Is there a reason to do this check here and not in other algos?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertion is better as I stated above.
src/h3lib/lib/h3Index.c
Outdated
int revDir = 0; | ||
if (originBaseCell != baseCell) { | ||
for (dir = 1; dir < 7; dir++) { | ||
int testBaseCell = _getBaseCellNeighbor(originBaseCell, dir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest encapsulating this as Direction getBaseCellNeighborDirection(int origin, int dest)
, in baseCells.c
, returning -1
on failure.
int pentagonRotations = 0; | ||
int directionRotations = 0; | ||
|
||
if (originOnPent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a concrete suggestion here, but is it possible to move the pentagon handling blocks to a separate function to hide the complexity they entail?
src/h3lib/lib/mathExtensions.c
Outdated
} else { | ||
return b; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the classic case of where you should use a macro. Unless you can move this to a header (requires use of static inline
so not strictly C89 compliant), a macro is the better choice here:
#define _IMAX(a, b) (((a) > (b)) ? (a) : (b))
@@ -42,4 +42,8 @@ void geoBoundaryPrint(const GeoBoundary* b); | |||
void geoBoundaryPrintln(const GeoBoundary* b); | |||
int readBoundary(FILE* f, GeoBoundary* b); | |||
|
|||
void iterateAllIndexesAtRes(int res, void (*callback)(H3Index)); | |||
void iterateAllIndexesAtResPartial(int res, void (*callback)(H3Index), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, C users would probably prefer macros here and you'll get better performance because C compiler can see what is going on in a macro, but function pointers are opaque. Side note: this is why using the C qsort
in C++ is slower than std::sort
from C++ STL, due to function pointers being worse off than templates.
Anyway, here's how I'd do it based on Jansson:
#define H3_FOR_EACH_INDEX_RES(argName, res) H3_FOR_EACH_INDEX_PARTIAL((res), NUM_BASE_CELLS)
#define H3_FOR_EACH_INDEX_PARTIAL(argName, res, baseCells) \
assert((baseCells) <= NUM_BASE_CELLS); \
for (int i = 0; i < baseCells; i++) { \
H3Index bc; \
setH3Index(&bc, 0, i, 0); \
int childrenSz = H3_EXPORT(maxUncompactSize)(&bc, 1, (res)); \
STACK_ARRAY_CALLOC(H3Index, children, childrenSz); \
H3_EXPORT(uncompact)(&bc, 1, children, childrenSz, (res)); \
for (int j = 0; j < childrenSz; j++) { \
if (children[j] == 0) { \
continue; \
} \
H3Index argName = children[j];
Then users can do this:
H3_FOR_EACH_INDEX_AT_RES(1, myIndex) {
printf("%lld\n", myIndex);
}
Which expands to:
for (...) {
...
H3Index myIndex = children[j];
printf("%lld\n", myIndex);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change this, but I am not a huge fan of macros. Since this is a test, I don't think performance is that big a concern.
edit: also, this does not permit reusing the code for more than one resolution. That still needs to be a function call or another macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya everyone hates macros until they program in C for a while and realize it's the only way to avoid seriously ugly code. Also, underrated technique: X macro.
Meanwhile, rethinking this API too so that the blocks have matching parentheses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK seeing as this is a test I don't really think it's worth the effort. Thought this might lead to a convenience API in general. Never mind my suggestion.
src/apps/testapps/testH3ToIjk.c
Outdated
void h3Distance_kRing_assertions(H3Index h3) { | ||
int maxK = 5; | ||
int r = H3_GET_RESOLUTION(h3); | ||
if (r == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch is more performant and cleaner than else if
chain.
src/apps/testapps/testKRing.c
Outdated
int kSz = H3_EXPORT(maxKringSize)(k); | ||
|
||
H3Index *neighbors = calloc(kSz, sizeof(H3Index)); | ||
int *distances = calloc(kSz, sizeof(int)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like stack allocations make more sense, unless this is an example to avoid difficulties with build tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was refactored out of a previous test without updating the allocations.
*/ | ||
int h3ToIjk(H3Index origin, H3Index h3, CoordIJK* out) { | ||
if (H3_GET_MODE(origin) != H3_HEXAGON_MODE || | ||
H3_GET_MODE(h3) != H3_HEXAGON_MODE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using assertions makes sense if you have people using debug builds versus release builds. Other languages tend to default to release builds so that is still an issue.
|
||
int res = H3_GET_RESOLUTION(origin); | ||
|
||
if (res != H3_GET_RESOLUTION(h3)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertion is better as I stated above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work.
I'm still not convinced exposing the |
@dfellis this pr no longer exposes CoordIJK as part of the public api. (It could be used by code willing to reach into internals) I think you mean you prefer a 2d coordinate system derived from the origin, rather than the current ijk+ one used internally. |
Wait, let me refresh this page. I leave Chrome tabs open for months, sometimes they break in weird ways, maybe I'm still seeing an old version of the code. |
CHANGELOG.md
Outdated
@@ -7,6 +7,9 @@ The public API of this library consists of the functions declared in file | |||
|
|||
## [Unreleased] | |||
### Added | |||
- h3ToIjk function for getting IJK+ coordinates from an index (#83) | |||
- Internal `ijkDistance` function for determining the grid distance between IJK+ coordinates (#83) | |||
- Internal `h3Distance` function for determining the grid distance between H3 indexes (#83) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, you're right that it's not doing what I thought, but these CHANGELOG statements look wrong. h3Distance
is marked as an export below, while ijkDistance
and h3ToIjk
are not.
Also the h3ToIjk
filter application is not mentioned in these notes and I'm not entirely sure what it's purpose is, now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, those are reversed. h3Distance should be the public one, the others are internal. I rebased and updated.
h3ToIjk
is just for experimental use.
Adds an h3ToIjk function which produces users consumable IJK+ coordinates. An "origin" for the coordinate system must be specified, and coordinates are only comparable if they were obtained from the same origin. This is limited to obtaining coordinates for indexes which are on the same base cell or neighboring base cells. Pentagons have more restrictions about crossing the pentagonal distortion. The intention is in the future this could be improved to handle pentagon distortion better, which may make current coordinates invalid. Coordinates should not be persisted or passed between library versions. Pentagon cases were mostly restricted by experimentally disabling a number of cases. With IJK+ coordinates, it's possible to determine the grid distance of cells, with the same limitations of h3ToIjk. This is done under the assumption that the distances that are of interest are below the granularity of a base cell (at that resolution, haversine may be a better approach.) Along with this change, CoordIJK is moved to the public API.
Also change some tests to stack allocation and remove an `else if` chain.
Also updated some copyright headers
Until better solutions are available this sure is handy for lots of use cases. But I concur with @dfellis that a 2D |
I agree with @sahrk and I don't even think it would be that hard? Especially if the proposed IJK -> IJ coord transform can be applied (though it looks like Dr. Sahr removed it?) I just referenced this PR from another PR since it seems an efficient polyfill is needed, and a |
@dfellis I removed the straight-forward transformation into the I haven't tested this but I believe it will be:
|
Probably now moot on this particular PR, but @dfellis it occurred to me that for your polyfill use case (and many others) it probably makes sense to keep the orientation of cells constant around the same origin (within the limits of pentagon distortion), even if the resulting 2D components are negative. For that you could use An alternative might be to use two H3 indexes to define a coordinate system, and then pass that coordinate system to |
@dfellis I'd like to point out regarding uber/h3-java#21 that this approach does support crossing icosahedron edges, it doesn't support crossing more than one base cell, and it doesn't support crossing pentagons. @sahrk like you said, I think we need to preserve orientation. I don't have any objection to adding CoordIJ and related public API functions, but I am wondering if adding them in a separate PR makes more sense. My main concern is that |
@isaacbrodsky if you preprocess the polygons into an array of polygons intersected by the icosahedron faces, that point is moot. |
@isaacbrodsky some of this discussion was based on the code confusion over whether @dfellis if |
My plan is to merge this early next week. It might be best to open an issue to discuss a public "H3Index to planar coordinates" API, or else that discussion can happen on a separate PR proposing such an API. |
Add h3Distance, and internal h3ToIjk and ijkDistance
Experimental.
Adds an h3ToIjk function which produces users consumable IJK+ coordinates. An "origin" for the coordinate system must be specified, and coordinates are only comparable if they were obtained from the same origin. This is limited to obtaining coordinates for indexes which are on the same base cell or neighboring base cells. Pentagons have more restrictions about crossing the pentagonal distortion. The intention is in the future this could be improved to handle pentagon distortion better, which may make current coordinates invalid. Coordinates should not be persisted or passed between library versions.
Pentagon cases were mostly restricted by experimentally disabling a number of cases.
With IJK+ coordinates, it's possible to determine the grid distance of cells, with the same limitations of h3ToIjk. This is done under the assumption that the distances that are of interest are below the granularity of a base cell (at that resolution, haversine may be a better approach.)
Along with this change, CoordIJK is moved to the public API.
edit: this would be version 3.1.0.