Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorized functions #147

Merged
merged 12 commits into from
Jul 20, 2020
1 change: 1 addition & 0 deletions src/h3/_cy/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -19,3 +19,4 @@ add_cython_file(geo)
add_cython_file(cells)
add_cython_file(edges)
add_cython_file(to_multipoly)
add_cython_file(unstable_vect)
8 changes: 4 additions & 4 deletions src/h3/_cy/h3lib.pxd
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,9 @@ cdef extern from "h3api.h":
int i
int j

H3Index geoToH3(const GeoCoord *g, int res)
H3Index geoToH3(const GeoCoord *g, int res) nogil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you determine which functions get nogil? Is it just the ones you want to optimize right now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added it to the functions I needed it for in this PR. We could probably add nogil to each of the functions in the API, as the nogil tag just promises that the function won't manipulate any Python objects, which is true for everything in the h3-c API.

I'll probably do a more thorough job of this when I tackle #140


void h3ToGeo(H3Index h3, GeoCoord *g)
void h3ToGeo(H3Index h3, GeoCoord *g) nogil

void h3ToGeoBoundary(H3Index h3, GeoBoundary *gp)

Expand Down Expand Up @@ -82,9 +82,9 @@ cdef extern from "h3api.h":

void destroyLinkedPolygon(LinkedGeoPolygon *polygon)

double degsToRads(double degrees)
double degsToRads(double degrees) nogil

double radsToDegs(double radians)
double radsToDegs(double radians) nogil

double hexAreaKm2(int res)

Expand Down
59 changes: 59 additions & 0 deletions src/h3/_cy/unstable_vect.pyx
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
cimport h3lib
from h3lib cimport H3int
from .util cimport deg2coord

from cython cimport boundscheck, wraparound
from libc.math cimport sqrt, sin, cos, asin

cdef double haversineDistance(double th1, double ph1, double th2, double ph2) nogil:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this belong in this binding?

I don't think we provide haversine distance support in any of the other bindings?

Copy link
Contributor Author

@ajfriend ajfriend Jul 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm considering anything under h3.unstable as "not in the binding". There was a request for an optimized cell_haversine function, so I'm putting it in to make it easy for them to play with, and evaluate the speedups.

In the short term, this is hard to do anywhere except within the Cython bindings. In the future however, we could consider making it easy for folks to do this outside the Cython bindings, by letting them write their own Cython code which just does cimport h3 to get access to the C and Cython code in these bindings. Here's an issue around it: #152

I'd say that's probably the way to go in the future.

Copy link
Contributor

@kylebarron kylebarron Sep 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty easy to make a vectorized haversine in numpy only. So for the future if/when vectorized functions make it out of unstable, it might be ideal to only make h3_to_geo vectorized, then the end user can have a vectorized pipeline without touching cython themselves:

# a, b arrays of hex ids
a_geo = h3_to_geo(a)
b_geo = h3_to_geo(b)
dist = haversine(a_geo, b_geo)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will expose a haversine function when uber/h3#377 lands:

https://github.com/uber/h3/blob/930cb1be6efe88ef397fb443027a2a86c0616c88/src/h3lib/include/h3api.h.in#L265-L269

It would be easy to add that to the h3-py "vectorized" API, whenever we figure out the best way to do that. I'd be interesting to compare it to the "numpy-vectorized" version you have above!

cdef:
double dx, dy, dz
double R = 6371.0088

ph1 -= ph2

dz = sin(th1) - sin(th2)
dx = cos(ph1) * cos(th1) - cos(th2)
dy = sin(ph1) * cos(th1)

return asin(sqrt(dx*dx + dy*dy + dz*dz) / 2)*2*R


@boundscheck(False)
@wraparound(False)
cpdef void haversine_vect(
const H3int[:] a,
const H3int[:] b,
double[:] out
) nogil:

cdef h3lib.GeoCoord p1, p2

with nogil:
# todo: add these back in when cython 3.0 comes out
#assert len(a) == len(b)
#assert len(a) <= len(out)

for i in range(len(a)):
h3lib.h3ToGeo(a[i], &p1)
h3lib.h3ToGeo(b[i], &p2)
out[i] = haversineDistance(
p1.lat, p1.lng,
p2.lat, p2.lng
)

@boundscheck(False)
@wraparound(False)
cpdef void geo_to_h3_vect(
const double[:] lat,
const double[:] lng,
int res,
H3int[:] out
) nogil:

cdef h3lib.GeoCoord c

with nogil:
for i in range(len(lat)):
c = deg2coord(lat[i], lng[i])
out[i] = h3lib.geoToH3(&c, res)
4 changes: 2 additions & 2 deletions src/h3/_cy/util.pxd
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
from .h3lib cimport H3int, H3str, GeoCoord

cdef GeoCoord deg2coord(double lat, double lng)
cdef (double, double) coord2deg(GeoCoord c)
cdef GeoCoord deg2coord(double lat, double lng) nogil
cdef (double, double) coord2deg(GeoCoord c) nogil

cpdef H3int hex2int(H3str h) except? 0
cpdef H3str int2hex(H3int x)
Expand Down
4 changes: 2 additions & 2 deletions src/h3/_cy/util.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ from .h3lib cimport H3int, H3str, h3IsValid, h3UnidirectionalEdgeIsValid
cimport h3lib


cdef h3lib.GeoCoord deg2coord(double lat, double lng):
cdef h3lib.GeoCoord deg2coord(double lat, double lng) nogil:
cdef:
h3lib.GeoCoord c

Expand All @@ -15,7 +15,7 @@ cdef h3lib.GeoCoord deg2coord(double lat, double lng):
return c


cdef (double, double) coord2deg(h3lib.GeoCoord c):
cdef (double, double) coord2deg(h3lib.GeoCoord c) nogil:
return (
h3lib.radsToDegs(c.lat),
h3lib.radsToDegs(c.lng)
Expand Down
1 change: 1 addition & 0 deletions src/h3/unstable/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@
)

from . import v4
from . import vect
52 changes: 52 additions & 0 deletions src/h3/unstable/vect.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
from .._cy import unstable_vect as _vect

import numpy as np


def geo_to_h3(lats, lngs, res):
"""
Convert arrays describing lat/lng paris to cells.

Parameters
----------
lats, lngs : arrays of floats

res: int
Resolution for output cells.

Returns
-------
array of H3Cells
"""
assert len(lats) == len(lngs)

out = np.zeros(len(lats), dtype='uint64')

_vect.geo_to_h3_vect(lats, lngs, res, out)

return out


def cell_haversine(a, b):
"""
Compute haversine distance between the centers of cells given in
arrays a and b.


Parameters
----------
a, b : arrays of H3Cell

Returns
-------
float
Haversine distance in kilometers
"""

assert len(a) == len(b)

out = np.zeros(len(a), dtype='double')

_vect.haversine_vect(a, b, out)

return out