Compute area-weighted statistics #640
"minority": float(keys[counts.tolist().index(counts.min())].tolist())
if valid_pixels
else numpy.nan,
"mean": float(array.mean()),
We don't use the weighted array for `min` and `max` (like https://github.com/isciences/exactextract#supported-statistics).
"std": float(array.std()),
"median": float(numpy.ma.median(array)),
"majority": majority,
"minority": minority,
"unique": float(counts.size),
**dict(zip(percentiles_names, percentiles_values)),
🤔 maybe the percentiles should use the weighted array?
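For reference, one way a weighted percentile could be computed is sketched below. This is an assumption for discussion, not what the PR implements: `weighted_percentile` is a hypothetical helper, it takes plain (unmasked) arrays, and it uses the "smallest value whose cumulative weight share reaches q" definition, which is only one of several possible conventions.

```python
import numpy as np


def weighted_percentile(values, weights, q):
    """Return the q-th percentile (0..100) of `values` weighted by `weights`.

    Hypothetical helper: picks the smallest value whose cumulative share of
    the total weight is at least q / 100 (no interpolation).
    """
    values = np.asarray(values, dtype="float64").ravel()
    weights = np.asarray(weights, dtype="float64").ravel()

    # Pixels with zero weight cannot influence the result.
    keep = weights > 0
    values, weights = values[keep], weights[keep]

    # Sort by value and accumulate the normalized weights.
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cumulative = np.cumsum(weights) / weights.sum()

    # First position where the cumulative share reaches the target.
    idx = np.searchsorted(cumulative, q / 100.0, side="left")
    return float(values[min(idx, values.size - 1)])


values = np.array([1, 2, 3, 4])
weights = np.array([0.5, 0.0, 1.0, 0.25])
median = weighted_percentile(values, weights, 50)
```

With these weights the pixel valued 2 is ignored entirely, and the pixel valued 3 carries most of the total weight.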
# 3, 4
data = np.ma.array((1, 2, 3, 4)).reshape((1, 2, 2))

# Coverage Array
Maybe we should call this the `weight` array instead of `coverage`? `coverage` doesn't seem to make the most sense to me here.
In https://github.com/isciences/exactextract, weights are really weights, while the coverage is called `cell coverage fractions`. I didn't want to confuse people 🤷
I see, `coverage` came from exactextract because their weights are usually derived from the coverage of each polygon in the cell. I think `weight` is more general than `coverage`, though. There could be use cases for weighted zonal stats that aren't partial coverage.
`coverage_weights`?
Would we ever have both `coverage` and `weights`?
It looks like exactextract does support both `coverage` and `weight` in effect, because it allows `weight` as a parameter and it computes `coverage` itself.
> Would we ever have both coverage and weights?

Not planned right now, but I don't want to be blocked in the future. I agree that `coverage` is not a perfect name, but I don't want to use `weights` because it's not weights but a spatial fraction. I'm open to changing to a better name if you have ideas :D
(self.height, cover_scale, self.width, cover_scale)
).astype("float32")

return cover_array.sum(-1).sum(1) / (cover_scale**2)
Slightly modified version of perrygeo/python-rasterstats#136.
@sgoodm I know ☝️ is a 7-year-old PR (😅), but I hope you don't mind that I reused some of the code here 🙏
Not at all, @jacobwhall and I have been pushing use of COGs in our work at @aiddata and are fans of these projects, so happy to share some code to support 👍 👍
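The reshape-and-sum step discussed in this thread can be illustrated in isolation. The sketch below assumes the shape has already been rasterized onto a grid `cover_scale` times finer than the data (for example with `rasterio.features.rasterize`); here a hand-built mask stands in for that step, and `coverage_from_fine_mask` is a hypothetical name, not part of rio-tiler.

```python
import numpy as np


def coverage_from_fine_mask(fine_mask, cover_scale):
    """Downsample a fine 0/1 mask into per-pixel coverage fractions.

    Each output pixel corresponds to a (cover_scale x cover_scale) block of
    the fine mask; its coverage is the share of that block set to 1.
    """
    fine_mask = np.asarray(fine_mask, dtype="float32")
    height = fine_mask.shape[0] // cover_scale
    width = fine_mask.shape[1] // cover_scale

    # Split each axis into (coarse index, position inside the block).
    blocks = fine_mask.reshape((height, cover_scale, width, cover_scale))

    # Sum over both in-block axes, then normalize by the block area.
    return blocks.sum(-1).sum(1) / (cover_scale**2)


# 2x2 output grid, supersampled by a factor of 2 (so a 4x4 fine mask).
fine = np.zeros((4, 4), dtype="uint8")
fine[0:2, 0:1] = 1  # half of the top-left pixel
fine[2:4, 0:2] = 1  # all of the bottom-left pixel
fine[3, 2] = 1      # a quarter of the bottom-right pixel

coverage = coverage_from_fine_mask(fine, cover_scale=2)
```

A larger `cover_scale` gives a finer approximation of the true covered fraction, at the cost of rasterizing a proportionally larger mask.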
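Note: we don't have a BaseReader method that returns statistics for a GeoJSON Feature, but this is what it could look like:

```python
from rio_tiler.constants import WGS84_CRS
from rio_tiler.io import Reader

# `path` (dataset URL) and `shape` (GeoJSON Feature) are supplied by the caller.
with Reader(path) as src_dst:
    data = src_dst.feature(
        shape,
        shape_crs=WGS84_CRS,
    )
    coverage_array = data.get_coverage_array(
        shape, shape_crs=WGS84_CRS
    )
    stats = data.statistics(coverage=coverage_array)
```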
ref NASA-IMPACT/veda-backend#226
This PR adds a `coverage` option to the `get_array_statistics` function to enable area-weighted statistics. This PR does not take care of the `coverage` array creation, which should be done by the client application (maybe with some helper in rio-tiler).

cc @kylebarron @j08lue
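As a rough illustration of what area-weighted statistics mean here, the sketch below computes a coverage-weighted mean and standard deviation with plain NumPy. It is not the code from this PR: `weighted_mean_std` is a hypothetical helper, and it assumes `coverage` holds the fraction (0 to 1) of each pixel covered by the shape.

```python
import numpy as np


def weighted_mean_std(values, coverage):
    """Coverage-weighted mean and (population) standard deviation.

    `values` may be a masked array; masked pixels are excluded from both
    the weighted sums and the total weight.
    """
    values = np.ma.asarray(values, dtype="float64")

    # Apply the data mask to the weights so masked pixels carry no weight.
    weights = np.ma.array(
        coverage, mask=np.ma.getmaskarray(values), dtype="float64"
    )

    total = weights.sum()
    mean = (values * weights).sum() / total
    variance = (weights * (values - mean) ** 2).sum() / total
    return float(mean), float(np.sqrt(variance))


data = np.ma.array([[1, 2], [3, 4]])
coverage = np.array([[0.5, 0.0], [1.0, 0.25]])

mean, std = weighted_mean_std(data, coverage)
```

With a coverage of all ones this reduces to the ordinary mean and standard deviation, which is a quick sanity check for any weighted implementation.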
To Do
- `coverage` utility functions