Inconsistency in the result of groupby with multiple columns #2049
Unanswered
Eisbrenner
asked this question in
Q&A
-
I have now tested the result using NumPy for the histogram instead (the grouping for uniques is still done by vaex), and the issue persists, so it must lie in the way I use vaex for the binning of uniques.

import matplotlib.pyplot as plt
import numpy as np
import vaex


def nphist(data, expression, iminmax, jminmax):
    # Reduce to one row per unique (id, ii, ij) combination.
    groups = data.df[expression].groupby(
        ["id", "ii", "ij"],
        agg={
            "id": vaex.agg.first("id", "id"),
            "subvol": vaex.agg.first("subvol", "id"),
        },
    )
    # Sum subvol onto the (ii, ij) grid, one bin per integer index.
    z, y, x = np.histogram2d(
        groups.ii.values,
        groups.ij.values,
        bins=(
            int(np.max(iminmax) + 1 - np.min(iminmax)),
            int(np.max(jminmax) + 1 - np.min(jminmax)),
        ),
        weights=groups.subvol.values,
        range=[
            [np.min(iminmax), np.max(iminmax)],
            [np.min(jminmax), np.max(jminmax)],
        ],
    )
    # Transpose so that rows correspond to ij and columns to ii.
    return z.T


# data, expression1, expression2, ienw and jenn come from my own setup.
f, axs = plt.subplots(ncols=2, nrows=1, figsize=(10, 4))
axs.flat[0].contourf(nphist(data, expression1, ienw, jenn))
axs.flat[1].contourf(nphist(data, expression2, ienw, jenn))
f.patch.set_facecolor("w")
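One possible way to narrow this down further is to do the unique step in NumPy as well, and to fill the grid by explicit integer index as a cross-check against the histogram. The sketch below is only an illustration under assumptions: the toy arrays stand in for the real id, ii, ij and subvol columns, the helpers unique_first and grid_by_index are made-up names that are not part of vaex or of the code above, and ii and ij are assumed to be integer indices.

```python
import numpy as np


def unique_first(ids, ii, ij, subvol):
    """Keep the first row of each distinct (id, ii, ij) triple."""
    keys = np.column_stack([ids, ii, ij])
    # return_index gives the position of the first occurrence of each unique row.
    _, first = np.unique(keys, axis=0, return_index=True)
    first.sort()  # keep the surviving rows in their original order
    return ii[first], ij[first], subvol[first]


def grid_by_index(ii, ij, weights, imin, imax, jmin, jmax):
    """Sum weights into a (j, i) grid using the integer indices directly."""
    grid = np.zeros((jmax - jmin + 1, imax - imin + 1))
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(grid, (ij - jmin, ii - imin), weights)
    return grid


# Hypothetical toy data: rows 0/1 and rows 2/3 are duplicates.
ids = np.array([1, 1, 2, 2, 3, 3])
ii = np.array([0, 0, 1, 1, 2, 0])
ij = np.array([0, 0, 1, 1, 0, 0])
subvol = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 4.0])

ii_u, ij_u, w_u = unique_first(ids, ii, ij, subvol)
grid = grid_by_index(ii_u, ij_u, w_u, imin=0, imax=2, jmin=0, jmax=1)

# Same binning convention as nphist above, for comparison.
hist, _, _ = np.histogram2d(
    ii_u, ij_u, bins=(3, 2), weights=w_u, range=[[0, 2], [0, 1]]
)
print(grid)
print(np.array_equal(grid, hist.T))
```

If the two grids agree on the real data when the NumPy unique step is used but differ when the rows come from the vaex groupby, that would point at the grouping output (for example its row order or contents) rather than at the histogram call.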
-
I have a snippet of code (below) from which, unfortunately, I couldn't create a small working example.
However, I binby three columns to get uniques and export the result to an xarray object, and this process sometimes works and sometimes messes up the coordinates. In the figures below, I think the process succeeded until it concatenated some data below the rest, even though it was supposed to go above.
For reference, I have the plots of the expression without the process of filtering for uniques.
And then, when filtering, the right one is "broken". It should be a subset of the right plot in the first figure, but it has this weird bar at the bottom, which I think should go above.
So what am I messing up? :-/
Non-reproducible code below: