No query or describe - call out common but missing methods? #440
Comments
Thx for the report! I will add mean to the list of available methods, thx for pointing it out. We are making a push to fill out the API at the moment, so I am hopeful that this will change very soon. Describe isn't as trivial as the others unfortunately, since we need to do some groundwork there, but query is easy. You are correct with your assumption about median and percentile: both of them require array integration that isn't built yet (that should change relatively soon as well).
@phofl I'm working with @ianozsvald and I've just started experimenting with dask-expr, replicating some well understood computations that we've also implemented with Dask & Polars. ETA - having worked around the lack of query, performance is very good 👍
That sounds good! Feel free to let me know if you find anything that's slower than you would expect.
Yes that's correct, that is a little bit annoying but introspection on Python UDFs is not reliable unfortunately, that's my main motivation for #386 (which is basically the |
We added describe as well now and will push out a release later today, so closing here.
Having just experimented with dask-expr (it is nice!), might it be worth noting that whilst query isn't available, a mask works as expected, e.g. ddf.query("customer_years < 3 and customer_type=='b'").compute() is the same as the equivalent mask expression.
Also, mean isn't listed as being available on the homepage (outside of a groupby/resample) but it worked ok for me on a column (e.g. ddfx[['d_0']].mean()).
It would be lovely to see .describe() just because it is used early in a workflow, so it is likely that new users will hit its absence (which presumably means you'd need median, percentile etc).
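For readers who want to see the mask workaround spelled out, here is a minimal sketch in plain pandas. The data is hypothetical; only the column names customer_years and customer_type come from the report above. The assumption is that the same boolean-mask expression carries over unchanged to a dask-expr dataframe, followed by .compute(), as the report describes.

```python
import pandas as pd

# Hypothetical data; only the column names are taken from the issue text.
df = pd.DataFrame(
    {
        "customer_years": [1, 2, 5, 2, 7],
        "customer_type": ["a", "b", "b", "b", "a"],
    }
)

# String-based filter, as written in the report.
via_query = df.query("customer_years < 3 and customer_type == 'b'")

# Equivalent boolean mask: combine the conditions with & and
# parenthesise each comparison, since & binds tighter than < and ==.
mask = (df["customer_years"] < 3) & (df["customer_type"] == "b")
via_mask = df[mask]

# Both approaches select the same rows.
assert via_query.equals(via_mask)
```

The mask form needs no string parsing, so it is a reasonable fallback wherever query is unavailable.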