Feat: add skew and kurtosis aggregators. #1946

Merged: 1 commit, Mar 25, 2022

JovanVeljanoski
Member

Notes:

Checklist:

  • implement 'df.skew' and 'vaex.agg.skew'
  • implement 'df.kurtosis' and 'vaex.agg.kurtosis'
  • Unit tests for everything
  • Code review

@maartenbreddels force-pushed the feat_agg_skew_kurthosis branch from 2574425 to 95c9e9e (March 7, 2022 16:26)
@maartenbreddels merged commit da58927 into master (Mar 25, 2022)
@maartenbreddels
Member

nice work!

@maartenbreddels
Member

A shorter alternative would be to compose skew from the existing sum-of-moments and count aggregators, something like this:

@@ -574,7 +576,17 @@ def var(expression, ddof=0, selection=None, edges=False):
 @register
 def skew(expression, selection=None, edges=False):
     '''Create a skew aggregation.'''
-    return AggregatorDescriptorSkew('skew', expression, 'skew', selection=selection, edges=edges)
+
+    sum_moment1 = _sum_moment(expression, 1, selection=selection, edges=edges)
+    sum_moment2 = _sum_moment(expression, 2, selection=selection, edges=edges)
+    sum_moment3 = _sum_moment(expression, 3, selection=selection, edges=edges)
+    count_ = count(expression, selection=selection, edges=edges)
+
+    m1 = sum_moment1 / count_
+    m2 = sum_moment2 / count_
+    m3 = sum_moment3 / count_
+    skew = (m3 - 3*m1*m2 + 2*m1**3) / (m2 - m1**2)**(3/2)
+    return skew
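
For anyone who wants to sanity-check the raw-moment formula in the diff above outside of vaex, here is a minimal NumPy sketch. It is not vaex code: the function names are hypothetical, and plain array sums stand in for `_sum_moment` and `count`. The kurtosis variant is the analogous fourth-moment expansion; reporting it as excess kurtosis (subtracting 3) is an assumption, since the convention is not stated in this thread.

```python
import numpy as np


def skew_from_raw_moments(x):
    """Skewness from raw moments, mirroring the formula in the diff."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m1 = np.sum(x) / n        # E[x]
    m2 = np.sum(x ** 2) / n   # E[x^2]
    m3 = np.sum(x ** 3) / n   # E[x^3]
    # Third central moment divided by variance**(3/2).
    return (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / (m2 - m1 ** 2) ** 1.5


def kurtosis_from_raw_moments(x):
    """Excess kurtosis from raw moments (the 'excess' convention is an assumption)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m1 = np.sum(x) / n
    m2 = np.sum(x ** 2) / n
    m3 = np.sum(x ** 3) / n
    m4 = np.sum(x ** 4) / n   # E[x^4]
    variance = m2 - m1 ** 2
    # Fourth central moment expanded in terms of raw moments.
    central4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return central4 / variance ** 2 - 3.0


def skew_reference(x):
    """Direct central-moment definition, used only as a cross-check."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5


def kurtosis_reference(x):
    """Direct central-moment definition of excess kurtosis, for cross-checking."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2 - 3.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.exponential(size=1000)
    print(skew_from_raw_moments(sample), skew_reference(sample))
    print(kurtosis_from_raw_moments(sample), kurtosis_reference(sample))
```

One thing to keep in mind with this approach: raw-moment expansions subtract large, nearly equal terms when the mean is large relative to the spread, so they can lose precision compared with a centered computation.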

Successfully merging this pull request may close these issues.

[FEATURE-REQUEST] prod / skew / kurtosis aggregations