Implement "nlargest()", "nsmallest()" methods. #579

garciparedes · 2020-02-05T15:47:13Z

It would be interesting to implement these methods on the DataFrame class. I think the memory needed to compute these values shouldn't be large, and they could provide really insightful indicators.
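To illustrate the memory point, here is a minimal sketch of a top-n computation over data that arrives in chunks. It uses only the Python standard library and is not vaex code; the function name `chunked_nlargest` and the sample chunks are made up for illustration. The idea is that only `n` values are kept at any time, no matter how many rows are scanned.

```python
import heapq
from typing import Iterable, List, Sequence


def chunked_nlargest(chunks: Iterable[Sequence[float]], n: int) -> List[float]:
    """Return the n largest values seen across all chunks, in descending order.

    Hypothetical helper for illustration only; not part of vaex or pandas.
    """
    if n <= 0:
        return []
    heap: List[float] = []  # min-heap holding the best n values seen so far
    for chunk in chunks:
        for value in chunk:
            if len(heap) < n:
                heapq.heappush(heap, value)
            elif value > heap[0]:
                # Drop the smallest kept value and keep the new one instead.
                heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)


# Made-up sample data, split into chunks as an out-of-core reader might deliver it.
chunks = [[3.0, 9.5, 1.2], [7.7, 0.4], [8.1, 9.9, 2.0]]
top3 = chunked_nlargest(chunks, 3)
print(top3)  # [9.9, 9.5, 8.1]
```

The same approach gives `nsmallest()` by negating the comparison, and the extra memory stays proportional to `n` rather than to the number of rows.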

Here is the pandas counterpart documentation:


maartenbreddels · 2020-02-07T19:55:57Z

Hi Sergio,

Yes, I think we can do this. It's a bit tricky with how vaex works internally, but I'll keep this in mind when doing some refactoring work. Let's keep this issue open as a reminder.

cheers,

Maarten

garciparedes · 2020-02-08T12:33:21Z

Hi @maartenbreddels,

Thank you so much for accepting this feature request. 🙂

If you don't mind, I have a question about why the vaex implementation of this kind of statistic would be a bit tricky. Is the reason related to the "vectorial" nature of the output?

maartenbreddels · 2020-02-12T07:28:09Z

It has to do with how vaex filters. Vaex always works with the unfiltered raw data, which means it is always tricky to map between an unfiltered index (say, the 6th element of the unfiltered array) and the filtered index.
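A rough sketch of that index mapping, in plain Python rather than vaex internals. It assumes the filter can be represented as a boolean mask over the raw rows; the helper names `filtered_to_raw_indices` and `nlargest_filtered` and the sample data are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple


def filtered_to_raw_indices(mask: Sequence[bool]) -> List[int]:
    """Entry i is the raw (unfiltered) index of the i-th row that passes the filter."""
    return [raw for raw, keep in enumerate(mask) if keep]


def nlargest_filtered(
    values: Sequence[float], mask: Sequence[bool], n: int
) -> List[Tuple[int, float]]:
    """Return (raw_index, value) pairs for the n largest values that pass the filter."""
    candidates = (
        (raw, value)
        for raw, (value, keep) in enumerate(zip(values, mask))
        if keep
    )
    return heapq.nlargest(n, candidates, key=lambda pair: pair[1])


# Made-up raw column and filter mask; the mask stands in for a DataFrame filter.
values = [5.0, 42.0, 7.0, 13.0, 99.0, 21.0]
mask = [True, False, True, True, False, True]

raw_index = filtered_to_raw_indices(mask)
print(raw_index)  # [0, 2, 3, 5]

top2 = nlargest_filtered(values, mask, 2)
print(top2)  # [(5, 21.0), (3, 13.0)]

# The 6th raw element (raw index 5) is the 4th row of the filtered view.
print(raw_index.index(5))  # 3
```

The largest raw values (99.0 and 42.0) are excluded by the mask, so the result refers to raw indices 5 and 3, which correspond to filtered positions 3 and 2. Keeping both views consistent is the bookkeeping described above.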

maartenbreddels added the feature-request label Feb 7, 2020

maartenbreddels mentioned this issue Jun 16, 2020

groupby-agg all columns based on one column #828

Open
