Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DataFrame.unnest/2 #758

Merged
merged 1 commit into from
Dec 7, 2023

Conversation

costaraphael
Copy link
Contributor

Building on top of #756, this adds the DataFrame.unnest/2 function.

Copy link
Member

@philss philss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@josevalim
Copy link
Member

Do we need both unnest and explode? Maybe it should be a single function that figures it out based if the column type is a list or a struct? 🤔

@cigrainger
Copy link
Member

cigrainger commented Dec 6, 2023

Do we need both unnest and explode? Maybe it should be a single function that figures it out based if the column type is a list or a struct? 🤔

Unnest is column-wise and explode is row-wise, so that's the rub. Where it gets interesting is in lists of structs and, similarly, the kind of nesting that you can do with nested dataframes in R. So in R, extreme 'tidy-ness' leads to interesting patterns, like so: https://r4ds.had.co.nz/many-models.html#nested-data

I think your question logically leads to the question of whether we want to support that sort of thing and, if so, how. I think it's really powerful and would love to support it. I think having a good, intuitive API around unnesting is essential to making it work.

So, for example, would explode on a list of structs also unnest or do you have to do those steps one after the other?Could we support doing both through an option or a third function? Would unifying to unnest feel unintuitive if you're just exploding a list column?

Edit: if we have a path to represent trained Scholar models as struct dtypes now, that basically completes the circle.

Copy link
Member

@josevalim josevalim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the differences now, thanks @cigrainger and @costaraphael! Let's ship this.

@philss philss merged commit 48ca4e6 into elixir-explorer:main Dec 7, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants