Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Refactor Column/NamedColumn split in cudf-polars #16272

Closed
wence- opened this issue Jul 12, 2024 · 0 comments · Fixed by #16962
Closed

[FEA] Refactor Column/NamedColumn split in cudf-polars #16272

wence- opened this issue Jul 12, 2024 · 0 comments · Fixed by #16962
Assignees
Labels
0 - Backlog In queue waiting for assignment cudf.polars Issues specific to cudf.polars feature request New feature or request

Comments

@wence-
Copy link
Contributor

wence- commented Jul 12, 2024

Currently NamedColumn inherits from Column, however, because the constructor of the former takes an additional (required) argument, parent class factory methods that return Self cannot be used for the child class.

This is painful because it means that we must replicate the code: one can't even do super().bla because if that constructs a Self it will have a name missing.

Really, named columns only appears when constructing dataframes: when evaluating expressions, the column can safely drop the name.

In the expression evaluator, the only place a namedcolumn appears is when evaluating a Col expression (which pulls a column out of a dataframe). But at that point, we could drop the name.

This suggests that we shouldn't have that a NamedColumn is-a Column, but rather a NamedColumn has-a Column (and a name).

@wence- wence- added feature request New feature or request 0 - Backlog In queue waiting for assignment cudf.polars Issues specific to cudf.polars labels Jul 12, 2024
@wence- wence- self-assigned this Jul 12, 2024
wence- added a commit to wence-/cudf that referenced this issue Aug 29, 2024
This returns a pylibcudf column, so no metadata, which is awkward. We
will change that when rapidsai#16272 is fixed.
wence- added a commit to wence-/cudf that referenced this issue Sep 2, 2024
This returns a pylibcudf column, so no metadata, which is awkward. We
will change that when rapidsai#16272 is fixed.
wence- added a commit to wence-/cudf that referenced this issue Oct 1, 2024
Everything in the expression evaluation now operates on columns
without names. DataFrame construction takes either a mapping from
string-valued names to columns, or a sequence of pairs of names and
columns.

This removes some duplicate code in the NamedColumn class (by removing
it) where we had to fight the inheritance hierarchy.

- Closes rapidsai#16272
wence- added a commit to wence-/cudf that referenced this issue Oct 1, 2024
Everything in the expression evaluation now operates on columns
without names. DataFrame construction takes either a mapping from
string-valued names to columns, or a sequence of pairs of names and
columns.

This removes some duplicate code in the NamedColumn class (by removing
it) where we had to fight the inheritance hierarchy.

- Closes rapidsai#16272
@wence- wence- mentioned this issue Oct 1, 2024
3 tasks
@GPUtester GPUtester moved this from Todo to In Progress in cuDF Python Oct 1, 2024
wence- added a commit to wence-/cudf that referenced this issue Oct 1, 2024
Everything in the expression evaluation now operates on columns
without names. DataFrame construction takes either a mapping from
string-valued names to columns, or a sequence of pairs of names and
columns.

This removes some duplicate code in the NamedColumn class (by removing
it) where we had to fight the inheritance hierarchy.

- Closes rapidsai#16272
wence- added a commit to wence-/cudf that referenced this issue Oct 1, 2024
Everything in the expression evaluation now operates on columns
without names. DataFrame construction takes either a mapping from
string-valued names to columns, or a sequence of pairs of names and
columns.

This removes some duplicate code in the NamedColumn class (by removing
it) where we had to fight the inheritance hierarchy.

- Closes rapidsai#16272
@rapids-bot rapids-bot bot closed this as completed in 219ec0e Oct 8, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuDF Python Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Backlog In queue waiting for assignment cudf.polars Issues specific to cudf.polars feature request New feature or request
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

1 participant