Add transpose API to pylibcudf #16749
Looks like this test expects the data pointer to be exposed after transpose:
______________________________ test_df_transpose _______________________________
[gw7] linux -- Python 3.10.14 /opt/conda/envs/test/bin/python3.10
manager = <SpillManager device_memory_limit=N/A | 0B spilled | 57B (28%) unspilled (unspillable)>

    def test_df_transpose(manager: SpillManager):
        df1 = cudf.DataFrame({"a": [1, 2]})
        df2 = df1.transpose()
        # For now, all buffers are marked as exposed
        assert df1._data._data["a"].data.owner.exposed
>       assert df2._data._data[0].data.owner.exposed
E       assert False
E        +  where False = <cudf.core.buffer.spillable_buffer.SpillableBufferOwner object at 0x7f6b538952d0>.exposed
E        +    where <cudf.core.buffer.spillable_buffer.SpillableBufferOwner object at 0x7f6b538952d0> = SpillableBuffer(owner=<cudf.core.buffer.spillable_buffer.SpillableBufferOwner object at 0x7f6b538952d0>, offset=0, size=8).owner
E        +      where SpillableBuffer(owner=<cudf.core.buffer.spillable_buffer.SpillableBufferOwner object at 0x7f6b538952d0>, offset=0, size=8) = <cudf.core.column.numerical.NumericalColumn object at 0x7f6b53886830>\n[\n 1\n]\ndtype: int64.data

tests/test_spilling.py:580: AssertionError
Would this require
python/cudf/cudf/_lib/transpose.pyx
Outdated
-    # Notice, the data pointer of `result_owner` has been exposed
-    # through `c_result.second` at this point.
-    result_owner = Column.from_unique_ptr(
-        move(c_result.first), data_ptr_exposed=True
-    )
-    return columns_from_table_view(
-        c_result.second,
-        owners=[result_owner] * c_result.second.num_columns()
+    input_table = plc.table.Table(
+        [col.to_pylibcudf(mode="read") for col in source_columns]
+    )
+    _, result_table = plc.transpose.transpose(input_table)
+    return [Column.from_pylibcudf(col) for col in result_table.columns()]
@madsbk: can you remind me what it means that the result_owner is exposed through the table (c_result.second)?
Is it that we have, now, two Buffers that point to the same data, and therefore if we were to spill one, we would need to spill the other?
I think this is right, and so yes, I think we do need (@mroeschke) to have a way of marking a column's data as exposed when we import it from pylibcudf.
> Is it that we have, now, two Buffers that point to the same data, and therefore if we were to spill one, we would need to spill the other?
Yes
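To make the aliasing concern concrete, here is a minimal pure-Python sketch. It is not cudf's implementation: `ToyOwner`, `expose_ptr`, and `spill` are hypothetical names, and the policy of simply refusing to spill an exposed owner is an assumption chosen for illustration. The point is only that once a second object can see the same memory, the owner can no longer move that memory on its own.

```python
class ToyOwner:
    """Hypothetical stand-in for a spillable buffer owner (not cudf's class)."""

    def __init__(self, payload: bytes):
        self._device = bytearray(payload)  # pretend this is device memory
        self._host = None                  # filled in once the data is spilled
        self.exposed = False

    def expose_ptr(self) -> memoryview:
        # Handing out a raw view lets outside code alias the memory,
        # so the owner is marked as exposed from this point on.
        self.exposed = True
        return memoryview(self._device)

    def spill(self) -> bool:
        # Assumed policy: an exposed owner is never spilled, because moving
        # the data would invalidate whatever else points at it.
        if self.exposed or self._device is None:
            return False
        self._host = bytes(self._device)
        self._device = None
        return True


private = ToyOwner(b"\x01\x02")
shared = ToyOwner(b"\x03\x04")
alias = shared.expose_ptr()  # a second object now points at the same bytes

spilled_private = private.spill()  # nothing else references it, so it moves
spilled_shared = shared.spill()    # refused: `alias` would dangle otherwise
```

In this toy model the transpose case corresponds to `shared`: the table view plays the role of `alias`, so the owner has to be flagged as exposed before anything tries to spill it.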
> I think this is right, and so yes, I think we do need (@mroeschke) to have a way of marking a column's data as exposed when we import it from pylibcudf.
There is a data_ptr_exposed keyword in from_pylibcudf that currently isn't implemented. I think we need to pass that parameter through to the exposed keyword in as_buffer?
Yes, sounds right
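The pass-through proposed above could look roughly like the sketch below. Only the keyword names `data_ptr_exposed` and `exposed` come from this thread; `ToyBuffer`, `toy_as_buffer`, and `toy_from_pylibcudf` are hypothetical stand-ins, and the real cudf signatures may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyBuffer:
    """Hypothetical buffer record; cudf's real Buffer is more involved."""
    data: bytes
    exposed: bool


def toy_as_buffer(data: bytes, *, exposed: bool = False) -> ToyBuffer:
    # Stand-in for as_buffer: records whether the pointer is already shared.
    return ToyBuffer(data=data, exposed=exposed)


def toy_from_pylibcudf(col_data: bytes, *, data_ptr_exposed: bool = False) -> ToyBuffer:
    # Stand-in for from_pylibcudf: forward the caller's flag instead of
    # silently dropping it, which is the gap described in the comment above.
    return toy_as_buffer(col_data, exposed=data_ptr_exposed)


default_buf = toy_from_pylibcudf(b"\x01")
shared_buf = toy_from_pylibcudf(b"\x01", data_ptr_exposed=True)
```

Defaulting the flag to False keeps existing callers unchanged, while a caller that knows the data is aliased (as in transpose) can opt in explicitly.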
It looks like this was addressed in #16760
Yes, it looks like the necessary parameter was handled there so this PR should be safe to merge now.
/merge
Description
Contributes to #15162