Remove cudf._lib.orc in favor of inlining pylibcudf #17466

Merged 10 commits into rapidsai:branch-25.02 on Dec 6, 2024

Conversation

mroeschke
Contributor

Description

Contributes to #17317

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke added the Python, improvement, and non-breaking labels on Nov 27, 2024
@mroeschke self-assigned this on Nov 27, 2024
@mroeschke requested a review from a team as a code owner on November 27, 2024 23:51
@github-actions bot added the CMake and pylibcudf labels on Nov 27, 2024
Comment on lines 1549 to 1550
# pyarrow. To make sure the cudf format is interperable
# in arrow, we use `int8` type when converting from a
Contributor

Suggested change
- # pyarrow. To make sure the cudf format is interperable
- # in arrow, we use `int8` type when converting from a
+ # pyarrow. To make sure the cudf format is interoperable
+ # with arrow, we use `int8` type when converting from a

Comment on lines 32 to 35
try:
    import ujson as json  # type: ignore[import-untyped]
except ImportError:
    import json
Contributor

question: Why this? I think we're only using this to load the (tiny) metadata json dict, so it seems perhaps unnecessary?

Contributor Author

I'm fine with just using json; I was just porting over the equivalent behavior
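For illustration, a minimal sketch of the simplification discussed here, assuming the metadata is a small JSON string. The payload and the `load_metadata` helper are hypothetical and not taken from the cudf code base; the point is only that the standard-library `json` module covers this use without an optional dependency.

```python
import json

# Hypothetical stand-in for a small metadata payload (not the real cudf schema).
raw_metadata = '{"index_columns": ["idx"], "columns": [{"name": "a"}]}'


def load_metadata(payload: str) -> dict:
    """Parse a small JSON metadata string with the standard library only."""
    return json.loads(payload)


meta = load_metadata(raw_metadata)
```

For a payload this small, parsing speed is unlikely to matter, which is the reviewer's argument for dropping the `ujson` fallback.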

Comment on lines 289 to 298
def __init__(self, Table table):
    self.c_obj = table_input_metadata(table.view())
    self.column_metadata = [

@property
def column_metadata(self):
    return [
        ColumnInMetadata.from_libcudf(&self.c_obj.column_metadata[i], self)
        for i in range(self.c_obj.column_metadata.size())
    ]

Contributor

question: What is this change needed for?

Contributor Author

Previously this was an (unused) cdef attribute, but the cudf Python ORC handling modifies column_metadata in place, so I needed to expose this attribute in Python space.
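A pure-Python analogy of the pattern described above, as a rough sketch only. The class names `TableMeta` and `ColumnMeta` and the `set_name` method are hypothetical stand-ins, and a list of dicts replaces the underlying C++ vector. The idea is that the property builds fresh wrapper objects that point into the owner's storage, so changes made through a wrapper land in the owner.

```python
class ColumnMeta:
    """Hypothetical view onto one slot of the owner's storage."""

    def __init__(self, storage, index, owner):
        self._storage = storage
        self._index = index
        self._owner = owner  # keep the owner alive while this view exists

    def set_name(self, name):
        # Writes go straight to the owner's storage, not to a copy.
        self._storage[self._index]["name"] = name
        return self

    @property
    def name(self):
        return self._storage[self._index]["name"]


class TableMeta:
    """Hypothetical owner; a list of dicts stands in for the C++ vector."""

    def __init__(self, num_columns):
        self._c_obj = [{"name": ""} for _ in range(num_columns)]

    @property
    def column_metadata(self):
        # New wrappers on every access, all referencing the same storage.
        return [ColumnMeta(self._c_obj, i, self) for i in range(len(self._c_obj))]


meta = TableMeta(2)
meta.column_metadata[0].set_name("a")
```

Because each wrapper references the shared storage, a later access to `column_metadata` sees the name set through an earlier wrapper.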

Contributor

Can you add it as a property in types.pyi if it is not already done?

Contributor Author

Sure. Added in 5e27f2a
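As a rough idea of what such a stub entry could look like (the contents of the referenced commit are not shown in this thread, so this is an assumption): the class name `TableInputMetadata`, the placeholder `Table` class, and the return annotation below are all hypothetical.

```python
from __future__ import annotations


class Table: ...  # placeholder for the real table type (assumption)


class ColumnInMetadata: ...  # placeholder; name taken from the diff above


class TableInputMetadata:  # hypothetical Python-level class name
    def __init__(self, table: Table) -> None: ...
    @property
    def column_metadata(self) -> list[ColumnInMetadata]: ...
```

Declaring the attribute as a read-only property in the stub lets type checkers see it without implying that the list itself can be reassigned.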

@mroeschke
Contributor Author

/merge

@rapids-bot merged commit b6f7e6e into rapidsai:branch-25.02 on Dec 6, 2024
104 of 105 checks passed
@mroeschke deleted the cudf/_lib/orc branch on December 6, 2024 20:58
Labels
CMake (CMake build issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), pylibcudf (Issues specific to the pylibcudf package), Python (Affects Python cuDF API.)
Projects
Status: Done
2 participants