[FEA] Implement all libcudf modules required by cuDF Python in pylibcudf #15162

Closed
vyasr opened this issue Feb 27, 2024 · 1 comment
Labels: feature request (New feature or request); pylibcudf (Issues specific to the pylibcudf package); Python (Affects Python cuDF API)

vyasr commented Feb 27, 2024

Is your feature request related to a problem? Please describe.
pylibcudf is intended to provide a low-level Python interface to the libcudf C++ API. cuDF's internals will ultimately be refactored to depend on pylibcudf. As a first step, we need to expose all libcudf algorithms used by cuDF Cython in pylibcudf.

Describe the solution you'd like
This is a tracking issue for APIs to expose in Cython. The APIs are grouped based on the pxd file exposing libcudf APIs in Cython, which roughly corresponds to namespaces in libcudf.

| Module | PRs (or assignees) | Notes |
| --- | --- | --- |
| aggregation.pxd | #14945, #14970 | |
| binaryop.pxd | #14821 | |
| column/column.pxd | #13562 | pylibcudf Columns share ownership |
| column/column_factories.pxd | #15257 | |
| column/column_view.pxd | #13562 | pylibcudf Columns share ownership |
| concatenate.pxd | #15011 | |
| contiguous_split.pxd | #16953 | |
| copying.pxd | #13562, #14508, #14640 | |
| datetime.pxd | #15916, #16776, #17143 | |
| expressions.pxd | #16056 | |
| filling.pxd | #15225 | |
| groupby.pxd | #14945 | |
| hash.pxd | #15418 | |
| interop.pxd | #17055 | |
| io/arrow_io_source.pxd | #16050 | |
| io/avro.pxd | #15899 | |
| io/csv.pxd | #16011, #17163 | |
| io/data_sink.pxd | | I think this is finished once we're porting the remaining IO writers. We should also delete helper utility functions in cudf._lib if possible (eg. make_sink_info) |
| io/datasource.pxd | #16050 | |
| io/json.pxd | #15952, #15966 | |
| io/orc.pxd | #16042, #17310 | |
| io/orc_metadata.pxd | #16042 | |
| io/parquet.pxd | #16078, #17245, #17491, #17506 | |
| io/text.pxd | #17232 | |
| io/timezone.pxd | #16771 | |
| io/types.pxd | | |
| join.pxd | #14972 | |
| labeling.pxd | #16761 | |
| lists/combine.pxd | #15928 | |
| lists/contains.pxd | #15981 | |
| lists/count_elements.pxd | #16072 | |
| lists/explode.pxd | #15011 | |
| lists/extract.pxd | #16071 | |
| lists/gather.pxd | #16170 | |
| lists/lists_column_view.pxd | #16175 | |
| lists/sorting.pxd | #16179 | |
| lists/stream_compaction.pxd | #16184 | |
| lists/reverse.pxd | #16185 | |
| merge.pxd | #15011 | |
| null_mask.pxd | #15908 | |
| nvtext/byte_pair_encode.pxd | #17101 | |
| nvtext/edit_distance.pxd | #16957 | |
| nvtext/generate_ngrams.pxd | #17006 | |
| nvtext/jaccard.pxd | #17007 | |
| nvtext/minhash.pxd | #17021 | |
| nvtext/ngrams_tokenize.pxd | #17070 | |
| nvtext/normalize.pxd | #17072 | |
| nvtext/replace.pxd | #17084 | |
| nvtext/stemmer.pxd | #17085 | |
| nvtext/subword_tokenize.pxd | #17096 | |
| nvtext/tokenize.pxd | #17100 | |
| partitioning.pxd | #16781 | |
| quantiles.pxd | #15874 | |
| reduce.pxd | #14970 | |
| replace.pxd | #15005 | |
| reshape.pxd | #15827 | sans byte_cast which is only used by cpp/java |
| rolling.pxd | #14982 | |
| round.pxd | #15863 | |
| scalar/scalar.pxd | #14133 | |
| search.pxd | #15166, #17271 | |
| sorting.pxd | #15011 | |
| stream_compaction.pxd | #15011 | |
| strings/convert/convert_booleans.pxd | #16971 | |
| strings/convert/convert_datetime.pxd | #16971 | |
| strings/convert/convert_durations.pxd | #16982 | |
| strings/convert/convert_fixed_point.pxd | #16984 | |
| strings/convert/convert_floats.pxd | #16990 | |
| strings/convert/convert_integers.pxd | #16991, #17270 | |
| strings/convert/convert_ipv4.pxd | #16994 | |
| strings/convert/convert_lists.pxd | #16997 | |
| strings/convert/convert_urls.pxd | #17003 | |
| strings/split/partition.pxd | #16940 | |
| strings/split/split.pxd | #16940 | |
| strings/attributes.pxd | #16785 | |
| strings/capitalize.pxd | #15503 | |
| strings/case.pxd | #15489 | |
| strings/char_types.pxd | #16788 | |
| strings/combine.pxd | #16790 | |
| strings/contains.pxd | #16814 | |
| strings/extract.pxd | #16823 | |
| strings/find.pxd | #15604 | |
| strings/find_multiple.pxd | #16920 | |
| strings/findall.pxd | #16825 | |
| strings/json.pxd | #17025 | |
| strings/padding.pxd | #16833 | |
| strings/regex_flags.pxd | #15880 | |
| strings/regex_program.pxd | #15880 | |
| strings/repeat.pxd | #16834 | |
| strings/replace.pxd | #15839 | |
| strings/replace_re.pxd | #17023 | |
| strings/side_type.pxd | #16833 | |
| strings/strip.pxd | #16833 | |
| strings/substring.pxd | #15988 | |
| strings/translate.pxd | #16934 | |
| strings/wrap.pxd | #16935 | |
| strings_udf.pxd | #17107 | We decided not to include strings_udf in pylibcudf because (if we did) the C++ we'd be creating bindings for in https://github.com/rapidsai/cudf/tree/branch-24.12/python/cudf/udf_cpp is not a part of the libcudf API. |
| table/table.pxd | #13562 | Tables share column ownership in pylibcudf |
| table/table_view.pxd | #13562 | Tables share column ownership in pylibcudf |
| transform.pxd | #16760 | |
| transpose.pxd | #16749 | |
| types.pxd | #13562 | More types added as needed in other PRs |
| unary.pxd | #14850 | |
| utilities/span.pxd | #17021 | |
| wrappers/decimals.pxd | #17048 | |
| wrappers/durations.pxd | #17010 | |
| wrappers/timestamps.pxd | #17010 | |
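
To make the grouping concrete, here is a minimal sketch of how a tracked pxd path could be turned into a dotted module name. It assumes the pxd layout maps one-to-one onto pylibcudf module names, as it does for `strings/capitalize.pxd` and `pylibcudf.strings.capitalize`; since the correspondence with libcudf namespaces is only rough, this may not hold for every row. The helper name `pxd_to_module` is hypothetical and not part of pylibcudf.

```python
from pathlib import PurePosixPath


def pxd_to_module(pxd_path: str, package: str = "pylibcudf") -> str:
    """Map a tracked pxd path to a dotted module name.

    Hypothetical helper for illustration only: it assumes each pxd file
    corresponds to exactly one module under `package`.
    """
    path = PurePosixPath(pxd_path)
    if path.suffix != ".pxd":
        raise ValueError(f"expected a .pxd path, got {pxd_path!r}")
    # Drop the extension and join the path components with dots.
    return ".".join((package, *path.with_suffix("").parts))


print(pxd_to_module("strings/capitalize.pxd"))
print(pxd_to_module("strings/convert/convert_booleans.pxd"))
```

Under that assumption, the two calls print `pylibcudf.strings.capitalize` and `pylibcudf.strings.convert.convert_booleans`.
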
vyasr added the feature request label on Feb 27, 2024
GregoryKimball moved this to Story Issue in libcudf on Feb 28, 2024
rapids-bot bot pushed a commit that referenced this issue Mar 2, 2024
Contributes to #15162

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #15166
rapids-bot bot pushed a commit that referenced this issue Mar 18, 2024
This PR also introduces `std::out_of_range` to cudf's code base in cases where it is appropriate.

Contributes to #12885 
Resolves #15315 
Contributes to #15162

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #15319
rapids-bot bot pushed a commit that referenced this issue Apr 11, 2024
This PR creates `pylibcudf` `case` APIs and migrates the cuDF cython to leverage them. Part of #15162.

Authors:
  - https://github.com/brandon-b-miller
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15489
rapids-bot bot pushed a commit that referenced this issue May 24, 2024
vyasr added the pylibcudf label on May 28, 2024
rapids-bot bot pushed a commit that referenced this issue May 29, 2024
This PR creates the `pylibcudf.strings.capitalize` namespace and migrates the cuDF cython to use it. Depends on #15489

Part of #15162

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15503
rapids-bot bot pushed a commit that referenced this issue May 31, 2024
rapids-bot bot pushed a commit that referenced this issue Jun 5, 2024
xref #15162

Change replace.pxd to use pylibcudf APIs.

Authors:
  - Thomas Li (https://github.com/lithomas1)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15839
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
This PR creates pylibcudf strings `contains` APIs and migrates the cuDF cython to leverage them. Part of #15162.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15880
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
xref #15162 

Starts migrating cudf I/O cython to use pylibcudf APIs, starting with avro.

Authors:
  - Thomas Li (https://github.com/lithomas1)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15899
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
xref #15162 

Migrate quantile.pxd to use pylibcudf APIs.

Authors:
  - Thomas Li (https://github.com/lithomas1)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15874
rapids-bot bot pushed a commit that referenced this issue Nov 7, 2024
Contributes to #15162

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #17232
rapids-bot bot pushed a commit that referenced this issue Nov 7, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 7, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 8, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 8, 2024
Part of #15162. Also adds tests for `pylibcudf.filling`.

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #17277
rapids-bot bot pushed a commit that referenced this issue Nov 8, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 16, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 21, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 26, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 4, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 6, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 6, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 6, 2024
…olumn.from_libcudf` (#17517)

Part of #15162. In a follow-up PR, we'll deprecate the cudf Python column APIs and others that are used outside cudf.

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #17517
vyasr commented Dec 19, 2024

Closing as complete. pylibcudf has basically all of the algorithms that cudf, cudf-polars, and cudf.pandas need now. The remaining work will be at a higher level of pylibcudf to ensure it can be used directly.

vyasr closed this as completed on Dec 19, 2024
github-project-automation moved this from In Progress to Done in cuDF Python on Dec 19, 2024