Improvements on data source and data sink #17764

Draft: kingcrimsontianyu wants to merge 5 commits into base branch-25.02

kingcrimsontianyu
Contributor

@kingcrimsontianyu kingcrimsontianyu commented Jan 17, 2025

Description

This PR is slated for 25.04.
This PR is WIP.

This PR makes the following I/O improvements:

  • Remove legacy cuFile integration to simplify code maintenance. Use KvikIO to manage the GDS setting and compatibility mode.
  • Remove file utility classes and functions. Use KvikIO for all file-related operations.
  • Replace the in-house implementations of host_read (for file_source) and host_write (for file_sink) with KvikIO's parallel counterparts.

Closes #16418

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@kingcrimsontianyu kingcrimsontianyu added the libcudf (Affects libcudf (C++/CUDA) code), improvement (Improvement / enhancement to an existing function), and breaking (Breaking change) labels Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu self-assigned this Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu requested review from a team as code owners January 17, 2025 20:09
@github-actions github-actions bot added the CMake (CMake build issue) label Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu marked this pull request as draft January 17, 2025 20:09

copy-pr-bot bot commented Jan 17, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu changed the title from "Remove legacy cuFile integration" to "I/O improvements" Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu changed the title from "I/O improvements" to "Improvements on data source and data sink" Jan 18, 2025
@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu force-pushed the remove-legacy-cufile-integration branch from 1d2ed1e to 2f045b4 January 18, 2025 13:35
@kingcrimsontianyu
Contributor Author

/ok to test

Labels
  • breaking (Breaking change)
  • CMake (CMake build issue)
  • improvement (Improvement / enhancement to an existing function)
  • libcudf (Affects libcudf (C++/CUDA) code)
Development

Successfully merging this pull request may close these issues.

Remove the cuFile (GDS) backend
1 participant