[C++][FS][Azure] Implement Move() for flat namespace storage accounts #40025
Comments
cc @felipecrv - you mentioned that you planned to work on this. Assuming #40021 is completed first, then this ticket should include enabling the python side tests for move. I will add |
If you can pass the tests on a real HNS storage account, it should be the same test code running against a different environment. The semantics should be the same. I don't plan to add new test cases in |
That all makes sense, but I think we are talking at cross purposes. I'm not proposing we add new tests. The python tests are generic across all filesystems, and #40021 just enables them to run against azurite. However, it currently disables a couple of tests because move is not supported on azurite. I don't think it's worth running the python tests against real blob storage. My point is that the disabled tests should be re-enabled when we add support for move on flat namespace |
I would be grateful if you ran these tests manually against a real storage account with HNS support to catch any issues with the tests themselves. If they pass on the HNS implementation with |
Turns out this was definitely a good idea. I was expecting everything to pass, but there were actually a lot of failures.
Most of them are "failed to create directory" errors, which I think boil down to |
After sorting out the obvious ones (#40052), the move-related failures are:
FAILED pyarrow/tests/test_fs.py::test_copy_file[AzureFileSystem]
FAILED pyarrow/tests/test_fs.py::test_move_directory[AzureFileSystem]
FAILED pyarrow/tests/test_fs.py::test_move_file[AzureFileSystem]
|
…ystem` (#40021)

### Rationale for this change

We want to use the new `AzureFileSystem` in `pyarrow`.

### What changes are included in this PR?

- Add minimal python bindings for `AzureFileSystem`. This includes just enough to run the python tests against azurite, plus default credential auth to enable real use of this once this PR merges.
- Adding additional configuration options and the remaining authentication options can be done as a follow-up.
- I tried to copy the existing pybinds for GCS and S3.
- Explicitly set `ARROW_AZURE=OFF` rather than relying on defaults. The defaults are different for builds vs tests, so this was causing tests to be enabled while Azure was disabled during the build.

### Are these changes tested?

Enabled the python filesystem tests for the new filesystem. I had to skip azure in a couple of the tests, though, because they are not yet working on the C++ side. I created GitHub issues to resolve these (#40025 and #40026) and added TODO comments, where relevant, that reference these GitHub issues.

### Are there any user-facing changes?

`pyarrow` users can now use the native `AzureFileSystem` to get much better reliability and performance compared to `adlfs` based options.

* Closes: #39968
* GitHub Issue: #39968

Lead-authored-by: Thomas Newton <thomas.w.newton@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Describe the enhancement requested
#38704 implemented `Move` on hierarchical namespace storage accounts, but we still need to implement it for flat namespace storage accounts. Supporting both file and directory moves would be nice, but to achieve parity with the GCS and S3 filesystems we only need to support file moves.
This is a child of #18014.
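For illustration only: the issue does not say how `Move` would be implemented on a flat namespace. One common way to emulate a file move on object stores that have no rename operation is copy-then-delete. The sketch below models that idea against a toy in-memory store; `FlatBlobStore` and `move_file` are hypothetical names invented for this example and are not part of Arrow or the Azure SDK.

```python
class FlatBlobStore:
    """Toy in-memory stand-in for a flat-namespace container (hypothetical)."""

    def __init__(self):
        # Full blob name -> bytes. "/" is just a character in the name;
        # there are no real directories in a flat namespace.
        self._blobs = {}

    def put(self, name, data):
        self._blobs[name] = data

    def get(self, name):
        return self._blobs[name]

    def exists(self, name):
        return name in self._blobs

    def copy(self, src, dest):
        if src not in self._blobs:
            raise FileNotFoundError(src)
        self._blobs[dest] = self._blobs[src]

    def delete(self, name):
        del self._blobs[name]


def move_file(store, src, dest):
    """Emulate a file move as copy-then-delete (assumed approach, not atomic)."""
    if not store.exists(src):
        raise FileNotFoundError(src)
    if src == dest:
        # Moving a blob onto itself is a no-op; deleting would lose the data.
        return
    store.copy(src, dest)
    store.delete(src)


store = FlatBlobStore()
store.put("container/a.txt", b"data")
move_file(store, "container/a.txt", "container/b.txt")
```

Unlike a rename on a hierarchical namespace account, this two-step emulation is not atomic: a failure between the copy and the delete leaves both blobs in place.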
Component(s)
C++