[C++][FS][Azure] Implement Move() #38704

Closed
kou opened this issue Nov 14, 2023 · 7 comments · Fixed by #39904

Comments

@kou
Member

kou commented Nov 14, 2023

Describe the enhancement requested

`Move()` is not implemented yet for the Azure filesystem.

This is a child of GH-18014.

Component(s)

C++

@Tom-Newton
Contributor

Tom-Newton commented Nov 20, 2023

There is a potential complication with move. Azure Blob Storage does support efficient server-side-only moves, but only if SAS token authentication is used for the source file. Alternatively, everything would need to be downloaded and re-uploaded, but that should work with any authentication.

@nosterlu

Maybe copy and then delete? That is at least how I have solved it previously with adlfs and Azure 😅

@Tom-Newton
Contributor

Tom-Newton commented Nov 23, 2023

Sorry, I don't think I was thinking very carefully when I posted that comment.

I think move is efficiently supported with hierarchical namespace. E.g. using https://github.com/Azure/azure-sdk-for-cpp/blob/4a32d7266cfac8bfc0eb87feb56011361a36f43c/sdk/storage/azure-storage-files-datalake/inc/azure/storage/files/datalake/datalake_file_system_client.hpp#L239

What I was thinking of was the copy blob by URI operation, which has to use SAS token auth. But if we're only copying within one storage account, we don't need to use that. There is another copy blob API we can use: https://learn.microsoft.com/en-us/rest/api/storageservices/copy-blob?tabs=microsoft-entra-id#authorization. So I don't think there is any auth complication. We can use copy followed by delete, as you say 🙂.
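
The copy-followed-by-delete approach can be sketched on a toy in-memory model of a flat blob namespace. This is only an illustration: `FlatStore`, `MoveBlobByCopyThenDelete`, and `MovePrefixByCopyThenDelete` are hypothetical names, the `std::map` stands in for a container, and no Azure SDK calls are used. It shows why moving a "directory" without hierarchical namespace takes one copy and one delete per blob, and is therefore not atomic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a flat blob namespace: blob name -> contents.
// NOT the Azure SDK; it only models the order of operations.
using FlatStore = std::map<std::string, std::string>;

// Moves one blob by copying it to `dest` and then deleting `src`.
// Returns false if `src` does not exist.
bool MoveBlobByCopyThenDelete(FlatStore& store, const std::string& src,
                              const std::string& dest) {
  auto it = store.find(src);
  if (it == store.end()) return false;
  if (src == dest) return true;  // nothing to do
  store[dest] = it->second;      // step 1: copy (modeled as an assignment)
  store.erase(src);              // step 2: delete the source
  return true;
}

// Moves every blob whose name starts with `src_dir + "/"` to the same
// relative name under `dest_dir`. Returns the number of blobs moved.
std::size_t MovePrefixByCopyThenDelete(FlatStore& store,
                                       const std::string& src_dir,
                                       const std::string& dest_dir) {
  assert(!src_dir.empty());
  const std::string prefix = src_dir + "/";
  // Collect the names first so the map is not modified while iterating it.
  std::vector<std::string> names;
  for (auto it = store.lower_bound(prefix);
       it != store.end() && it->first.compare(0, prefix.size(), prefix) == 0;
       ++it) {
    names.push_back(it->first);
  }
  for (const auto& name : names) {
    MoveBlobByCopyThenDelete(store, name,
                             dest_dir + "/" + name.substr(prefix.size()));
  }
  return names.size();
}
```

If the loop is interrupted partway through, some blobs are under the new prefix and some are still under the old one. With hierarchical namespace enabled, the same directory move would presumably be a single rename request instead.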

felipecrv self-assigned this Jan 9, 2024
@felipecrv
Contributor

@kou if you haven't started to work on this one I'm happy to take it.

@kou
Member Author

kou commented Jan 10, 2024

I haven't. Please do it!

@av8or1
Contributor

av8or1 commented Jan 30, 2024

felipecrv - Did you begin work on this one?

@felipecrv
Contributor

> felipecrv - Did you begin work on this one?

Yes. I'm very close to sending a PR.

felipecrv added a commit that referenced this issue Feb 10, 2024
…Storage Gen 2 API (#39904)

### Rationale for this change

We need to move directories and files via the `arrow::FileSystem` interface.

### What changes are included in this PR?

 - A few filesystem error reporting improvements
 - A helper class to deal with Azure Storage leases [1]
 - The `Move()` implementation that can move files and directories within the same container on storage accounts with Hierarchical Namespace Support enabled
 - Lots of tests

[1]: https://learn.microsoft.com/en-us/rest/api/storageservices/lease-blob
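
The same-container restriction described in the list above could be checked before any request is sent. The following is a minimal sketch, not code from the PR: it assumes paths of the form `container/path/in/container`, and `SplitContainer` and `IsSameContainerMove` are hypothetical helper names.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "container/path/in/container" at the first '/'.
// A path with no '/' is treated as a bare container name.
std::pair<std::string, std::string> SplitContainer(const std::string& full_path) {
  const auto slash = full_path.find('/');
  if (slash == std::string::npos) return {full_path, ""};
  return {full_path.substr(0, slash), full_path.substr(slash + 1)};
}

// Returns true when both paths name a location in the same, non-empty
// container, i.e. when the move stays within one container.
bool IsSameContainerMove(const std::string& src, const std::string& dest) {
  const std::string src_container = SplitContainer(src).first;
  return !src_container.empty() &&
         src_container == SplitContainer(dest).first;
}
```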

### Are these changes tested?

Yes, by existing tests and a large number of new tests added by this PR. The test code introduced here should be extracted to a reusable test module that we can use to test move in other file system implementations.

### Are there any user-facing changes?

No breaking changes, only new functionality.
* Closes: #38704

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
felipecrv added this to the 16.0.0 milestone Feb 10, 2024
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…aLake Storage Gen 2 API (apache#39904)

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this issue Feb 28, 2024
…aLake Storage Gen 2 API (apache#39904)

thisisnic pushed a commit to thisisnic/arrow that referenced this issue Mar 8, 2024
…aLake Storage Gen 2 API (apache#39904)
