Copy blobs with AD auth #25478

Closed
tamirkamara opened this issue Jul 31, 2022 · 4 comments
Labels
Client This issue points to a problem in the data-plane of the library. Storage Storage Service (Queues, Blobs, Files)

tamirkamara commented Jul 31, 2022

  • Package Name: azure.storage.blob
  • Package Version: 12.12.0
  • Operating System: Linux
  • Python Version: 3.8

Describe the bug
Copying blobs between accounts doesn't work with AD Auth - returns error: CannotVerifyCopySource.
I tried other tools like azcopy and saw that this same action works correctly there (without any download/reupload or accessing the account keys to generate SAS).

To Reproduce
Steps to reproduce the behavior:

  1. Create 2 storage accounts and upload a file to the first account (via the Azure portal).
  2. Give your identity/service principal Storage Blob Data Contributor on both accounts
  3. Then run this code:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()

container_name = "abc"
blob_name = "myfile_1M"
source_account_url = "https://somesource.blob.core.windows.net/"
destination_account_url = "https://somedestination.blob.core.windows.net/"

source_blob_service_client = BlobServiceClient(account_url=source_account_url, credential=credential)
source_container_client = source_blob_service_client.get_container_client(container_name)
source_blob = source_container_client.get_blob_client(blob_name)
source_blob_url = source_blob.url

destination_blob_service_client = BlobServiceClient(account_url=destination_account_url, credential=credential)
destination_blob = destination_blob_service_client.get_blob_client(container_name, source_blob.blob_name)
copy = destination_blob.start_copy_from_url(source_blob_url)

print("done")
  4. Get a CannotVerifyCopySource error.

Expected behavior
The copy operation should be submitted successfully (just as if I was using SAS)

jalauzon-msft (Member) commented Aug 2, 2022

Hi @tamirkamara Tamir, thanks for reaching out.

When doing copy operations using AAD with the Python SDK, you need to generate the OAuth token yourself and pass it to the source_authorization keyword argument. Here is an example based on your code:

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()

...

# Requesting a token to the Storage resource
token = "Bearer {}".format(credential.get_token("https://storage.azure.com/.default").token)
# Note: OAuth copy is only supported currently for sync copy so you need requires_sync=True
copy = destination_blob.start_copy_from_url(source_blob_url, source_authorization=token, requires_sync=True)

Hopefully this helps resolve your issue but please let me know if it does not. Thanks.
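
For reference, the token step above can be wrapped in a small helper. This is only a sketch: `build_source_authorization` and `STORAGE_SCOPE` are names made up here for illustration, and it assumes the credential exposes `get_token(scope)` returning an object with a `.token` attribute, as `DefaultAzureCredential` does.

```python
# Scope for the Azure Storage resource, as used in the sample above.
STORAGE_SCOPE = "https://storage.azure.com/.default"


def build_source_authorization(credential, scope=STORAGE_SCOPE):
    """Return a 'Bearer <token>' string for the source_authorization keyword.

    `credential` is any object with a get_token(scope) method whose result
    has a .token attribute (for example, DefaultAzureCredential).
    """
    access_token = credential.get_token(scope)
    return "Bearer {}".format(access_token.token)
```

With the clients from the original sample, the call would then look like `destination_blob.start_copy_from_url(source_blob_url, source_authorization=build_source_authorization(credential), requires_sync=True)`.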

tamirkamara (Author) commented

Thanks @jalauzon-msft.
The requires_sync requirement is a downside, since I deal with large files and I'm not sure the caller can wait...
Are there any plans or ongoing work to remove that requirement?

jalauzon-msft (Member) commented

Hi @tamirkamara Tamir, unfortunately this is a service limitation. For some reason, only the Copy Blob From URL operation supports AAD auth, while Copy Blob does not. This is roughly what the requires_sync parameter controls.

I think this may actually be a more severe limitation for you, since Copy Blob From URL only supports blobs up to 256 MiB. If you are trying to copy a larger blob, there does not currently seem to be a way to do it with AAD auth using standard copy operations.

AzCopy uses a different mechanism for copying blobs: it issues a series of Put Block From URL calls followed by a final Put Block List. Put Block From URL supports AAD auth (via the same source_authorization mechanism), which is why this works for AzCopy. We don't currently offer this form of copy in the SDK, but it would be possible to implement it yourself. We've also been considering adding something like this form of copy in Python, but have no ETA on when that would be available.
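
A rough sketch of what such a do-it-yourself copy might look like follows. It is not what AzCopy or the SDK actually does internally. The helper names `block_ranges` and `copy_blob_in_blocks` and the 64 MiB block size are assumptions made for illustration. It relies on `BlobClient.stage_block_from_url` (Put Block From URL) accepting the same `source_authorization` keyword, and on `BlobClient.commit_block_list` (Put Block List). It also assumes that calling `get_token` once per block is cheap because the credential caches tokens, and it does no retrying or parallelism.

```python
def block_ranges(total_size, block_size):
    """Yield (offset, length) pairs that together cover total_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        yield offset, length
        offset += length


def copy_blob_in_blocks(source_blob, destination_blob, credential,
                        block_size=64 * 1024 * 1024):
    """Copy source_blob to destination_blob block by block using AAD auth.

    source_blob and destination_blob are BlobClient instances; credential is
    e.g. DefaultAzureCredential. block_size is an assumed, conservative value.
    """
    # Imported here so block_ranges can be used without the SDK installed.
    from azure.storage.blob import BlobBlock

    size = source_blob.get_blob_properties().size
    block_list = []
    for index, (offset, length) in enumerate(block_ranges(size, block_size)):
        # Request a token per block so a long copy does not run on an expired one.
        token = credential.get_token("https://storage.azure.com/.default").token
        # Block IDs must all have the same length within a blob.
        block_id = "{:08d}".format(index)
        destination_blob.stage_block_from_url(
            block_id,
            source_blob.url,
            source_offset=offset,
            source_length=length,
            source_authorization="Bearer {}".format(token),
        )
        block_list.append(BlobBlock(block_id=block_id))
    # Commit the staged blocks in order to materialize the destination blob.
    destination_blob.commit_block_list(block_list)
```

With the clients from the original sample, this would be called as `copy_blob_in_blocks(source_blob, destination_blob, credential)`. The caller still waits for the whole copy, but each individual request stays small.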


Hi @tamirkamara, we deeply appreciate your input into this project. Regrettably, this issue has remained unresolved for over 2 years and inactive for 30 days, leading us to the decision to close it. We've implemented this policy to maintain the relevance of our issue queue and facilitate easier navigation for new contributors. If you still believe this topic requires attention, please feel free to create a new issue, referencing this one. Thank you for your understanding and ongoing support.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 31, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 31, 2024