Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][FS][Azure] SAS token authentication #44308

Closed
Tom-Newton opened this issue Oct 4, 2024 · 6 comments
Closed

[C++][FS][Azure] SAS token authentication #44308

Tom-Newton opened this issue Oct 4, 2024 · 6 comments

Comments

@Tom-Newton
Copy link
Contributor

Describe the enhancement requested

Child of #38598

Add support for Azure CLI auth. Probably just accept the SAS token as an argument and use AzureSasCredential https://github.com/Azure/azure-sdk-for-cpp/blob/101f20f2bbf3dd5f6438565cd9f709a231317f77/sdk/tables/azure-data-tables/inc/azure/data/tables/credentials/azure_sas_credential.hpp#L16C9-L16C27. This should make the implementation very similar to all the other Azure auths.

One possible complication I'm aware of is in CopyFile, because this is implemented by generating a federated SAS token from whatever the original authentication was. I don't know if its possible to get a federated SAS token when using SAS token auth.

Also I'm not sure we need to generate a federated SAS token to implement CopyFile. I think we should be able to use Copy Blob instead of Copy Blob From URL. I think the latter is only needed if the source is not in the same Azure blob storage account.

Component(s)

C++

@Tom-Newton
Copy link
Contributor Author

take

@Tom-Newton
Copy link
Contributor Author

Tom-Newton commented Dec 12, 2024

I have it working to some extent, but as I feared there is a complexity with CopyFile because its implemented by creating a SAS token at runtime #41276. When authenticated with a SAS token already we cannot generate a new SAS token in the current way (I tested that it does work with Azure CLI auth so the problem is not just broken code that we still need to test #41315).

There are 2 options:

  1. Stop using SAS token inside CopyFile
    1. With azcopy a SAS token is only needed when doing a copy from one Azure storage account to another Azure storage account. Within a single storage account we should be able to use the Copy Blob API instead of Copy Blob From URL but I'm not sure if this is actually exposed by the C++ SDK.
    2. I'm a bit concerned by the current implementation that generates a new SAS token for every blob. Azure tends to have tight rate limits on APIs to generate authentication tokens, so I suspect a call to CopyFiles on a large directory would have problems.
    3. We can close [C++][FS][Azure] Test CopyFile() with non account key credential #41315
  2. Use the SAS token we already have
    1. Definitely possible.

@kou
Copy link
Member

kou commented Dec 13, 2024

The option 1 is better, right? Let's try the option 1.

@Tom-Newton
Copy link
Contributor Author

The option 1 is better, right? Let's try the option 1.

I thought so too but, after trying it, now I'm not so sure. I checked a few of the Azure SDKs (C++, Python, and golang) and it actually looks like none of them expose the Copy Blob API.

I found the SourceAuthorization option which should allow configuring different authentication to the source when using Copy Blob From URL. Plus some some examples of how to use it in azcopy. There also seem to be a string of changes in azcopy to try to get this working correctly, because of problems like making too many token requests or using a cached token after it expired. The current implementation adds a custom http policy when creating the Azure client, which intercepts requests that include an empty x-ms-copy-source-authorization and populate it https://github.com/Azure/azure-storage-azcopy/blob/7f5ab059c28b09841d009d28b7510d4fcb167dc8/ste/sourceAuthPolicy.go#L47-L77.

This is all looking rather complicated and I don't think we can generate the required bearer tokens with SAS or account key auth. So I think I'll stick with generating SAS tokens, and I think I will just bump their expiry a bit as a simple mitigation for the problem mentioned in azcopy where tokens expired during retries.

@Tom-Newton
Copy link
Contributor Author

Ok, I think I finally worked it out StartCopyFromUri in the Azure C++ SDK can do copies within the same storage account any extra authentication required.

kou pushed a commit that referenced this issue Dec 17, 2024
### Rationale for this change
SAS token auth is sometimes useful and it the last one we haven't implemented. 

### What changes are included in this PR?
- Implement `ConfigureSasCredential`
- Update `AzureOptions::FromUri` so that simply appending a SAS token to a blob storage URI works. e.g. `AzureOptions::FromUri("abfs://file_system@ account.dfs.core.windows.net/?se=2024-12-12T18:57:47Z&sig=pAs7qEBdI6sjUhqX1nrhNAKsTY%2B1SqLxPK%2BbAxLiopw%3D&sp=racwdxylti&spr=https,http&sr=c&sv=2024-08-04")`
  - SAS tokens are made up of a bunch of URI query parameters that I'm not sure we can exhaustively list.
  - Therefore we now assume that any unrecognised URI query parameters are assumed to be part of a SAS token, instead of returning an error status. 
- Update `CopyFile` to use StartCopyFromUri instead of CopyFromUri
  - This avoids the need to generate SAS tokens. 
  - Supports blobs bigger than 256MiB
  - This makes #41315 redundant
 
### Are these changes tested?
Yes
- Added new tests for authenticating with SAS and doing some operations including `CopyFile`
- Added new tests for `AzureOptions::FromUri` with a SAS token. 

I also made sure to run the tests which connect to real blob storage. 

### Are there any user-facing changes?
- SAS token in now supported
- Unrecognised URI query parameters are ignored by `AzureOptions::FromUri` instead of failing fast. IMO this is a regression but still the best option to support SAS token. 

* GitHub Issue: #44308

Authored-by: Thomas Newton <thomas.w.newton@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@kou kou added this to the 19.0.0 milestone Dec 17, 2024
@kou
Copy link
Member

kou commented Dec 17, 2024

Issue resolved by pull request 45021
#45021

@kou kou closed this as completed Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants