Azure Data Lake connection will not work for blob.core.windows.net domain #44228
Labels
area:providers
kind:bug
needs-triage
provider:microsoft-azure
Apache Airflow Provider(s)
microsoft-azure
Versions of Apache Airflow Providers
apache-airflow-providers-microsoft-azure 11.1.0
Apache Airflow version
2.9.2
Operating System
Ubuntu 22.04.4
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
Scenario: need to leverage Azure storage for Airflow remote logging.
Step 1 is verifying the connection works, so I'm using the operator ADLSListOperator as a test case.
On the connection I have set the following properties:
Azure Client ID:
Azure Client Secret:
Azure Tenant ID:
Azure DataLake Store Name: <e.g. mystorageaccount>
The store name's fully qualified url is https://mystorageaccount.blob.core.windows.net/
I know the client ID, secret, and tenant ID are all valid: they match the credentials that work against the storage account when I use the Python operator with the azure.storage.blob library. However, using the ADLS connection with ADLSListOperator from apache-airflow-providers-microsoft-azure (11.1.0) fails. The error log indicates it is trying to connect to the wrong domain, e.g. ConnectionError(MaxRetryError("HTTPSConnectionPool(host='none.azuredatalakestore.net'
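For comparison, here is a minimal sketch of a direct azure.storage.blob check of the kind described above. It is not the exact code in use: the helper names and the container argument are placeholders, and it assumes the azure-identity and azure-storage-blob packages are installed.

```python
def blob_account_url(account_name: str) -> str:
    """Build the blob endpoint URL for a storage account name."""
    return f"https://{account_name}.blob.core.windows.net/"


def list_blob_names(account_name, container, tenant_id, client_id, client_secret):
    """List blob names in one container using a service principal.

    Hypothetical helper for illustration; the container name is a placeholder.
    """
    # Imported lazily so blob_account_url() works without the Azure SDK installed.
    from azure.identity import ClientSecretCredential
    from azure.storage.blob import BlobServiceClient

    credential = ClientSecretCredential(
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret,
    )
    service = BlobServiceClient(
        account_url=blob_account_url(account_name),
        credential=credential,
    )
    container_client = service.get_container_client(container)
    return [blob.name for blob in container_client.list_blobs()]
```

The point of the comparison is that the same three credential values plus the account name are enough here, and the host is derived from blob.core.windows.net rather than azuredatalakestore.net.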
The domain azuredatalakestore.net is for legacy Azure storage accounts (Azure Data Lake Storage Gen1). New storage accounts cannot use this domain. All future storage accounts use blob.core.windows.net.
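The host in the error can be reproduced with plain string formatting. This is only a guess at the mechanism, not the provider's actual code: if the store name never reaches the client and ends up as None, a legacy-style host template would produce exactly none.azuredatalakestore.net. The helper and constant names below are hypothetical.

```python
from typing import Optional

LEGACY_ADLS_SUFFIX = "azuredatalakestore.net"  # domain seen in the error log
BLOB_SUFFIX = "blob.core.windows.net"          # domain of the storage account in this report


def build_host(store_name: Optional[str], suffix: str) -> str:
    """Hypothetical helper: join a store name and a domain suffix into a host."""
    # Formatting None gives "None", so a missing store name still yields a host string.
    return f"{store_name}.{suffix}".lower()


if __name__ == "__main__":
    print(build_host(None, LEGACY_ADLS_SUFFIX))         # none.azuredatalakestore.net
    print(build_host("mystorageaccount", BLOB_SUFFIX))  # mystorageaccount.blob.core.windows.net
```

If that guess is right, there may be two separate problems: the store name set on the connection is not being picked up, and the domain suffix is the legacy one.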
If anyone has successfully used the operator ADLSListOperator against a storage account hosted at blob.core.windows.net, I'd be curious to know the configuration used. The documentation and examples I've found are very sparse or inconsistent.
I've tried the connection type azure_data_lake (as described above) as well as the types adls and wasb.
What you think should happen instead
I would expect ADLSListOperator to list files, but it times out. I assume this is because it is trying to connect to the wrong domain.
How to reproduce
Anything else
Always. Hasn't worked successfully yet.
Are you willing to submit PR?
Code of Conduct