Unable to Authenticate Using Managed Identity #373

Open
giantonius opened this issue Apr 11, 2024 · 3 comments

@giantonius

Describe the bug
I tried to follow the example in https://github.com/Azure/azure-kusto-spark/blob/master/docs/Authentication.md#managed-identity-authentication to authenticate using managed identity, but ran into multiple issues. I am running the code in an Azure Databricks environment and have installed the package (com.microsoft.azure.kusto:kusto-spark_3.0_2.12:5.0.6).

To Reproduce
Steps to reproduce the behavior:

NameError: name 'KustoSinkOptions' is not defined when running the below code snippet:

df.write.format("com.microsoft.kusto.spark.datasource")
.option(KustoSinkOptions.KUSTO_CLUSTER, "baseplatform.westus")
.option(KustoSinkOptions.KUSTO_DATABASE, "WHEA")
.option(KustoSinkOptions.KUSTO_TABLE, "TestManagedId")
.option(KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH, true.toString)
.option(KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID, "xxxx")
.mode(SaveMode.Append)
.save()

java.security.InvalidParameterException: KUSTO_DATABASE parameter is missing. Must provide a destination database name when running the below code snippets:

df.write.format("com.microsoft.kusto.spark.datasource")
.option("KustoSinkOptions.KUSTO_CLUSTER", "baseplatform.westus")
.option("KustoSinkOptions.KUSTO_DATABASE", "WHEA")
.option("KustoSinkOptions.KUST_TABLE", "TestManagedId")
.option("KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH", True)
.option(KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID, "xxxx")
.mode("Append")
.save()

df.write.format("com.microsoft.kusto.spark.datasource")
.option("KUSTO_CLUSTER", "baseplatform.westus")
.option("KUSTO_DATABASE", "WHEA")
.option("KUSTO_TABLE", "TestManagedId")
.option("KUSTO_MANAGED_IDENTITY_AUTH", True)
.option("KUSTO_MANAGED_CLIENT_ID", "xxxx")
.mode("Append")
.save()

@ag-ramachandran
Contributor

Hi @giantonius, while we check the code options for any bugs, you may want to check whether ADB (Azure Databricks) supports propagation of managed identity.

As for ADB, I will check how we can test with https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/azure-mi and see if there is something that needs fixing.

@ag-ramachandran
Contributor

ag-ramachandran commented Apr 15, 2024

Hello @giantonius

Sorry, I did not look at it closely enough earlier. Both code snippets have minor mistakes.

import com.microsoft.kusto.spark.datasink.KustoSinkOptions // you have to import manually

df.write.format("com.microsoft.kusto.spark.datasource")
.option(KustoSinkOptions.KUSTO_CLUSTER, "baseplatform.westus")
.option(KustoSinkOptions.KUSTO_DATABASE, "WHEA")
.option(KustoSinkOptions.KUSTO_TABLE, "TestManagedId")
.option(KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH, true.toString)
.option(KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID, "xxxx")
.mode(SaveMode.Append)
.save()

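The following is wrong, because the constant names are enclosed in quotes and are therefore passed as literal strings instead of being resolved to the option keys: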

df.write.format("com.microsoft.kusto.spark.datasource")
.option("KustoSinkOptions.KUSTO_CLUSTER", "baseplatform.westus")
.option("KustoSinkOptions.KUSTO_DATABASE", "WHEA")
.option("KustoSinkOptions.KUST_TABLE", "TestManagedId")
.option("KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH", True)
.option(KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID, "xxxx")
.mode("Append")
.save()
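
To illustrate why quoted constant names lead to the "KUSTO_DATABASE parameter is missing" error, here is a minimal sketch in Python. It is a toy model of a key lookup, not the connector's actual validation code: `REQUIRED_KEYS` and `find_missing_required` are hypothetical names, and the key strings are the literals that appear elsewhere in this thread.

```python
# Toy model of an option-key lookup; NOT the connector's real validation code.
# The key strings are the literals used elsewhere in this thread.
REQUIRED_KEYS = ("kustoCluster", "kustoDatabase", "kustoTable")


def find_missing_required(options):
    """Return the required keys that are absent from the options dict."""
    return [key for key in REQUIRED_KEYS if key not in options]


# Quoted constant names are sent verbatim, so none of the required keys match.
quoted_names = {
    "KustoSinkOptions.KUSTO_CLUSTER": "baseplatform.westus",
    "KustoSinkOptions.KUSTO_DATABASE": "WHEA",
    "KustoSinkOptions.KUST_TABLE": "TestManagedId",
}

# The literal keys are what this toy lookup expects.
literal_keys = {
    "kustoCluster": "baseplatform.westus",
    "kustoDatabase": "WHEA",
    "kustoTable": "TestManagedId",
}

print(find_missing_required(quoted_names))   # all three keys reported missing
print(find_missing_required(literal_keys))   # nothing missing
```

The same reasoning would apply to keys such as "KUSTO_DATABASE": any string that is not the exact literal key is simply an unknown option.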

If you want to use literals as in option 3, then the option keys have to be changed as shown here. These come from the KustoOptions/KustoSinkOptions classes in this repo:

df.write.format("com.microsoft.kusto.spark.datasource")
.option("kustoCluster", "baseplatform.westus")
.option("kustoDatabase", "WHEA")
.option("kustoTable", "TestManagedId")
.option("managedIdentityAuth", True)
.option("managedIdentityClientId", "xxxx")
.mode("Append")
.save()

Please use (1) or (3); it should go through. Note that your ADB has to support ManagedIdentity, that is outside the scope of this connector.
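
For a Python notebook, the literal-key variant could be wrapped roughly as follows. This is a sketch under assumptions, not an official API of the connector: `build_kusto_sink_options` and `configure_writer` are hypothetical helper names, and the only calls made on the writer are `format`, `options`, and `mode`, which PySpark's `DataFrameWriter` provides.

```python
KUSTO_FORMAT = "com.microsoft.kusto.spark.datasource"


def build_kusto_sink_options(cluster, database, table, client_id):
    """Collect the literal option keys from this thread into one dict."""
    return {
        "kustoCluster": cluster,
        "kustoDatabase": database,
        "kustoTable": table,
        # Passed as the string "true", mirroring true.toString in the Scala snippet.
        "managedIdentityAuth": "true",
        "managedIdentityClientId": client_id,
    }


def configure_writer(writer, options):
    """Apply format, options, and append mode to a DataFrameWriter-like object."""
    return writer.format(KUSTO_FORMAT).options(**options).mode("append")


opts = build_kusto_sink_options("baseplatform.westus", "WHEA", "TestManagedId", "xxxx")
# In a notebook with a DataFrame named df (assumed), the write would then be:
# configure_writer(df.write, opts).save()
```

Keeping the keys in one dict makes it easy to print and inspect exactly which strings are sent to the connector before calling `save()`.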

@giantonius
Author

giantonius commented Apr 15, 2024

Hi @ag-ramachandran, I tried options 1 and 3 and got these errors:

Option 1 error: ModuleNotFoundError: No module named 'com.microsoft'

import com.microsoft.kusto.spark.datasink.KustoSinkOptions

df.write.format("com.microsoft.kusto.spark.datasource") \
    .option(KustoSinkOptions.KUSTO_CLUSTER, "baseplatform.westus") \
    .option(KustoSinkOptions.KUSTO_DATABASE, "WHEA") \
    .option(KustoSinkOptions.KUSTO_TABLE, "TestManagedId") \
    .option(KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH, true.toString) \
    .option(KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID, "xxxx") \
    .mode(SaveMode.Append) \
    .save()

I have installed com.microsoft.azure.kusto:kusto-spark_3.0_2.12:5.0.6 on my Spark cluster on Azure Databricks.
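
The ModuleNotFoundError is a Python exception, which suggests the Scala-style `import com.microsoft...` line was run in a Python cell, where it is treated as a Python module import. One possible workaround, sketched below, is a Python-side stand-in for the constants. The class is hypothetical and is not shipped by the connector; the mapping from constant names to literal keys is inferred from the snippets in this thread and should be checked against the KustoSinkOptions source. `to_option_value` is likewise a hypothetical helper that replaces Scala's `true.toString`.

```python
class KustoSinkOptions:
    """Hypothetical Python stand-in for the Scala object of the same name.

    The name-to-key mapping is inferred from this thread, not read from the library.
    """

    KUSTO_CLUSTER = "kustoCluster"
    KUSTO_DATABASE = "kustoDatabase"
    KUSTO_TABLE = "kustoTable"
    KUSTO_MANAGED_IDENTITY_AUTH = "managedIdentityAuth"
    KUSTO_MANAGED_CLIENT_ID = "managedIdentityClientId"


def to_option_value(value):
    """Render a Python value as an option string; booleans become 'true'/'false'."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


options = {
    KustoSinkOptions.KUSTO_CLUSTER: "baseplatform.westus",
    KustoSinkOptions.KUSTO_DATABASE: "WHEA",
    KustoSinkOptions.KUSTO_TABLE: "TestManagedId",
    KustoSinkOptions.KUSTO_MANAGED_IDENTITY_AUTH: to_option_value(True),
    KustoSinkOptions.KUSTO_MANAGED_CLIENT_ID: "xxxx",
}
```

In Python, `SaveMode.Append` has no direct equivalent either; the string form `.mode("append")` is used instead.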

Option 3 error: IllegalArgumentException: scopes is null or empty

df.write.format("com.microsoft.kusto.spark.datasource") \
    .option("kustoCluster", "baseplatform.westus") \
    .option("kustoDatabase", "WHEA") \
    .option("kustoTable", "TestManagedId") \
    .option("managedIdentityAuth", True) \
    .option("managedIdentityClientId", "xxxx") \
    .mode("Append") \
    .save()

Additionally, can you clarify "your ADB has to support ManagedIdentity, that is outside the scope of this connector"? Do you have additional resources to set up managed identity support in ADB?
