feat: add aad authentication support for cognitive services #1778

Merged Jan 11, 2023 (15 commits)

Conversation

serena-ruan
Contributor

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Briefly describe the changes included in this Pull Request.

How is this patch tested?

  • I have written tests (not required for typo or doc fix) and confirmed the proposed feature/bug-fix/change works.

Does this PR change any dependencies?

  • No. You can skip this section.
  • Yes. Make sure the dependencies are resolved correctly, and list changes here.

Does this PR add a new feature? If so, have you added samples on website?

  • No. You can skip this section.
  • Yes. Make sure you have added samples following below steps.
  1. Find the corresponding markdown file for your new feature in website/docs/documentation folder.
    Make sure you choose the correct class estimators/transformers and namespace.
  2. Follow the pattern in markdown file and add another section for your new API, including pyspark, scala (and .NET potentially) samples.
  3. Make sure the DocTable points to correct API link.
  4. Navigate to website folder, and run yarn run start to make sure the website renders correctly.
  5. Don't forget to add <!--pytest-codeblocks:cont--> before each python code blocks to enable auto-tests for python samples.
  6. Make sure the WebsiteSamplesTests job pass in the pipeline.

@github-actions

Hey @serena-ruan 👋!
Thank you so much for contributing to our repository 🙌.
Someone from SynapseML Team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

To test your commit locally, please follow our guide on building from source.
Check out the developer guide for additional guidance on testing your change.

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter

codecov-commenter commented Dec 26, 2022

Codecov Report

Merging #1778 (9636e06) into master (dd1563f) will decrease coverage by 10.00%.
The diff coverage is 80.64%.

@@             Coverage Diff             @@
##           master    #1778       +/-   ##
===========================================
- Coverage   86.08%   76.08%   -10.01%     
===========================================
  Files         278      278               
  Lines       14722    14750       +28     
  Branches      767      763        -4     
===========================================
- Hits        12674    11222     -1452     
- Misses       2048     3528     +1480     
Impacted Files Coverage Δ
...oft/azure/synapse/ml/cognitive/openai/OpenAI.scala 78.26% <0.00%> (-0.49%) ⬇️
...zure/synapse/ml/cognitive/search/AzureSearch.scala 9.28% <0.00%> (-78.49%) ⬇️
...rosoft/azure/synapse/ml/geospatial/Geocoders.scala 18.00% <0.00%> (-77.84%) ⬇️
...microsoft/azure/synapse/ml/codegen/Wrappable.scala 99.27% <ø> (-0.73%) ⬇️
...re/synapse/ml/cognitive/CognitiveServiceBase.scala 82.08% <92.59%> (+1.54%) ⬆️
...t/azure/synapse/ml/automl/DefaultHyperparams.scala 0.00% <0.00%> (-100.00%) ⬇️
...azure/synapse/ml/vw/featurizer/MapFeaturizer.scala 0.00% <0.00%> (-100.00%) ⬇️
...azure/synapse/ml/vw/featurizer/SeqFeaturizer.scala 0.00% <0.00%> (-100.00%) ⬇️
...re/synapse/ml/vw/featurizer/StringFeaturizer.scala 0.00% <0.00%> (-100.00%) ⬇️
...napse/ml/vw/featurizer/StringSplitFeaturizer.scala 0.00% <0.00%> (-100.00%) ⬇️
... and 70 more


@@ -132,14 +132,78 @@ trait HasSubscriptionKey extends HasServiceParams {

def getSubscriptionKey: String = getScalarParam(subscriptionKey)

- def setSubscriptionKey(v: String): this.type = setScalarParam(subscriptionKey, v)
+ def setSubscriptionKey(v: String): this.type = {
+ println("WARNING: Please use setSubscriptionKey together with setLocation for authentication")
Collaborator

So we will always emit "WARNING" if they set this? No other way to give this information?

Contributor Author

Do you have a better idea? Most users are confused about which methods should be used together. For example, many of them want to call setUrl, but that's not what we want, so we should find an elegant way to warn them.

Collaborator

You know the code better than me. :) As a user, I get annoyed by WARNINGs in my logs even when I'm using the API correctly.

My only thoughts:

  1. add this info to doc strings? Or are the HasX traits too generic to do that? Can you override doc strings or params if you are specifying a required combination of params needed?
  2. validate at transform() time and throw if a proper combination is not set? Is that possible? As a user, this would be useful to me, just to see an error telling me what I did wrong if I didn't set it correctly.
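
The second suggestion (validate at transform() time and throw on a bad combination) could look roughly like the sketch below. It is a minimal, self-contained illustration and not SynapseML code: `ServiceParams`, `ParamValidation`, and the rule "a subscription key needs either a location or a URL" are all assumptions made for the example, and, as noted later in the thread, the real services differ in which setters they expect.

```scala
// Hypothetical, simplified model of the parameters discussed above.
// These names are illustrative only and are not SynapseML's actual API.
final case class ServiceParams(
    subscriptionKey: Option[String] = None,
    location: Option[String] = None,
    url: Option[String] = None)

object ParamValidation {

  /** Returns Right(params) when the combination is usable, else Left(error message). */
  def validate(p: ServiceParams): Either[String, ServiceParams] =
    (p.subscriptionKey, p.location, p.url) match {
      case (None, _, _) =>
        Left("subscriptionKey is not set; call setSubscriptionKey first")
      case (Some(_), None, None) =>
        // Assumed rule: a key alone is not enough to locate the service.
        Left("subscriptionKey is set but neither location nor url is; call setLocation or setUrl")
      case _ =>
        Right(p)
    }

  /** Stand-in for a transform() entry point that fails fast on a bad combination. */
  def transform(p: ServiceParams): String =
    validate(p) match {
      case Right(ok) =>
        // Prefer an explicit url, otherwise fall back to the location.
        s"calling service at ${ok.url.orElse(ok.location).getOrElse("<unset>")}"
      case Left(err) =>
        throw new IllegalArgumentException(err)
    }
}
```

With this shape a correctly configured user sees no log noise, while a misconfigured one gets a single exception that names the missing setter.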

Contributor Author

Yeah, I totally agree, it might be annoying. I also found that usage is not consistent across the different cognitive services: most of them expect setLocation, but some expect setServiceName instead, and in other cases, such as DEP, users might have to call setUrl explicitly. Given that, I'll remove the change from this PR and maybe come up with better documentation in general in the future.

Collaborator

If you want to warn, use the official log4j warning APIs please

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan
Contributor Author

@mhamilton723 Could you review this? 😺


def getSubscriptionKeyCol: String = getVectorParam(subscriptionKey)

def setSubscriptionKeyCol(v: String): this.type = setVectorParam(subscriptionKey, v)

}

trait HasAADToken extends HasServiceParams {
val aadToken = new ServiceParam[String](
Collaborator

Nit: AAD might be better than aad here

Collaborator

@mhamilton723 left a comment

Looks awesome! We should add a check for this trait in the fuzzing tests (can do this later): anything with the HasSubscriptionKey trait should have the AAD trait too. Also, I think we should test this with an AAD token generated from credentials we keep in a key vault or something.
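
The invariant described in this comment (every stage that accepts a subscription key should also accept an AAD token) can be checked mechanically. The sketch below is only an illustration with stand-in types: the two marker traits share names with the traits in the diff but carry no params here, and the three service classes and `AuthTraitCheck` are hypothetical. It is not the fuzzing test that was later added to the PR.

```scala
// Stand-in marker traits; the real traits in the PR carry ServiceParam definitions.
trait HasSubscriptionKey
trait HasAADToken

// Hypothetical stages used only for this illustration.
class KeyAndTokenService extends HasSubscriptionKey with HasAADToken
class KeyOnlyService extends HasSubscriptionKey
class NoAuthStage

object AuthTraitCheck {

  /** Class names of stages that accept a subscription key but not an AAD token. */
  def missingAADToken(stages: Seq[AnyRef]): Seq[String] =
    stages.collect {
      case s: HasSubscriptionKey if !s.isInstanceOf[HasAADToken] => s.getClass.getName
    }
}
```

A test suite could instantiate every stage under test, pass them to `missingAADToken`, and assert that the result is empty.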

Collaborator

@mhamilton723 left a comment

Thank you for adding the fuzzing test. I think the only thing we still need is a single actual test of this AAD token to verify that it works for at least one service.
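
For context on what such a test would exercise: Azure Cognitive Services accept a resource key in the `Ocp-Apim-Subscription-Key` request header and an Azure AD token in an `Authorization: Bearer ...` header. The sketch below shows that mapping in isolation. The `Auth` types and `AuthHeaders` object are hypothetical names for this example, and how the PR itself attaches the header is not shown in this conversation.

```scala
// Hypothetical model of the two authentication options discussed in this PR.
sealed trait Auth
final case class SubscriptionKeyAuth(key: String) extends Auth
final case class AADTokenAuth(token: String) extends Auth

object AuthHeaders {

  /** Maps an authentication choice to the (name, value) HTTP header it implies. */
  def headerFor(auth: Auth): (String, String) = auth match {
    case SubscriptionKeyAuth(key) => "Ocp-Apim-Subscription-Key" -> key
    case AADTokenAuth(token)      => "Authorization" -> s"Bearer $token"
  }
}
```

An end-to-end test would still need a real token, for example one obtained from credentials stored in a key vault as suggested above, since this mapping alone does not prove the service accepts it.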

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723 mhamilton723 merged commit dc57dea into microsoft:master Jan 11, 2023