Automate Pod Identity testing #814

Closed
chlowell opened this issue Nov 27, 2019 · 24 comments
Labels: Azure.Identity, Client, EngSys, test-manual-pass


chlowell commented Nov 27, 2019

Note: Pod Identity is now deprecated (https://github.com/Azure/aad-pod-identity#aad-pod-identity-deprecated); use workload identity instead.

We need to automate testing our identity libraries' managed identity implementations with pod identity (AKS).

Below are instructions for manually testing Python's implementation (as seen in the repo):

Testing managed identity in Azure Kubernetes Service

prerequisite tools

  • Azure CLI (az)
  • Docker
  • Helm
  • Python (to run the test script; kubectl is installed below via az aks install-cli)

Azure resources

This test requires instances of these Azure resources:

  • Azure Key Vault
  • Azure Managed Identity
    • with secrets/set and secrets/delete permission for the Key Vault
  • Azure Container Registry
  • Azure Kubernetes Service
    • RBAC requires additional configuration not provided here, so an RBAC-disabled cluster is preferable
    • the cluster's service principal must have 'Managed Identity Operator' role over the managed identity
    • must be able to pull from the Container Registry

The rest of this section is a walkthrough of deploying these resources.

set environment variables to simplify copy-pasting

  • RESOURCE_GROUP
    • name of an Azure resource group
    • must be unique in the Azure subscription
    • e.g. 'pod-identity-test'
  • AKS_NAME
    • name of an Azure Kubernetes Service
    • must be unique in the resource group
    • e.g. 'pod-identity-test'
  • ACR_NAME
    • name of an Azure Container Registry
    • 5-50 alphanumeric characters
    • must be globally unique
  • MANAGED_IDENTITY_NAME
    • 3-128 alphanumeric characters
    • must be unique in the resource group
  • KEY_VAULT_NAME
    • 3-24 alphanumeric characters
    • must begin with a letter
    • must be globally unique

resource group

az group create -n $RESOURCE_GROUP --location westus2

managed identity

Create the managed identity:

az identity create -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME

Save its clientId, id (ARM URI), and principalId (object ID) for later:

export MANAGED_IDENTITY_CLIENT_ID=$(az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query clientId -o tsv) \
       MANAGED_IDENTITY_ID=$(az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query id -o tsv) \
       MANAGED_IDENTITY_PRINCIPAL_ID=$(az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query principalId -o tsv)

Key Vault

Create the Vault:

az keyvault create -g $RESOURCE_GROUP -n $KEY_VAULT_NAME --sku standard

Add an access policy for the managed identity:

az keyvault set-policy -n $KEY_VAULT_NAME --object-id $MANAGED_IDENTITY_PRINCIPAL_ID --secret-permissions list

container registry

az acr create -g $RESOURCE_GROUP -n $ACR_NAME --admin-enabled --sku basic

Kubernetes

Deploy the cluster (this will take several minutes):

az aks create -g $RESOURCE_GROUP -n $AKS_NAME --generate-ssh-keys --node-count 1 --disable-rbac --attach-acr $ACR_NAME

Grant the cluster's service principal permission to use the managed identity:

az role assignment create --role "Managed Identity Operator" \
  --assignee $(az aks show -g $RESOURCE_GROUP -n $AKS_NAME --query servicePrincipalProfile.clientId -o tsv) \
  --scope $MANAGED_IDENTITY_ID

build images

The test application must be packaged as a Docker image before deployment.
Test runs must include Python 2 and 3, so two images are required.

authenticate to ACR

az acr login -n $ACR_NAME

acquire the test code

git clone https://github.com/Azure/azure-sdk-for-python/ --branch master --single-branch --depth 1

The rest of this section assumes this working directory:

cd azure-sdk-for-python/sdk/identity/azure-identity/tests

build images and push them to the container registry

Set environment variables:

export REPOSITORY=$ACR_NAME.azurecr.io IMAGE_NAME=test-pod-identity PYTHON_VERSION=2.7

Build an image:

docker build --no-cache --build-arg PYTHON_VERSION=$PYTHON_VERSION -t $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION ./managed-identity-live

Push it to ACR:

docker push $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION

Then set PYTHON_VERSION to the latest 3.x (3.8 at time of writing) and run the
above docker build and docker push commands again. (It's safe, and faster,
to omit --no-cache from docker build the second time.)
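
For example, the second build and push (with --no-cache omitted) would look like this:

export PYTHON_VERSION=3.8
docker build --build-arg PYTHON_VERSION=$PYTHON_VERSION -t $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION ./managed-identity-live
docker push $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION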

run the test

install kubectl

az aks install-cli

authenticate kubectl and helm

az aks get-credentials -g $RESOURCE_GROUP -n $AKS_NAME

install tiller

helm init --wait

run the test script

Run it twice: once with PYTHON_VERSION=2.7 and once with PYTHON_VERSION=3.x
(replacing x with the latest Python 3 minor version):

python ./pod-identity/run-test.py \
 --client-id $MANAGED_IDENTITY_CLIENT_ID \
 --resource-id $MANAGED_IDENTITY_ID \
 --vault-url https://$KEY_VAULT_NAME.vault.azure.net \
 --repository $REPOSITORY \
 --image-name $IMAGE_NAME \
 --image-tag $PYTHON_VERSION
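
A minimal sketch that runs both passes in one loop, assuming images for both tags were pushed as described above:

for PYTHON_VERSION in 2.7 3.8; do
  python ./pod-identity/run-test.py \
    --client-id $MANAGED_IDENTITY_CLIENT_ID \
    --resource-id $MANAGED_IDENTITY_ID \
    --vault-url https://$KEY_VAULT_NAME.vault.azure.net \
    --repository $REPOSITORY \
    --image-name $IMAGE_NAME \
    --image-tag $PYTHON_VERSION
done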

delete Azure resources

az group delete -n $RESOURCE_GROUP -y --no-wait
@kurtzeborn

@chlowell, it looks like you've got most of this figured out. What assistance do you need from the engineering system team?

@chlowell

I want to replace all the manual steps above with an automated live test. I need a pipeline which builds the artifacts (in particular, Docker images), creates and destroys the Azure resources, runs the test script, and reports results.
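
For concreteness, a rough bash sketch of the sequence such a pipeline would run, assembled from the walkthrough above; the environment variables (including the managed identity IDs) are the ones captured there, and the access-policy, role-assignment, and kubectl/helm setup steps are elided:

set -euo pipefail

# 1. create Azure resources
az group create -n $RESOURCE_GROUP --location westus2
az identity create -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME
az keyvault create -g $RESOURCE_GROUP -n $KEY_VAULT_NAME --sku standard
az acr create -g $RESOURCE_GROUP -n $ACR_NAME --admin-enabled --sku basic
az aks create -g $RESOURCE_GROUP -n $AKS_NAME --generate-ssh-keys --node-count 1 --disable-rbac --attach-acr $ACR_NAME
# ...plus the keyvault set-policy, role assignment, and kubectl/helm setup commands from the walkthrough,
# and capturing the MANAGED_IDENTITY_* values with az identity show

# 2. build the artifacts (one Docker image per Python version) and run the test script;
#    the script's exit code is the pipeline's pass/fail signal
az acr login -n $ACR_NAME
export REPOSITORY=$ACR_NAME.azurecr.io IMAGE_NAME=test-pod-identity
for PYTHON_VERSION in 2.7 3.8; do
  docker build --build-arg PYTHON_VERSION=$PYTHON_VERSION -t $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION ./managed-identity-live
  docker push $REPOSITORY/$IMAGE_NAME:$PYTHON_VERSION
  python ./pod-identity/run-test.py \
    --client-id $MANAGED_IDENTITY_CLIENT_ID \
    --resource-id $MANAGED_IDENTITY_ID \
    --vault-url https://$KEY_VAULT_NAME.vault.azure.net \
    --repository $REPOSITORY --image-name $IMAGE_NAME --image-tag $PYTHON_VERSION
done

# 3. destroy the Azure resources
az group delete -n $RESOURCE_GROUP -y --no-wait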


jianghaolu commented Dec 20, 2019

For Java

Everything is the same prior to Build Images: the ACR, AKS cluster, Key Vault, and managed identity are created, and the managed identity is granted secret permissions on the Key Vault.

Additionally, grant the managed identity get permission on the Key Vault (the command below grants set, delete, and get):

az keyvault set-policy -n $KEY_VAULT_NAME --object-id $MANAGED_IDENTITY_PRINCIPAL_ID --secret-permissions set delete get

And load the key vault with a secret called secret:

az keyvault secret set --vault-name $KEY_VAULT_NAME -n secret --value "The secret value"

Build Images

The test application must be packaged as a Docker image before deployment.

1. authenticate to ACR

Get credentials:

az acr credential show -n $ACR_NAME -o table

Authenticate with the USERNAME and either of the PASSWORD values:

az acr login -n $ACR_NAME -u <USERNAME> -p <PASSWORD>
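
If you'd rather not copy the credentials by hand, the same values can be pulled with --query (a convenience sketch, not part of the original steps):

ACR_USERNAME=$(az acr credential show -n $ACR_NAME --query username -o tsv)
ACR_PASSWORD=$(az acr credential show -n $ACR_NAME --query "passwords[0].value" -o tsv)
az acr login -n $ACR_NAME -u "$ACR_USERNAME" -p "$ACR_PASSWORD"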

2. Create Dockerfile

Create a new directory, e.g. pod-identity-test, then create a Dockerfile inside it with the following content:

FROM alpine/git as clone
RUN git clone https://github.com/Azure/azure-sdk-for-java --single-branch --depth 1 /azure-sdk-for-java

FROM maven:3-jdk-8
COPY --from=clone /azure-sdk-for-java /azure-sdk-for-java
WORKDIR /azure-sdk-for-java

RUN mvn clean install -Dgpg.skip -DskipTests -f eng/code-quality-reports/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -f common/perf-test-core/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/core/azure-core/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/core/azure-core-test/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/core/azure-core-http-netty/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/identity/azure-identity/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/keyvault/azure-security-keyvault-secrets/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/keyvault/azure-security-keyvault-keys/pom.xml
RUN mvn clean install -Dgpg.skip -DskipTests -Dmaven.javadoc.skip=true -f sdk/keyvault/azure-security-keyvault-certificates/pom.xml

CMD [ "mvn", "test", "-Dtest=ManagedIdentityCredentialLiveTest#testMSIEndpoint*", "-f", "sdk/e2e/pom.xml", "-Dgpg.skip", "-am", "-DfailIfNoTests=false" ]

3. Build image and push to the container registry

export REPOSITORY=$ACR_NAME.azurecr.io IMAGE_NAME=test-pod-identity
docker build -t $REPOSITORY/$IMAGE_NAME ./pod-identity-test
docker push $REPOSITORY/$IMAGE_NAME

Run the test

1. install kubectl

az aks install-cli

2. authenticate kubectl

az aks get-credentials -g $RESOURCE_GROUP -n $AKS_NAME

3. Create aad-pod-identity deployment on AKS

kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment.yaml

4. Install the Azure Identity

Save this Kubernetes manifest to a file named aadpodidentity.yaml:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: aad-pod-identity
spec:
  type: 0
  ResourceID: <value of $MANAGED_IDENTITY_ID>
  ClientID: <value of $MANAGED_IDENTITY_CLIENT_ID>

Replace the placeholders with your managed identity's values. Save the file, then create the AzureIdentity resource in your cluster:

kubectl apply -f aadpodidentity.yaml
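
To confirm the resource was created (this assumes the aad-pod-identity CRDs from step 3 are installed):

kubectl get azureidentity aad-pod-identity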

5. Install the Azure Identity Binding

Save this Kubernetes manifest to a file named aadpodidentitybinding.yaml:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: demo1-azure-identity-binding
spec:
  AzureIdentity: aad-pod-identity
  Selector: aad-pod-identity-binding

Save your changes to the file, then create the AzureIdentityBinding resource in your cluster:

kubectl apply -f aadpodidentitybinding.yaml

6. Start the test run jobs

Save this Kubernetes manifest to a file named pod-identity-test.yaml:

apiVersion: batch/v1
kind: Job
metadata:
  name: test
  labels:
    app: pod-identity-test
spec:
  backoffLimit: 12  # give up after this many attempts
  ttlSecondsAfterFinished: 3600  # delete the job and its sub-resources after this many seconds
  template:
    metadata:
      labels:
        app: pod-identity-test
        aadpodidbinding: aad-pod-identity-binding
    spec:
      restartPolicy: OnFailure  # ensure we have only one pod, whose logs reflect the last test run
      initContainers:
      - name: wait-for-imds  # wait until IMDS responds before running the test
        image: busybox:1.31
        command: ['sh', '-c', 'wget 169.254.169.254 -T 120']
      containers:
      - name: pod-identity-test
        image: <value of $ACR_NAME>.azurecr.io/test-pod-identity
        imagePullPolicy: Always
        env:
        - name: AZURE_VAULT_URL
          value: "https://<value of $KEY_VAULT_NAME>.vault.azure.net"
        - name: AZURE_CLIENT_ID
          value: "<value of $MANAGED_IDENTITY_CLIENT_ID>"

Finally, save your changes to the file, then start the test run in your cluster:

kubectl apply -f pod-identity-test.yaml

Find the name of the running pod:

$ kubectl get pod -l app=pod-identity-test
NAME         READY   STATUS      RESTARTS   AGE
test-8vx28   1/1     Running   0          22s

Tail the log of the test run

kubectl logs -f test-8vx28
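
Alternatively, a reasonably recent kubectl can follow the logs without looking up the pod name, either by label selector or via the job:

kubectl logs -f -l app=pod-identity-test
# or
kubectl logs -f job/test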

If the tests succeed, the console should print something like the following near the end:

[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running com.azure.endtoend.identity.ManagedIdentityCredentialLiveTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.55 s - in com.azure.endtoend.identity.ManagedIdentityCredentialLiveTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0


catalinaperalta commented Dec 31, 2019

For Golang

Follow all of the same steps as Java; the only change is the Dockerfile in step 2 of Build Images. For Go, the content of the Dockerfile should be:

FROM alpine/git as clone
RUN git clone https://github.com/Azure/azure-sdk-for-go.git --single-branch --branch track2_cloudshell --depth 1 /azure-sdk-for-go

FROM golang:1.13-alpine
COPY --from=clone /azure-sdk-for-go /azure-sdk-for-go
WORKDIR /azure-sdk-for-go/sdk/azidentity
CMD CGO_ENABLED=0 go test -run TestManagedIdentityCredential_GetTokenInVMLive

Expected output if the test succeeds: PASS


XuGuang-Yao commented Apr 16, 2020

Hi @chlowell,

Same problem as with Automate VM MSI testing: we need to create a Key Vault with soft delete disabled.

az keyvault create -g $RESOURCE_GROUP -n $KEY_VAULT_NAME --sku standard --enable-soft-delete false

If not, the test command fails with the error below.

[error screenshot]

@chlowell

Yes, soft delete was disabled by default when I wrote these instructions but is now enabled by default for new vaults. The right solution is to change the tests so this doesn't matter. I've opened Azure/azure-sdk-for-python#10879 to do that.
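
As a stopgap until then: if a re-run needs to reuse the same vault name, the soft-deleted vault can be purged after deletion (a possible workaround, not part of the original instructions):

az keyvault purge -n $KEY_VAULT_NAME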

@sophiajt

For JavaScript:

Testing azure-identity in Azure Kubernetes

prerequisite tools

  • Azure CLI (az)
  • Docker
  • Helm
  • Node.js (npm, node) and the TypeScript compiler (tsc); kubectl is installed below via az aks install-cli

Azure resources

This test requires instances of these Azure resources:

  • Azure Key Vault
  • Azure Managed Identity
    • with secrets/set and secrets/delete permission for the Key Vault
  • Azure Container Registry
  • Azure Kubernetes Service
    • RBAC requires additional configuration not provided here, so an RBAC-disabled cluster is preferable
    • the cluster's service principal must have 'Managed Identity Operator' role over the managed identity
    • must be able to pull from the Container Registry

The rest of this section is a walkthrough of deploying these resources.

set environment variables to simplify copy-pasting

  • RESOURCE_GROUP
    • name of an Azure resource group
    • must be unique in the Azure subscription
    • e.g. 'pod-identity-test'
  • AKS_NAME
    • name of an Azure Kubernetes Service
    • must be unique in the resource group
    • e.g. 'pod-identity-test'
  • ACR_NAME
    • name of an Azure Container Registry
    • 5-50 alphanumeric characters
    • must be globally unique
  • MANAGED_IDENTITY_NAME
    • 3-128 alphanumeric characters
    • must be unique in the resource group
  • KEY_VAULT_NAME
    • 3-24 alphanumeric characters
    • must begin with a letter
    • must be globally unique

resource group

az group create -n $RESOURCE_GROUP --location westus2

managed identity

Create the managed identity:

az identity create -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME

Save its clientId, id (ARM URI), and principalId (object ID) for later:

$MANAGED_IDENTITY_CLIENT_ID=az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query clientId -o tsv
$MANAGED_IDENTITY_ID=az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query id -o tsv
$MANAGED_IDENTITY_PRINCIPAL_ID=az identity show -g $RESOURCE_GROUP -n $MANAGED_IDENTITY_NAME --query principalId -o tsv

Key Vault

Create the Vault:

az keyvault create -g $RESOURCE_GROUP -n $KEY_VAULT_NAME --sku standard

Add an access policy for the managed identity:

az keyvault set-policy -n $KEY_VAULT_NAME --object-id $MANAGED_IDENTITY_PRINCIPAL_ID --secret-permissions set delete

container registry

az acr create -g $RESOURCE_GROUP -n $ACR_NAME --admin-enabled --sku basic

Kubernetes

Deploy the cluster (this will take several minutes):

az aks create -g $RESOURCE_GROUP -n $AKS_NAME --generate-ssh-keys --node-count 1 --disable-rbac --attach-acr $ACR_NAME

Grant the cluster's service principal permission to use the managed identity:

az role assignment create --role "Managed Identity Operator" --assignee $(az aks show -g $RESOURCE_GROUP -n $AKS_NAME --query servicePrincipalProfile.clientId -o tsv) --scope $MANAGED_IDENTITY_ID

build images

The test application must be packaged as a Docker image before deployment.

authenticate to ACR

az acr login -n $ACR_NAME

acquire the test code

git clone https://github.com/Azure/azure-sdk-for-js/ --branch master --single-branch --depth 1

The rest of this section assumes this working directory:

cd azure-sdk-for-js/sdk/identity/identity/test/manual-integration/kubernetes

build images and push them to the container registry

Set environment variables:

$REPOSITORY="$($ACR_NAME).azurecr.io"
$IMAGE_NAME="test-pod-identity"
$NODE_VERSION=10

Build an image:

docker build --no-cache --build-arg NODE_VERSION=$NODE_VERSION -t "$($REPOSITORY)/$($IMAGE_NAME):$($NODE_VERSION)" .

Push it to ACR:

docker push "$($REPOSITORY)/$($IMAGE_NAME):$($NODE_VERSION)"

run the test

install kubectl

az aks install-cli

authenticate kubectl and helm

az aks get-credentials -g $RESOURCE_GROUP -n $AKS_NAME

install tiller

helm init --wait

run the test script

npm install
tsc -p .
node ./run_test.js --client-id $MANAGED_IDENTITY_CLIENT_ID --resource-id $MANAGED_IDENTITY_ID --vault-url "https://$($KEY_VAULT_NAME).vault.azure.net" --repository $REPOSITORY --image-name $IMAGE_NAME --image-tag $NODE_VERSION

verify success

az keyvault secret show -n "secret-name-pod" --vault-name "$($KEY_VAULT_NAME)"

delete Azure resources

az group delete -n $RESOURCE_GROUP -y --no-wait

@XuGuang-Yao

Hi @jonathandturner,

I am following the above steps to run the JS E2E test.

The error below occurred after running docker build --no-cache --build-arg NODE_VERSION=$NODE_VERSION -t "$($REPOSITORY)/$($IMAGE_NAME):$($NODE_VERSION)" .

[error screenshot]

Could you please help to update the test steps to resolve the error?


XuGuang-Yao commented May 13, 2020

@catalinaperalta - I am following the instructions to run the Go E2E test.

The error below occurred when executing the last command:

Get http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com: context deadline exceeded
--- FAIL: TestManagedIdentityCredential_GetTokenInVMLiveStorage (364.16s)

[error screenshot]

The test passed last month but fails this time. Could you help take a look to see if anything has changed?

@XuGuang-Yao

@chlowell - You might have missed adding the list permission to the Key Vault.
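
For reference, a set-policy call that includes list alongside the permissions already granted in the walkthrough (same variables as above):

az keyvault set-policy -n $KEY_VAULT_NAME --object-id $MANAGED_IDENTITY_PRINCIPAL_ID --secret-permissions get list set delete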

@XuGuang-Yao

@jianghaolu - We are following the instructions to run the Java E2E test.

The error below occurred after running all the commands.

[error screenshot]

We have tried many times but still get the same error. Could you help check what is causing this issue?


JosueJoshua commented Jan 27, 2021

Hi @jianghaolu @catalinaperalta - I am following the E2E test.
Container initialization failed in the sixth step (Start the test run jobs) of Run the test.

[error screenshot]

It seems related to the VMSS, so I tried to access 169.254.169.254 using the Azure CLI (az vmss run-command invoke), and it returned 400.

[error screenshot]

Could you help check what is causing the container initialization failure?
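
For anyone reproducing this, a sketch of that IMDS probe from inside a VMSS instance via the Azure CLI; the node resource group and scale set names are placeholders:

az vmss run-command invoke -g <NODE_RESOURCE_GROUP> -n <VMSS_NAME> --instance-id 0 \
  --command-id RunShellScript \
  --scripts "curl -s -H Metadata:true 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://vault.azure.net'"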


v-jiaodi commented Sep 27, 2021

Hi @jianghaolu @catalinaperalta. I am following the above steps to run the JS E2E test.

The error below occurred after running docker build --no-cache --build-arg NODE_VERSION=$NODE_VERSION -t "$($REPOSITORY)/$($IMAGE_NAME):$($NODE_VERSION)" .

[error screenshot]

I tried updating podName to podName: any and logs to logs: any in run_test.ts, and then it passes.

[screenshot]

Do you think it's appropriate to modify it like this? Or do you have any other ideas?

@chlowell

@sadasant @KarishmaGhiya for JS issues

@sadasant

Give us some time to sync with this issue.

@sadasant

@chlowell thank you, Charles!

@v-jiaodi This happens because we forgot to specify the TypeScript version we used to compile this project originally. I've made a pull request that pins the TypeScript version to the latest one available and also fixes the types. In the future, this specific problem will not happen. Thank you for letting us know!

Here’s the PR: Azure/azure-sdk-for-js#17910


sadasant commented Oct 1, 2021

@chlowell, @v-jiaodi we have merged the update! Let me know if I can help with anything else.


v-jiaodi commented Oct 8, 2021

@sadasant This issue has not been completely solved; the following is the error message:

[error screenshot]


sadasant commented Oct 8, 2021

I will investigate first thing. Thank you!


sadasant commented Oct 8, 2021

I just saw the issue. It was my bad, I didn’t re-try building after I addressed the feedback I got. Here’s the fix: Azure/azure-sdk-for-js#18103


v-jiaodi commented Oct 9, 2021

I used the test branch you mentioned, and this issue no longer exists.


sadasant commented Oct 9, 2021

Thank you!

@zhangjiale-64

Hi @chlowell. The folder sdk/identity/tests/pod-identity used by the Automate Pod Identity test was deleted in Azure/azure-sdk-for-python#33742. Is this testing no longer necessary? If it is still needed, please provide the relevant tests.


chlowell commented Mar 1, 2024

Given that Pod Identity is fully deprecated and won't ship another release, and that we support it with our code for IMDS managed identity (which we continue to test), I don't think we need to continue testing Pod Identity separately. So, I'll close this issue. @christothes please reopen or comment if you disagree.

chlowell closed this as completed Mar 1, 2024