Right now, the official Terraform documentation only covers an example of a secured workspace, without secured training computes and without secured inference. Examples of how to properly secure the ML extension on AKS are also nonexistent for now.
This repo was made to demonstrate:
- A hub & spoke topology
- A secure workspace
- Secure training computes (no public IP compute)
- Secure inference: a dedicated AKS spoke and no public IP for the AzureML extension
This was made to be as easy as possible to deploy.
- Feel free to edit the variable defaults in variables.tf or to create your own tfvars file
- az login
- if required: az account set --subscription
- terraform init
- terraform apply
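The steps above can be run end to end roughly as follows. This is a minimal sketch: `custom.tfvars` is a hypothetical file name for your own variable overrides, and the subscription value is a placeholder to replace. If you keep the defaults from variables.tf, drop the `-var-file` option.

```shell
# Sign in and, only if required, select the target subscription.
# "<subscription-name-or-id>" is a placeholder, not a real value.
az login
az account set --subscription "<subscription-name-or-id>"

# Download the providers and modules used by this repo.
terraform init

# Deploy. custom.tfvars is a hypothetical file holding your overrides.
terraform apply -var-file="custom.tfvars"
```

Terraform prints the plan and asks for confirmation before creating anything, so you can review the resources first.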
If you are not interested in the AKS part, you can simply delete the reference to the module in main.tf, lines 42 to 53.
This example was written entirely in Terraform to limit the number of tools required. For a production deployment, you might consider the following changes:
- Leverage a CI/CD pipeline with a service principal to run Terraform
- Use a GitOps methodology to manage AKS internals instead of doing it through the Kubernetes provider like I did
- Improve AKS security even more (RBAC authentication, TLS, etc.). I chose not to add those because I wanted the code to remain as simple as possible while focusing on the AML side.
- Instead of using the default firewall rule with the AzureMonitor tag, you can secure Azure Monitor with Azure Monitor Private Link to prevent data exfiltration through Log Analytics.
- You can add a service endpoint policy to prevent data exfiltration through the Storage Account
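As an illustration of the CI/CD suggestion, here is a minimal sketch of a pipeline step that runs Terraform non-interactively with a service principal. It assumes the credentials are injected by the pipeline's secret store as the `ARM_*` environment variables read by the azurerm provider; whether every provider used in this repo picks them up the same way is an assumption to verify. The plan file name `tfplan` is arbitrary.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Fail early if the service principal credentials are missing.
# These variables are read by the azurerm provider; the values are
# assumed to come from the pipeline's secret store.
: "${ARM_CLIENT_ID:?must be set}"
: "${ARM_CLIENT_SECRET:?must be set}"
: "${ARM_TENANT_ID:?must be set}"
: "${ARM_SUBSCRIPTION_ID:?must be set}"
export ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_TENANT_ID ARM_SUBSCRIPTION_ID

# -input=false prevents Terraform from waiting on interactive prompts.
terraform init -input=false
terraform plan -input=false -out=tfplan

# Applying a saved plan file does not ask for confirmation.
terraform apply -input=false tfplan
```

Splitting plan and apply lets the pipeline publish the plan for review before the apply step runs.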
Here is some additional content that might interest you:
- No public IP training compute - network explained
- Inferencing Environment
- How to use afterwards
- Troubleshooting