An important aspect of Machine Learning (ML) projects is the transition from manual experimentation in Jupyter notebooks and similar tools to an architecture in which the workflows for building, training, deploying and maintaining ML models in production are automated and orchestrated. To achieve this, an operating model needs to be established between the different personas involved, such as Data Scientists, Data Engineers, ML Engineers, DevOps Engineers, IT and business stakeholders. Further, the data and model lifecycles and their underlying workflows need to be defined, along with the responsibilities of each persona in these areas. This collection of practices is called Machine Learning Operations (MLOps).
This repository contains a set of baseline infrastructure for an MLOps environment on AWS in a single AWS account. The infrastructure is defined with Terraform and is built around the Amazon SageMaker service.
The 3 main components in the repository are:
mlops_infra deploys a data science exploration environment in which your data scientists can explore data and train their ML models inside SageMaker Studio. Please note that the networking created by mlops_infra is a starter example; you can also adapt the repository to import existing VPCs created by your organization instead of creating new ones. The repository also creates example SageMaker users (Data Scientist 1 and Data Scientist 2) and their associated roles and policies.
This Terraform project is used to bootstrap an account for ML and includes the following modules:
- modules/networking: Deploy VPC, subnets & VPC endpoints for SageMaker
- modules/code_commit: Deploy CodeCommit repository & associate it as SageMaker repository
- modules/kms: Deploy KMS key for encryption of SageMaker resources
- modules/iam: Deploy SageMaker roles & policies
- modules/sagemaker_studio: Deploy SageMaker Studio with users and default Jupyterhub app, as well as enabling SageMaker projects
This Terraform project is used to bootstrap Service Catalog with a portfolio and an example Terraform-based SageMaker project. It allows deploying many different organizational SageMaker project templates.
- modules/sagemaker_project_template: Create Service Catalog Portfolio & products
These folders contain the "seed code", which is the code that is initialized when a new SageMaker project is created in SageMaker Studio. The seed code is associated with the corresponding template in the mlops_templates code. The seed code should be fully generic and should provide the baseline for new ML projects to build on.
- seed_code/build_app: Example Terraform-based model build application using SageMaker Pipelines, CodeCommit & CodeBuild
- seed_code/deploy_app: Example Terraform-based model deployment application that deploys trained models to SageMaker endpoints
- Terraform
- Git
- AWS CLI v2
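One way to confirm that the tools listed above are available before deploying is a small shell helper. This is a minimal sketch, not part of the repository: the function name `missing_tools` is hypothetical, and it only checks that each executable is on `PATH` (the AWS CLI is assumed to be installed under its usual command name, `aws`).

```shell
# Hypothetical helper: print the names of any given tools that are not on PATH.
# Empty output means every tool was found.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the command.
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="${missing:+$missing }$tool"
    fi
  done
  printf '%s\n' "$missing"
}
```

For example, `missing_tools terraform git aws` prints nothing when all three are installed. The helper does not check versions; running `aws --version` shows whether the installed AWS CLI is a 2.x release.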
Navigate to the 'mlops_infra' directory with cd mlops_infra and follow the instructions there.
Navigate to the 'mlops_templates' directory with cd mlops_templates and follow the instructions there.
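The two deployment steps can be sketched as a short shell function. This assumes each directory is deployed with the standard Terraform `init`, `plan`, `apply` sequence. The instructions inside each directory take precedence and may require extra steps, such as setting variables or configuring a backend. The names `deploy_component` and `TERRAFORM_BIN` are hypothetical and used only for illustration.

```shell
# Command used to invoke Terraform. Assumption: overridable so the sequence
# can be previewed without touching AWS (e.g. TERRAFORM_BIN=echo).
TERRAFORM_BIN="${TERRAFORM_BIN:-terraform}"

# Hypothetical helper: run init, plan and apply inside the given directory.
deploy_component() {
  # Subshell keeps the caller's working directory unchanged.
  (
    cd "$1" || exit 1
    "$TERRAFORM_BIN" init &&
    "$TERRAFORM_BIN" plan -out=tfplan &&
    "$TERRAFORM_BIN" apply tfplan
  )
}
```

From the repository root, `deploy_component mlops_infra && deploy_component mlops_templates` would then deploy the two projects in the order given above. Saving the plan to a file and applying that file means the changes applied are the ones that were reviewed.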