MLOps Repository

Overview

Welcome to the MLOps Repository! This repository shares reading materials, labs, and exercises for the MLOps (Machine Learning Operations) course at Northeastern University. Its primary goal is to provide a centralized platform where students, instructors, and anyone interested in MLOps can access and collaborate on course-related materials. To learn more about machine learning topics, watch my videos on YouTube or visit my website.

Introduction

MLOps is an emerging discipline that focuses on collaboration and communication between data scientists and IT professionals while automating and streamlining the machine learning lifecycle. It bridges the gap between machine learning development and production deployment, ensuring that machine learning models are scalable, reproducible, and maintainable. This repository serves as a resource hub for students and instructors of Northeastern University's MLOps course.

Course Description

The MLOps course at Northeastern University is designed to provide students with a comprehensive understanding of the MLOps field. Throughout the course, students will learn how to:

- Build end-to-end machine learning pipelines
- Deploy machine learning models to production
- Monitor and maintain ML systems
- Implement CI/CD/CM/CT (Continuous Integration/Continuous Deployment/Continuous Monitoring/Continuous Training) for ML
- Containerize and orchestrate ML workloads
- Handle data drift and model retraining

This repository hosts the labs, code samples, and documentation related to these topics.
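To make the last topic in the list above concrete, here is a minimal sketch of one common way to flag data drift: compare a feature's training-time values with recent production values using the two-sample Kolmogorov-Smirnov statistic. It is not taken from the course labs. The function names and the `0.3` threshold are assumptions chosen for illustration, and a real pipeline would tune the threshold per feature.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    if len(reference) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The largest gap between two step functions occurs at an observed value,
    # so it is enough to evaluate both CDFs at every data point.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def needs_retraining(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a threshold.

    The default threshold is an illustrative assumption, not a recommendation.
    """
    return ks_statistic(reference, current) > threshold
```

Calling `needs_retraining(training_values, live_values)` on two lists of numbers returns `True` when the distributions have moved apart by more than the chosen threshold, which could then trigger a retraining job.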

Labs Content

This repository offers a series of hands-on labs designed to enhance your understanding of MLOps concepts. Each lab focuses on a specific aspect of the machine learning lifecycle, providing practical experience with tools and methodologies essential for deploying and managing machine learning models in production environments.

API Labs
- Objective: Learn to develop and deploy APIs for ML models.
- Sub-Labs:
  - FLASK_GCP_LAB: Flask lab on GCP.
  - FastAPI Labs: FastAPI labs.
  - Streamlit Labs: Streamlit labs.
Airflow Labs
- Objective: Gain practical experience with Apache Airflow for orchestrating complex data workflows.
- Sub-Labs:
  - Lab 1: Basic Airflow setup and DAGs.
  - Lab 2: Advanced DAG dependencies and scheduling.
  - assets: Contains additional assets for Airflow labs.
CloudFunction Labs
- Objective: Learn how to deploy lightweight functions using cloud-based services.
- Sub-Labs:
  - Lab1-CloudFunction Setup: Setting up Google Cloud Functions.
  - Lab2-CloudFunction Intermediate: Intermediate Cloud Function concepts and use cases.
Data Labs
- Objective: Understand data engineering and preprocessing steps.
- Sub-Labs:
  - Apache: Apache setup for data handling.
  - DVC Labs/Lab 1: DVC setup and basic commands.
  - Data Labeling Labs: Lab focused on data labeling processes.
Data Storage & Warehouse Labs
- Objective: Explore data storage solutions and data warehousing.
- Sub-Labs:
  - Lab1: Introduction to data warehousing.
  - Lab2: Advanced data storage techniques.
  - Lab3: Optimization and data retrieval practices.
Docker Container Labs
- Objective: Learn containerization techniques for ML applications.
- Sub-Labs:
  - Week7_Docker_Container: Introduction to Docker containers.
  - Week8_Docker_Container: Advanced Docker techniques and orchestration.
ELK Labs
- Objective: Set up logging and monitoring using the ELK stack.
- Sub-Labs:
  - Lab1_Setup_Windows_WSL_Ubuntu: ELK setup on Windows with WSL.
  - Lab2_ELK_Setup_Mac: ELK setup on macOS.
  - Lab3_Example: Example of ELK in practice.
Experiment Tracking Labs
- Objective: Track and manage ML experiments.
- Sub-Labs:
  - Logging Labs: Tracking logs for model training.
  - Mlflow Labs: Using MLflow for experiment tracking.
GCP Labs
- Cloud Composer Labs: Set up and manage workflows with Cloud Composer.
- Compute Engine Labs: Hands-on with Google Compute Engine.
- KServe Labs: Serving ML models with KServe on Kubernetes.
- Kubernetes Labs: Running and managing containers on GKE.
- Vertex AI Labs: End-to-end ML workflows with Vertex AI.
GitHub Labs
- Objective: Implement GitHub Actions for CI/CD.
- Sub-Labs:
  - GitHub_Actions_GCP_Lab_beginner: Beginner-level CI/CD with GitHub Actions.
  - Lab1: Basics of GitHub Actions.
  - Lab2: Intermediate CI/CD practices with GitHub.
  - github-actions-gcp-intermediate-lab: Intermediate GCP integration with GitHub Actions.
Kubeflow Labs
- Objective: Orchestrate ML workflows with Kubeflow.
- Sub-Labs:
  - Lab1-Kubeflow Setup: Setting up Kubeflow environment.
  - Lab2-Kubeflow Katib: Hyperparameter tuning with Katib in Kubeflow.
MLMD Labs
- Objective: Understand ML Metadata (MLMD) for tracking metadata.
- Sub-Labs:
  - Lab1: Introduction to ML metadata concepts.
  - Lab2: Advanced usage and querying of ML metadata.
  - assets: Supporting materials and assets for MLMD labs.
TensorFlow Labs
- Objective: Gain hands-on experience with TensorFlow for ML model development.
- Sub-Labs:
  - TFDV Labs: TensorFlow Data Validation labs.
  - TFDV TFX Installation: Setting up TFX and TFDV.
  - TFT Labs: TensorFlow Transform labs.
  - TFX Labs: TensorFlow Extended for production pipelines.

Each lab is accompanied by detailed instructions and code examples to facilitate hands-on learning. It's recommended to follow the labs sequentially, as concepts build upon each other. For additional resources and support, refer to the Reading Materials section of this repository.

Getting Started

To get started with the labs and exercises in this repository, please follow these steps:

1. Clone this repository to your local machine.
2. Navigate to the specific lab you are interested in.
3. Read the lab instructions and review any accompanying documentation.
4. Follow the provided code samples and examples to complete the lab exercises.
5. Feel free to explore, modify, and experiment with the code to deepen your understanding.

For more detailed information on each lab and prerequisites, please refer to the lab's README or documentation.
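Once the repository is cloned, a short script can help with the "navigate to a lab and read its instructions" steps above. The sketch below lists every lab README in a local checkout. It assumes the labs live under a top-level `Labs` directory and that each lab documents itself in a file named `README.md`. The function name `find_lab_readmes` is hypothetical and not part of the repository.

```python
from pathlib import Path


def find_lab_readmes(repo_root, labs_dir="Labs"):
    """Return README paths under the labs directory, relative to it.

    `repo_root` is the path of the local clone. `labs_dir` is assumed
    to be the folder that holds the labs.
    """
    labs_path = Path(repo_root) / labs_dir
    if not labs_path.is_dir():
        raise FileNotFoundError(f"no '{labs_dir}' directory under {repo_root}")
    readmes = [
        path.relative_to(labs_path)
        for path in labs_path.rglob("*")
        # Match README.md regardless of letter case.
        if path.is_file() and path.name.lower() == "readme.md"
    ]
    return sorted(readmes)


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    for readme in find_lab_readmes("."):
        print(readme)
```

Running the script from the root of the clone prints one relative path per lab README, which gives a quick index of the available labs.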

Contributing

Contributions to this repository are welcome! If you are a student or instructor and would like to contribute your own labs, improvements, or corrections, please follow these guidelines:

1. Fork this repository.
2. Create a branch for your changes.
3. Make your changes and commit them with clear, concise messages.
4. Test your changes to ensure they work as expected.
5. Submit a pull request to the main repository.

Your contributions will help improve the overall quality of the labs and benefit the entire MLOps community.

Reference:

The reading materials in this repository were collected from Coursera under a Creative Commons license.

License

This repository is open source and distributed under a Creative Commons license. Please review the LICENSE file for details on how you can use and share the content within this repository.

raminmohammadi/MLOps
