Pachyderm Examples is a curated list of examples that use Pachyderm to accomplish various tasks.
- Intro to Pachyderm Tutorial - A notebook introduction to Pachyderm, using the `pachctl` command line utility to illustrate the basics of Pachyderm data repositories and pipelines.
- Boston Housing Prices - A machine learning pipeline to train a regression model on the Boston Housing Dataset to predict the value of homes.
- Boston Housing Prices (Intermediate) - Extends the original Boston Housing Prices example to show a multi-pipeline DAG and data rollbacks.
- Market Sentiment - Train and deploy a fully automated financial market sentiment BERT model. As data is manually labeled, the model will automatically retrain and deploy.
- Object Detection - Train an object detector on the COCO128 dataset with Lightning Flash, modify predictions with Label Studio, and version everything in Pachyderm.
- JupyterLab Pachyderm Mount Extension - A notebook showing how to use the JupyterLab Pachyderm Mount Extension to mount Pachyderm data repositories into your Notebook environment.
- Jsonnet Pipeline Specs - A notebook introducing Jsonnet Pipeline Specs and showing how to use them to templatize common pipelines.
- SAME Project - A notebook showing how to do Pachyderm pipeline development with the SAME Project.
- Label Studio Integration - Incorporate data versioning into any labeling project with Label Studio and Pachyderm.
- Superb AI Integration - Version labeled image datasets created in Superb AI Suite using a cron pipeline.
- Toloka Integration - Uses Pachyderm to create crowdsourced annotation jobs for news headlines in Toloka, aggregate the labeled data, and train a model.
- BigQuery - A connector that ingests the result of a BigQuery query into Pachyderm as a Parquet file.
- Churn Prediction with Snowflake - Create a churn analysis model for a music streaming service with Pachyderm and Snowflake using the Data Warehouse integration.
- Breast Cancer Detection - A breast cancer detection system based on radiology scans scaled and visualized using Pachyderm.
- AutoML - A Pachyderm pipeline that uses the mljar-supervised package to train a machine learning model on a CSV file.
- Apache Spark - MLflow Integration - End-to-end example demonstrating the full ML training process of a fraud detection model with Spark, MLlib, MLflow, and Pachyderm.
- Weights and Biases - Log pipelines running in Pachyderm to Weights and Biases.
- ClearML Integration - Log Pachyderm experiments to ClearML's experiment monitoring platform, using Pachyderm Secrets.
- Pachyderm - Seldon - Community example showing monitoring and provenance for machine learning models with Pachyderm and Seldon.
- Seldon (Market Sentiment) - Deploy the model created in the Market Sentiment example with Seldon Deploy.