From d4ad7854492ecbef8b78e8599513a2dfb3d5df9c Mon Sep 17 00:00:00 2001 From: savin Date: Fri, 28 Feb 2025 23:05:28 -0800 Subject: [PATCH 1/3] edit readme --- ADOPTERS.md | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 55 ++++++++++++++++++---------------------- 2 files changed, 96 insertions(+), 31 deletions(-) create mode 100644 ADOPTERS.md diff --git a/ADOPTERS.md b/ADOPTERS.md new file mode 100644 index 00000000000..e7701c9aa98 --- /dev/null +++ b/ADOPTERS.md @@ -0,0 +1,72 @@ +# Adopters + +Below is a partial list of organizations using Metaflow in production. If you'd like to be included in this list, please raise a pull request + +- [23andMe](https://www.23andme.com) +- [Adept](https://www.adept.ai) +- [Amazon](https://www.amazon.com) +- [Amazon Prime Video](https://www.primevideo.com) +- [Attentive](https://www.attentive.com) +- [Autodesk](https://www.autodesk.com) +- [Bosch](https://www.bosch.com) +- [Boston Consulting Group](https://www.bcg.com) +- [Carsales](https://www.carsales.com.au) +- [Carta](https://carta.com) +- [Chess.com](https://www.chess.com) +- [CloudKitchens](https://www.cloudkitchens.com) +- [Coveo](https://www.coveo.com) +- [Crexi](https://www.crexi.com) +- [Dell](https://www.dell.com) +- [Deliveroo](https://deliveroo.com) +- [DeliveryHero](https://deliveryhero.com) +- [Disney](https://disney.com) +- [Doordash](https://doordash.com) +- [DraftKings](https://www.draftkings.com) +- [DTN](https://www.dtn.com) +- [DuckDuckGo](https://www.duckduckgo.com) +- [Dyson](https://www.dyson.com) +- [Equilibrium Energy](https://www.equilibriumenergy.com) +- [Forward Financing](https://www.forwardfinancing.com) +- [Fortum](https://www.fortum.com) +- [Genesys](https://www.genesys.com) +- [Goldman Sachs](https://www.goldmansachs.com) +- [Gradle](https://www.gradle.com) +- [GSK](https://www.gsk.com) +- [Intel](https://www.intel.com) +- [Intuitive Surgical](https://www.intuitivesurgical.com) +- [JPMorgan 
Chase](https://www.jpmorganchase.com) +- [Lightricks](https://www.lightricks.com) +- [Medtronic](https://www.medtronic.com) +- [Merck](https://www.merck.com) +- [Morningstar](https://www.morningstar.com) +- [Mozilla](https://www.mozilla.org) +- [Netflix](https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9) +- [Nextdoor](https://www.nextdoor.com) +- [Porsche](https://www.porsche.com) +- [Pratilipi](https://www.pratilipi.com) +- [Rad.ai](https://www.rad.ai) +- [Ramp](https://ramp.com) +- [Realtor](https://www.realtor.com) +- [Roku](https://www.roku.com) +- [S&P Global](https://www.spglobal.com) +- [Sainsbury's](https://www.sainsburys.co.uk) +- [Salk Institute](https://www.salk.edu) +- [Sanofi](https://www.sanofi.com) +- [SAP](https://www.sap.com) +- [SEEK](https://www.seek.com.au) +- [Shutterstock](https://www.shutterstock.com) +- [Stanford](https://www.stanford.edu) +- [Thoughtworks](https://www.thoughtworks.com) +- [Too Good To Go](https://www.toogoodtogo.com) +- [Toyota](https://www.toyota.com) +- [Upstart](https://www.upstart.com) +- [Veriff](https://www.veriff.com) +- [Verisk](https://www.verisk.com) +- [Vouch Insurance](https://www.vouchinsurance.com) +- [Wadhwani AI](https://www.wadhwani.ai) +- [Warner Media](https://www.warnermedia.com) +- [Workiva](https://www.workiva.com) +- [Zendesk](https://www.zendesk.com) +- [Zillow](https://www.zillow.com) +- [Zipline](https://www.zipline.com) +- [Zynga](https://www.zynga.com) \ No newline at end of file diff --git a/README.md b/README.md index f1a0b13ef23..ea5baabe15b 100644 --- a/README.md +++ b/README.md @@ -2,37 +2,49 @@ # Metaflow -Metaflow is a human-friendly library that helps scientists and engineers build and manage real-life data science projects. 
Metaflow was [originally developed at Netflix](https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9) to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning. +[Metaflow](https://metaflow.org) is a human-centric framework designed to help scientists and engineers **build and manage real-life AI and ML systems**. Serving teams of all sizes and scale, Metaflow streamlines the entire lifecycle — from prototyping in notebooks to production deployments — making it easier than ever to deliver robust solutions. -For more information, see [Metaflow's website](https://metaflow.org) and [documentation](https://docs.metaflow.org). +Originally developed at [Netflix](https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9), Metaflow is designed to boost the productivity for research and engineering teams working on [a wide variety of projects](https://netflixtechblog.com/supporting-diverse-ml-systems-at-netflix-2d2e6b6d205d), from classical statistics to state-of-the-art deep learning and foundation models. Metaflow helps unify code, data, and compute at every stage, ensuring robust, end-to-end systems for real-world AI and ML. + +Today, Metaflow powers thousands of AI and ML experiences across a diverse array of companies, large and small, including Amazon, Doordash, Dyson, Goldman Sachs, Ramp, and [more](ADOPTERS.md). ## From prototype to production (and back) -Metaflow provides a simple, friendly API that covers foundational needs of ML, AI, and data science projects: +Metaflow provides a simple, friendly [API](https://docs.metaflow.org) that covers foundational needs of AI and ML systems: -1. 
[Rapid local prototyping](https://docs.metaflow.org/metaflow/basics), [support for notebooks](https://docs.metaflow.org/metaflow/visualizing-results), and [built-in experiment tracking and versioning](https://docs.metaflow.org/metaflow/client). -2. [Horizontal and vertical scalability to the cloud](https://docs.metaflow.org/scaling/remote-tasks/introduction), utilizing both CPUs and GPUs, and [fast data access](https://docs.metaflow.org/scaling/data). -3. [Managing dependencies](https://docs.metaflow.org/scaling/dependencies) and [one-click deployments to highly available production orchestrators](https://docs.metaflow.org/production/introduction). +1. [Rapid local prototyping](https://docs.metaflow.org/metaflow/basics), [support for notebooks](https://docs.metaflow.org/metaflow/managing-flows/notebook-runs), and built-in support for [experiment tracking, versioning](https://docs.metaflow.org/metaflow/client) and [visualization](https://docs.metaflow.org/metaflow/visualizing-results). +2. [Horizontal and vertical scalability in any cloud](https://docs.metaflow.org/scaling/remote-tasks/introduction), utilizing both CPUs and GPUs, with [fast data access](https://docs.metaflow.org/scaling/data) for running [massive embarrassingly parallel](https://docs.metaflow.org/metaflow/basics#foreach) as well as [gang-scheduled](https://docs.metaflow.org/scaling/remote-tasks/distributed-computing) compute workloads [reliably](https://docs.metaflow.org/scaling/failures) and [efficiently](https://docs.metaflow.org/scaling/checkpoint/introduction). +3. [Managing dependencies](https://docs.metaflow.org/scaling/dependencies) and [one-click deployments to highly available production orchestrators](https://docs.metaflow.org/production/introduction) with built in support for [reactive orchestration](https://docs.metaflow.org/production/event-triggering). 
+ +For full documentation, check out our [API Reference](https://docs.metaflow.org/api) or see our [Release Notes](https://github.com/Netflix/metaflow/releases) for the latest features and improvements. ## Getting started -Getting up and running is easy. If you don't know where to start, [Metaflow sandbox](https://outerbounds.com/sandbox) will have you running and exploring Metaflow in seconds. +Getting up and running is easy. If you don't know where to start, [Metaflow sandbox](https://outerbounds.com/sandbox) will have you running and exploring in seconds. -### Installing Metaflow in your Python environment +### Installing Metaflow -To install Metaflow in your local environment, you can install from [PyPi](https://pypi.org/project/metaflow/): +To install Metaflow in your Python environment from [PyPI](https://pypi.org/project/metaflow/): ```sh pip install metaflow ``` -Alternatively, you can also install from [conda-forge](https://anaconda.org/conda-forge/metaflow): +Alternatively, using [conda-forge](https://anaconda.org/conda-forge/metaflow): ```sh conda install -c conda-forge metaflow ``` -If you are eager to try out Metaflow in practice, you can start with the [tutorial](https://docs.metaflow.org/getting-started/tutorials). After the tutorial, you can learn more about how Metaflow works [here](https://docs.metaflow.org/metaflow/basics). + +Once installed, a great way to get started is by following our [tutorial](https://docs.metaflow.org/getting-started/tutorials). It walks you through creating and running your first Metaflow flow step by step. + +For more details on Metaflow’s features and best practices, check out: +- [How Metaflow works](https://docs.metaflow.org/metaflow/basics) +- [Additional resources](https://docs.metaflow.org/introduction/metaflow-resources) + +If you need help, don’t hesitate to reach out on our [Slack community](http://slack.outerbounds.co/)! 
+ ### Deploying infrastructure for Metaflow in your cloud @@ -42,28 +54,9 @@ While you can get started with Metaflow easily on your laptop, the main benefits and to [deploy to production-grade workflow orchestrators](https://docs.metaflow.org/production/introduction). To benefit from these features, follow this [guide](https://outerbounds.com/engineering/welcome/) to configure Metaflow and the infrastructure behind it appropriately. -## [Resources](https://docs.metaflow.org/introduction/metaflow-resources) - -### [Slack Community](http://slack.outerbounds.co/) -An active [community](http://slack.outerbounds.co/) of thousands of data scientists and ML engineers discussing the ins-and-outs of applied machine learning. - -### [Tutorials](https://outerbounds.com/docs/tutorials-index/) -- [Introduction to Metaflow](https://outerbounds.com/docs/intro-tutorial-overview/) -- [Natural Language Processing with Metaflow](https://outerbounds.com/docs/nlp-tutorial-overview/) -- [Computer Vision with Metaflow](https://outerbounds.com/docs/cv-tutorial-overview/) -- [Recommender Systems with Metaflow](https://outerbounds.com/docs/recsys-tutorial-overview/) -- And more advanced content [here](https://outerbounds.com/docs/tutorials-index/) - -### [Generative AI and LLM use cases](https://outerbounds.com/blog/?category=Foundation%20Models) -- [Infrastructure Stack for Large Language Models](https://outerbounds.com/blog/llm-infrastructure-stack/) -- [Parallelizing Stable Diffusion for Production Use Cases](https://outerbounds.com/blog/parallelizing-stable-diffusion-production-use-cases/) -- [Whisper with Metaflow on Kubernetes](https://outerbounds.com/blog/whisper-kubernetes/) -- [Training a Large Language Model With Metaflow, Featuring Dolly](https://outerbounds.com/blog/train-dolly-metaflow/) ## Get in touch -There are several ways to get in touch with us: -- [Slack Community](http://slack.outerbounds.co/) -- [Github Issues](https://github.com/Netflix/metaflow/issues) +We'd love to 
hear from you. Join our community [Slack workspace](http://slack.outerbounds.co/)! ## Contributing We welcome contributions to Metaflow. Please see our [contribution guide](https://docs.metaflow.org/introduction/contributing-to-metaflow) for more details. From 769b58e573b53f131b795d47703628df6324d247 Mon Sep 17 00:00:00 2001 From: savin Date: Thu, 6 Mar 2025 09:57:05 -0800 Subject: [PATCH 2/3] add stats from qcon talk --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index ea5baabe15b..67462d7274a 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Originally developed at [Netflix](https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9), Metaflow is designed to boost the productivity for research and engineering teams working on [a wide variety of projects](https://netflixtechblog.com/supporting-diverse-ml-systems-at-netflix-2d2e6b6d205d), from classical statistics to state-of-the-art deep learning and foundation models. Metaflow helps unify code, data, and compute at every stage, ensuring robust, end-to-end systems for real-world AI and ML. -Today, Metaflow powers thousands of AI and ML experiences across a diverse array of companies, large and small, including Amazon, Doordash, Dyson, Goldman Sachs, Ramp, and [more](ADOPTERS.md). +Today, Metaflow powers thousands of AI and ML experiences across a diverse array of companies, large and small, including Amazon, Doordash, Dyson, Goldman Sachs, Ramp, and [more](ADOPTERS.md). At Netflix, Metaflow supports over 3000 AI and ML systems, executing hundreds of millions of data-intensive, high-performance compute jobs and managing tens of petabytes of models and artifacts in its datastore across hundreds of users within Netflix's AI, ML, data science, and engineering teams. 
## From prototype to production (and back) From 0f947aef93474ae60f44744abf77d4516a18ae9f Mon Sep 17 00:00:00 2001 From: savin Date: Thu, 6 Mar 2025 10:04:03 -0800 Subject: [PATCH 3/3] language --- README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 67462d7274a..5088c2f0a54 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ # Metaflow -[Metaflow](https://metaflow.org) is a human-centric framework designed to help scientists and engineers **build and manage real-life AI and ML systems**. Serving teams of all sizes and scale, Metaflow streamlines the entire lifecycle — from prototyping in notebooks to production deployments — making it easier than ever to deliver robust solutions. +[Metaflow](https://metaflow.org) is a human-centric framework designed to help scientists and engineers **build and manage real-life AI and ML systems**. Serving teams of all sizes and scale, Metaflow streamlines the entire lifecycle, allowing rapid iteration from prototyping in notebooks to robust, maintainable production deployments — making it easier than ever to deliver robust systems. Originally developed at [Netflix](https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9), Metaflow is designed to boost the productivity for research and engineering teams working on [a wide variety of projects](https://netflixtechblog.com/supporting-diverse-ml-systems-at-netflix-2d2e6b6d205d), from classical statistics to state-of-the-art deep learning and foundation models. Metaflow helps unify code, data, and compute at every stage, ensuring robust, end-to-end systems for real-world AI and ML. 
@@ -10,12 +10,12 @@ Today, Metaflow powers thousands of AI and ML experiences across a diverse array ## From prototype to production (and back) -Metaflow provides a simple, friendly [API](https://docs.metaflow.org) that covers foundational needs of AI and ML systems: +Metaflow provides a simple and friendly Pythonic [API](https://docs.metaflow.org) that covers foundational needs of AI and ML systems: 1. [Rapid local prototyping](https://docs.metaflow.org/metaflow/basics), [support for notebooks](https://docs.metaflow.org/metaflow/managing-flows/notebook-runs), and built-in support for [experiment tracking, versioning](https://docs.metaflow.org/metaflow/client) and [visualization](https://docs.metaflow.org/metaflow/visualizing-results). -2. [Horizontal and vertical scalability in any cloud](https://docs.metaflow.org/scaling/remote-tasks/introduction), utilizing both CPUs and GPUs, with [fast data access](https://docs.metaflow.org/scaling/data) for running [massive embarrassingly parallel](https://docs.metaflow.org/metaflow/basics#foreach) as well as [gang-scheduled](https://docs.metaflow.org/scaling/remote-tasks/distributed-computing) compute workloads [reliably](https://docs.metaflow.org/scaling/failures) and [efficiently](https://docs.metaflow.org/scaling/checkpoint/introduction). -3. [Managing dependencies](https://docs.metaflow.org/scaling/dependencies) and [one-click deployments to highly available production orchestrators](https://docs.metaflow.org/production/introduction) with built in support for [reactive orchestration](https://docs.metaflow.org/production/event-triggering). +2. 
[Effortlessly scale horizontally and vertically in your cloud](https://docs.metaflow.org/scaling/remote-tasks/introduction), utilizing both CPUs and GPUs, with [fast data access](https://docs.metaflow.org/scaling/data) for running [massive embarrassingly parallel](https://docs.metaflow.org/metaflow/basics#foreach) as well as [gang-scheduled](https://docs.metaflow.org/scaling/remote-tasks/distributed-computing) compute workloads [reliably](https://docs.metaflow.org/scaling/failures) and [efficiently](https://docs.metaflow.org/scaling/checkpoint/introduction). +3. [Easily manage dependencies](https://docs.metaflow.org/scaling/dependencies) and [deploy with one click](https://docs.metaflow.org/production/introduction) to highly available production orchestrators with built-in support for [reactive orchestration](https://docs.metaflow.org/production/event-triggering). For full documentation, check out our [API Reference](https://docs.metaflow.org/api) or see our [Release Notes](https://github.com/Netflix/metaflow/releases) for the latest features and improvements.