
docs: add new quickstart for job launching #404


Closed
wants to merge 1 commit into from
13 changes: 7 additions & 6 deletions README.md
@@ -8,11 +8,16 @@
# TorchX


TorchX is a library containing standard DSLs for authoring and running PyTorch
related components for an E2E production ML pipeline.
TorchX is a universal job launcher for PyTorch applications.
TorchX is designed for fast iteration during training and research, with support
for E2E production ML pipelines when you're ready.

For the latest documentation, please refer to our [website](https://pytorch.org/torchx).

## Quickstart

See the [quickstart guide](https://pytorch.org/torchx/latest/quickstart.html).


## Requirements
TorchX SDK (torchx):
@@ -58,10 +63,6 @@ $ pip install -e git+https://github.com/pytorch/torchx.git#egg=torchx
$ pip install -e git+https://github.com/pytorch/torchx.git#egg=torchx[kubernetes]
```

## Quickstart

See the [quickstart guide](https://pytorch.org/torchx/latest/quickstart.html).

## Contributing

We welcome PRs! See the [CONTRIBUTING](CONTRIBUTING.md) file.
3 changes: 3 additions & 0 deletions docs/source/.gitignore
@@ -0,0 +1,3 @@
.torchxconfig
Dockerfile*
*.py
4 changes: 3 additions & 1 deletion docs/source/conf.py
@@ -361,4 +361,6 @@ def handle_item(fieldarg, content):

<div id="is-nbsphinx"></div>
"""
# nbsphinx_execute = 'never'

if os.environ.get("SKIP_NB"):
nbsphinx_execute = "never"
149 changes: 149 additions & 0 deletions docs/source/custom_components.md
@@ -0,0 +1,149 @@
---
jupyter:
jupytext:
text_representation:
extension: .md
format_name: markdown
format_version: '1.1'
jupytext_version: 1.1.0
kernelspec:
display_name: Python 3
language: python
name: python3
---

# Custom Components

This guide shows how to build a simple app and a custom component spec,
and launch it via two different schedulers.

See the [Quickstart Guide](quickstart.md) for installation and basic usage.

## Hello World

Let's start by writing a simple "Hello World" Python app. This is just a
normal Python program and can contain anything you'd like.

<div class="admonition note">
<div class="admonition-title">Note</div>
This example uses the Jupyter Notebook `%%writefile` magic to create local files
for demonstration purposes. In normal usage you would keep these as standalone files.
</div>

```python
%%writefile my_app.py

import sys
import argparse

def main(user: str) -> None:
print(f"Hello, {user}!")

if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Hello world app"
)
parser.add_argument(
"--user",
type=str,
help="the person to greet",
required=True,
)
args = parser.parse_args(sys.argv[1:])

main(args.user)
```
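Before wrapping the script in a component, it can be handy to sanity-check the argument parsing on its own. The following is a minimal standalone sketch that rebuilds the same parser as `my_app.py`:

```python
import argparse

# Rebuild the same parser as my_app.py to confirm the CLI contract.
parser = argparse.ArgumentParser(description="Hello world app")
parser.add_argument(
    "--user",
    type=str,
    help="the person to greet",
    required=True,
)

# Parse an explicit argv list instead of sys.argv for testing purposes.
args = parser.parse_args(["--user", "your name"])
print(f"Hello, {args.user}!")  # → Hello, your name!
```

Passing an explicit list to `parse_args` mirrors what `torchx run` will eventually do when it forwards `--user` to the entrypoint.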

Now that we have an app, we can write a component file for it. The component
function lets us reuse and share our app in a user-friendly way.

We can use this component from the `torchx` CLI or programmatically as part of a
pipeline.

```python
%%writefile my_component.py

import torchx.specs as specs

def greet(user: str, image: str = "my_app:latest") -> specs.AppDef:
return specs.AppDef(
name="hello_world",
roles=[
specs.Role(
name="greeter",
image=image,
entrypoint="python",
args=[
"-m", "my_app",
"--user", user,
],
)
],
)
```

We can execute our component via `torchx run`. The
`local_cwd` scheduler executes the component relative to the current directory.

```sh
torchx run --scheduler local_cwd my_component.py:greet --user "your name"
```

If we want to run in other environments, we can build a Docker image so that we
can run our component in Docker-enabled environments such as Kubernetes, or via
the local Docker scheduler.

<div class="admonition note">
<div class="admonition-title">Note</div>
This requires Docker to be installed and won't work in environments such as Google
Colab. If you have not done so already, follow the install instructions at
[https://docs.docker.com/get-docker/](https://docs.docker.com/get-docker/).
</div>

```python
%%writefile Dockerfile.custom

FROM ghcr.io/pytorch/torchx:0.1.0rc1

ADD my_app.py .
```

Once we have the Dockerfile created, we can build our Docker image.

```sh
docker build -t my_app:latest -f Dockerfile.custom .
```

We can then launch it via the local Docker scheduler.

```sh
torchx run --scheduler local_docker my_component.py:greet --image "my_app:latest" --user "your name"
```

If you have a Kubernetes cluster you can use the [Kubernetes scheduler](schedulers/kubernetes.rst) to launch
this on the cluster instead.


<!-- #md -->
```sh
$ docker push my_app:latest
$ torchx run --scheduler kubernetes my_component.py:greet --image "my_app:latest" --user "your name"
```
<!-- #endmd -->


## Builtins

TorchX also provides a number of builtin components with prebuilt images. You can
discover them via:

```sh
torchx builtins
```

You can use these from the CLI, from a pipeline, or programmatically, just like
any other component.

```sh
torchx run utils.echo --msg "Hello :)"
```
47 changes: 21 additions & 26 deletions docs/source/index.rst
@@ -3,37 +3,33 @@
TorchX
==================

TorchX is an SDK for quickly building and deploying ML applications from R&D to production.
It offers various builtin components that encode MLOps best practices and make advanced
features like distributed training and hyperparameter optimization accessible to all.
Users can get started with TorchX with no added setup cost since it supports popular
ML schedulers and pipeline orchestrators that are already widely adopted and deployed
in production.
TorchX is a universal job launcher for PyTorch applications.
TorchX is designed for fast iteration during training and research, with support
for E2E production ML pipelines when you're ready.

No two production environments are the same. To comply with various use cases, TorchX's
core APIs allow tons of customization at well-defined extension points so that even the
most unique applications can be serviced without customizing the whole vertical stack.
**GETTING STARTED?** Follow the :ref:`quickstart guide<quickstart:Quickstart>`.


**GETTING STARTED?** First learn the :ref:`basic concepts<basics:Basic Concepts>` and
follow the :ref:`quickstart guide<quickstart:Quickstart - Custom Components>`.

.. image:: torchx_index_diag.png

In 1-2-3
-----------------

**01 DEFINE OR CHOOSE** Start by :ref:`writing a component<components/overview:Overview>` -- a python
function that returns an AppDef object for your application. Or you can choose one of the
:ref:`builtin components<Components>`.
Step 1. Install

.. code-block:: shell

    pip install torchx[dev]

**02 RUN AS A JOB** Once you've defined or chosen a component, you can :ref:`run it<runner:torchx.runner>`
by submitting it as a job in one of the supported :ref:`Schedulers<Schedulers>`. TorchX supports several
popular ones, such as Kubernetes and SLURM, out of the box.

Step 2. Run Locally

.. code-block:: shell

    torchx run --scheduler local_cwd utils.python --script my_app.py "Hello, localhost!"

**03 CONVERT TO PIPELINE** In production, components are often run as a workflow (aka pipeline).
TorchX components can be converted to pipeline stages by passing them through the :py:mod:`torchx.pipelines`
adapter. :ref:`Pipelines<Pipelines>` lists the pipeline orchestrators supported out of the box.
Step 3. Run Remotely

.. code-block:: shell

    torchx run --scheduler kubernetes utils.python --script my_app.py "Hello, Kubernetes!"


Documentation
@@ -43,13 +39,12 @@ Documentation
:maxdepth: 1
:caption: Usage

basics
quickstart.md
cli

basics
runner.config

advanced
custom_components.md


Works With
19 changes: 6 additions & 13 deletions docs/source/pipelines.rst
@@ -4,19 +4,12 @@ torchx.pipelines
.. automodule:: torchx.pipelines
.. currentmodule:: torchx.pipelines

torchx.pipelines.kfp
#####################
All Pipelines
~~~~~~~~~~~~~~~~

.. image:: pipeline_kfp_diagram.png
.. toctree::
:maxdepth: 1
:glob:

.. automodule:: torchx.pipelines.kfp
.. currentmodule:: torchx.pipelines.kfp
pipelines/*

.. currentmodule:: torchx.pipelines.kfp.adapter

.. autofunction:: container_from_app
.. autofunction:: resource_from_app
.. autofunction:: component_from_app
.. autofunction:: component_spec_from_app

.. autoclass:: ContainerFactory
20 changes: 18 additions & 2 deletions docs/source/pipelines/kfp.rst
@@ -2,7 +2,23 @@ Kubeflow Pipelines
======================

TorchX provides an adapter to run TorchX components as part of Kubeflow
Pipelines. See :ref:`examples_pipelines/index:KubeFlow Pipelines Examples` and
the :mod:`torchx.pipelines.kfp` for API reference.
Pipelines. See :ref:`examples_pipelines/index:KubeFlow Pipelines Examples`.

.. image:: kfp_diagram.jpg

torchx.pipelines.kfp
#####################

.. image:: pipeline_kfp_diagram.png

.. automodule:: torchx.pipelines.kfp
.. currentmodule:: torchx.pipelines.kfp

.. currentmodule:: torchx.pipelines.kfp.adapter

.. autofunction:: container_from_app
.. autofunction:: resource_from_app
.. autofunction:: component_from_app
.. autofunction:: component_spec_from_app

.. autoclass:: ContainerFactory