diff --git a/docs/book/component-guide/step-operators/README.md b/docs/book/component-guide/step-operators/README.md
index b9fd4589530..320b777cf62 100644
--- a/docs/book/component-guide/step-operators/README.md
+++ b/docs/book/component-guide/step-operators/README.md
@@ -39,6 +39,35 @@ zenml step-operator flavor list
 
 You don't need to directly interact with any ZenML step operator in your code. As long as the step operator that you want to use is part of your active [ZenML stack](https://docs.zenml.io/user-guides/production-guide/understand-stacks), you can simply specify it in the `@step` decorator of your step.
 
+#### Using Step Operators in Steps
+
+- **Step Operator Parameter**: The `step_operator` parameter in the `@step` decorator is used to specify the step operator for executing the step. This allows the step to be executed in the environment provided by the step operator, such as AWS SageMaker.
+
+- **Example**: Here is an example of how to define a step with a step operator:
+
+```python
+from zenml import step
+
+@step(step_operator="my_sagemaker_operator")
+def my_training_step(...) -> ...:
+    # Step logic here
+    pass
+```
+
+- **Running the Pipeline**: Include the step in your pipeline and execute it. The specified step operator will handle the execution of the step in the designated environment.
+
+```python
+from zenml import pipeline
+
+@pipeline
+def my_pipeline():
+    my_training_step()
+
+my_pipeline()
+```
+
+This approach allows you to leverage the specialized compute resources and capabilities of the step operator for specific steps in your pipeline.
+
 ```python
 from zenml import step
 
diff --git a/docs/book/component-guide/step-operators/custom.md b/docs/book/component-guide/step-operators/custom.md
index 41de85cbfde..abd5412be3c 100644
--- a/docs/book/component-guide/step-operators/custom.md
+++ b/docs/book/component-guide/step-operators/custom.md
@@ -80,6 +80,24 @@ If you want to create your own custom flavor for a step operator, you can follow
 1. Create a class that inherits from the `BaseStepOperator` class and implement the abstract `launch` method. This method has two main responsibilities:
     * Preparing a suitable execution environment (e.g. a Docker image): The general environment is highly dependent on the concrete step operator implementation, but for ZenML to be able to run the step it requires you to install some `pip` dependencies. The list of requirements needed to successfully execute the step can be found via the Docker settings `info.pipeline.docker_settings` passed to the `launch()` method. Additionally, you'll have to make sure that all the source code of your ZenML step and pipeline are available within this execution environment.
     * Running the entrypoint command: Actually running a single step of a pipeline requires knowledge of many ZenML internals and is implemented in the `zenml.step_operators.step_operator_entrypoint_configuration` module. As long as your environment was set up correctly (see the previous bullet point), you can run the step using the command provided via the `entrypoint_command` argument of the `launch()` method.
+
+### Using Step Operators in Steps
+
+- **Step Operator Parameter**: The `step_operator` parameter in the `@step` decorator allows you to specify the step operator for executing the step. This enables the step to be executed in the environment provided by the step operator, such as a cloud service or a custom environment.
+
+- **Example**:
+  ```python
+  from zenml import step
+
+  @step(step_operator="my_custom_operator")
+  def my_training_step(...) -> ...:
+      # Step logic here
+      pass
+  ```
+  In this example, `my_custom_operator` is the name of the step operator registered for a specific environment.
+
+- **Running the Pipeline**: Include the step in your pipeline and execute it. The specified step operator will handle the execution of the step in the designated environment.
+
 2. If your step operator allows the specification of per-step resources, make sure to handle the resources defined on the step (`info.config.resource_settings`) that was passed to the `launch()` method.
 3. If you need to provide any configuration, create a class that inherits from the `BaseStepOperatorConfig` class adds your configuration parameters.
 4. Bring both the implementation and the configuration together by inheriting from the `BaseStepOperatorFlavor` class. Make sure that you give a `name` to the flavor through its abstract property.
diff --git a/docs/book/component-guide/step-operators/kubernetes.md b/docs/book/component-guide/step-operators/kubernetes.md
index 5a9d7aa190a..acccb87bba7 100644
--- a/docs/book/component-guide/step-operators/kubernetes.md
+++ b/docs/book/component-guide/step-operators/kubernetes.md
@@ -96,6 +96,25 @@ def trainer(...) -> ...:
 ZenML will build a Docker images which includes your code and use it to run your steps in Kubernetes. Check out [this page](https://docs.zenml.io/how-to/customize-docker-builds/) if you want to learn more about how ZenML builds these images and how you can customize them.
 {% endhint %}
 
+### Using Step Operators in Steps
+
+The `step_operator` parameter in the `@step` decorator allows you to specify the step operator for executing the step. This is particularly useful for executing steps in different environments, such as AWS SageMaker.
+
+Example:
+
+```python
+from zenml import step
+
+@step(step_operator="my_sagemaker_operator")
+def my_training_step(...) -> ...:
+    # Step logic here
+    pass
+```
+
+In this example, `my_sagemaker_operator` is the name of the step operator registered for AWS SageMaker. The step will be executed in the environment provided by the step operator.
+
+When running the pipeline, the specified step operator manages the execution environment for the step, leveraging specialized compute resources and capabilities.
+
 #### Interacting with pods via kubectl
 
 For debugging, it can sometimes be handy to interact with the Kubernetes pods directly via kubectl. To make this easier, we have added the following labels to all pods:
diff --git a/docs/book/component-guide/step-operators/sagemaker.md b/docs/book/component-guide/step-operators/sagemaker.md
index 10a3de1ebd3..f221e9259ef 100644
--- a/docs/book/component-guide/step-operators/sagemaker.md
+++ b/docs/book/component-guide/step-operators/sagemaker.md
@@ -33,6 +33,27 @@ To use the SageMaker step operator, we need:
 * An instance type that we want to execute our steps on. See [here](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html) for a list of available instance types.
 * (Optional) An experiment that is used to group SageMaker runs. Check [this guide](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments-create.html) to see how to create an experiment.
 
+### Using Step Operators in Steps
+
+To use step operators within steps, you can specify the step operator in the `@step` decorator. This allows the step to be executed in the environment provided by the step operator, such as AWS SageMaker.
+
+- **Step Operator Parameter**: The `step_operator` parameter in the `@step` decorator specifies which step operator to use for executing the step.
+
+- **Example**:
+
+  ```python
+  from zenml import step
+
+  @step(step_operator="my_sagemaker_operator")
+  def my_training_step(...) -> ...:
+      # Step logic here
+      pass
+  ```
+
+- **Running the Pipeline**: Include the step in your pipeline and execute it. The specified step operator will handle the execution of the step in the designated environment.
+
+This approach allows you to leverage the specialized compute resources and capabilities of the step operator for specific steps in your pipeline.
+
 There are two ways you can authenticate your orchestrator to AWS to be able to run steps on SageMaker:
 
 {% tabs %}
diff --git a/docs/book/component-guide/step-operators/spark-kubernetes.md b/docs/book/component-guide/step-operators/spark-kubernetes.md
index c3d74348c85..f5418e7616d 100644
--- a/docs/book/component-guide/step-operators/spark-kubernetes.md
+++ b/docs/book/component-guide/step-operators/spark-kubernetes.md
@@ -338,6 +338,31 @@ def step_on_spark(...) -> ...:
 ```
 {% endhint %}
 
+### Using Step Operators in Steps
+
+To use step operators within steps, you can specify the step operator in the `@step` decorator. This allows the step to be executed in the environment provided by the step operator, such as Spark on Kubernetes.
+
+#### Explanation of the `step_operator` Parameter
+
+The `step_operator` parameter in the `@step` decorator specifies which step operator to use for executing the step. This enhances the flexibility and scalability of your pipelines by allowing individual steps to leverage specialized compute resources and capabilities.
+
+#### Example Code Snippet
+
+```python
+from zenml import step
+
+@step(step_operator="spark_step_operator")
+def my_training_step(...) -> ...:
+    # Step logic here
+    pass
+```
+
+In this example, `spark_step_operator` is the name of the step operator registered for Spark on Kubernetes.
+
+#### Running the Pipeline
+
+Include the step in your pipeline and execute it. The specified step operator will handle the execution of the step in the designated environment.
+
 ### Additional configuration
 
 For additional configuration of the Spark step operator, you can pass `SparkStepOperatorSettings` when defining or running your pipeline. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-spark.html#zenml.integrations.spark) for a full list of available attributes and [this docs page](https://docs.zenml.io/how-to/pipeline-development/use-configuration-files/runtime-configuration) for more information on how to specify settings.
diff --git a/docs/book/component-guide/toc.md b/docs/book/component-guide/toc.md
index 751948d001c..b4ecdd28810 100644
--- a/docs/book/component-guide/toc.md
+++ b/docs/book/component-guide/toc.md
@@ -38,6 +38,7 @@
   * [Kubernetes](step-operators/kubernetes.md)
   * [Modal](step-operators/modal.md)
   * [Spark](step-operators/spark-kubernetes.md)
+  * [Using Step Operators in Steps](step-operators/using-step-operators-in-steps.md)
   * [Develop a Custom Step Operator](step-operators/custom.md)
 * [Experiment Trackers](experiment-trackers/README.md)
   * [Comet](experiment-trackers/comet.md)
diff --git a/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md b/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md
index 29592207dd5..96bfd4fce69 100644
--- a/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md
+++ b/docs/book/how-to/pipeline-development/use-configuration-files/runtime-configuration.md
@@ -43,6 +43,44 @@ Even though settings can be overridden at runtime, you can also specify _default
 
 This means that all pipelines that run using this experiment tracker use nested MLflow runs unless overridden by specifying settings for the pipeline at runtime.
 
+### Using Step Operators in Steps
+
+The `@step` decorator in ZenML allows you to specify a `step_operator` parameter, which is used to define the step operator responsible for executing the step. This is particularly useful for running steps in different environments, such as AWS SageMaker.
+
+#### Specifying Step Operators
+
+To specify a step operator, use the `step_operator` parameter in the `@step` decorator:
+
+```python
+from zenml import step
+
+@step(step_operator="")
+def my_step(...):
+    # Step logic here
+    pass
+```
+
+The value passed to `step_operator` is the name of the step operator you have registered, such as an AWS SageMaker step operator.
+
+#### Example: Using AWS SageMaker
+
+Here's an example of how to define a step with a SageMaker step operator:
+
+```python
+from zenml import step
+
+@step(step_operator="my_sagemaker_operator")
+def trainer(...) -> ...:
+    """Train a model."""
+    # This step will be executed in SageMaker.
+```
+
+#### Running the Pipeline
+
+When you run your pipeline, the specified step operator will manage the execution environment for the step, allowing you to leverage specialized compute resources and capabilities.
+
+This approach enhances the flexibility and scalability of your pipelines by enabling the execution of individual steps in different environments.
+
 ### Using the right key for Stack-component-specific settings
 
 When specifying stack-component-specific settings, a key needs to be passed. This key should always correspond to the pattern: `` or `.`. If you specify just the category (e.g. `step_operator` or `orchestrator`), ZenML will try to apply those settings to whatever flavor of component is in your stack when running a pipeline. If your settings don't apply to this flavor, they will be ignored.
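
The key rule in the last context paragraph can be illustrated with a small plain-Python model. This is only a sketch under stated assumptions, not ZenML's implementation: the helper `applicable_settings` is hypothetical, the flavor-qualified key is assumed to take the form `category.flavor`, the precedence of a flavor-qualified key over a bare category key is an assumption, and the setting names and values are placeholders. It models key matching only, not how ZenML validates or discards individual settings.

```python
from typing import Any, Dict


def applicable_settings(
    settings: Dict[str, Dict[str, Any]],
    category: str,
    flavor: str,
) -> Dict[str, Any]:
    """Return the settings that match one stack component.

    Hypothetical helper: a bare category key matches any flavor, while a
    "category.flavor" key matches only the named flavor.
    """
    merged: Dict[str, Any] = {}
    # Bare category key, e.g. "step_operator": matches whatever flavor is active.
    merged.update(settings.get(category, {}))
    # Flavor-qualified key, e.g. "step_operator.sagemaker": matches that flavor only.
    # Assumption: the more specific key wins when both define the same setting.
    merged.update(settings.get(f"{category}.{flavor}", {}))
    return merged


# Placeholder settings; the names and values are illustrative only.
settings = {
    "step_operator": {"timeout": 600},
    "step_operator.sagemaker": {"instance_type": "ml.m5.large"},
    "orchestrator.kubeflow": {"user_namespace": "team-a"},
}

sagemaker_settings = applicable_settings(settings, "step_operator", "sagemaker")
kubernetes_settings = applicable_settings(settings, "step_operator", "kubernetes")
```

With a SageMaker step operator in the stack, both the bare and the flavor-qualified entries match. With a Kubernetes step operator, only the bare `step_operator` entry matches and the SageMaker-specific entry is skipped.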