@kubernetes step decorator #488

@oavdeev oavdeev commented Apr 27, 2021

Note that this PR is built on top of #504, since it relies on some mflog refactorings introduced there.

What does this do

Introduces a new decorator, @kubernetes, that allows you to run specific steps on Kubernetes as Jobs.

What works

The basics work. You can decorate steps with @kubernetes and they'll run on a k8s cluster. --with kubernetes works too. Logging works through mflog.

You can specify a Docker image as @kubernetes(image="python:3.7"). If the image is not specified, it uses METAFLOW_KUBERNETES_IMAGE_URI. If that is missing too, it'll use the default Python image. This behavior is essentially the same as @batch today.
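
The fallback order described above can be sketched as a small helper. This is an illustration, not the PR's actual code: `resolve_image` is a hypothetical name, and the last fallback (an official `python` image tagged with the local interpreter's major.minor version) is an assumption about what "default Python image" means.

```python
import os
import platform


def resolve_image(decorator_image=None, env=None):
    """Pick the container image for a step (hypothetical helper)."""
    env = os.environ if env is None else env
    # 1. An image passed explicitly to the decorator wins.
    if decorator_image:
        return decorator_image
    # 2. Otherwise use the configured default, if any.
    configured = env.get("METAFLOW_KUBERNETES_IMAGE_URI")
    if configured:
        return configured
    # 3. Assumed final fallback: a python image matching the local interpreter.
    major, minor = platform.python_version_tuple()[:2]
    return "python:%s.%s" % (major, minor)
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.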

This also introduces a dependency on the Kubernetes Python client (only needed if you use @kubernetes).

Configuration and auth

K8S creds

It will use the k8s creds that are configured in your environment. That is, this doesn't do anything special for k8s auth; if kubectl works in your environment, this will too.
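
With the Kubernetes Python client, one common way to pick up whatever credentials the environment provides is to try the in-cluster service account first and then fall back to the same kubeconfig that kubectl reads. A minimal sketch of that pattern, assuming the `kubernetes` package is installed; `load_k8s_config` is a hypothetical helper and not necessarily what this PR does:

```python
def load_k8s_config():
    """Load Kubernetes client configuration from the environment."""
    # Imported lazily so the dependency is only needed when this is called.
    from kubernetes import config

    try:
        # Works when running inside a pod with a mounted service account.
        config.load_incluster_config()
        return "in-cluster"
    except config.ConfigException:
        # Falls back to $KUBECONFIG or ~/.kube/config, as kubectl does.
        config.load_kube_config()
        return "kubeconfig"
```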

AWS creds

The code still uses S3, and assumes that AWS auth is somehow taken care of via your cluster configuration. That is, AWS creds are available either via the instance profile, or via IRSA.
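
As a rough way to check which of these mechanisms a pod is relying on, one can look at the environment: IRSA injects `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` into the container, while instance-profile credentials leave no environment variables behind. The helper below is a hypothetical debugging aid, not part of the PR, and its return labels are made up for illustration.

```python
import os


def aws_credential_source(env=None):
    """Guess how AWS credentials are being supplied (hypothetical helper)."""
    env = os.environ if env is None else env
    # IRSA sets both of these variables in the pod.
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "irsa"
    # Static keys passed directly through the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "static-keys"
    # Nothing in the environment: likely the instance profile, or no creds at all.
    return "instance-profile-or-none"
```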

What's left

  • support @timeout
  • support @resources
  • support @environment
  • make sure interrupting/cancelling a step works
  • more testing for edge cases and failure modes
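
For the @timeout and @resources items, one possible mapping onto a Job manifest in plain-dict form is sketched below. It is only a guess at a design, not something this PR implements: `apply_limits` and its parameters are hypothetical, and it assumes memory is given in megabytes. `activeDeadlineSeconds` and container `resources.requests` are standard Kubernetes Job and container fields.

```python
import copy


def apply_limits(job, cpu=None, memory_mb=None, timeout_seconds=None):
    """Return a copy of a dict Job manifest with requests and a deadline set."""
    job = copy.deepcopy(job)  # leave the caller's manifest untouched
    container = job["spec"]["template"]["spec"]["containers"][0]

    requests = {}
    if cpu is not None:
        requests["cpu"] = str(cpu)
    if memory_mb is not None:
        # "M" is the Kubernetes quantity suffix for megabytes (10^6 bytes).
        requests["memory"] = "%dM" % memory_mb
    if requests:
        container.setdefault("resources", {})["requests"] = requests

    if timeout_seconds is not None:
        # Kubernetes terminates the Job once this many seconds have elapsed.
        job["spec"]["activeDeadlineSeconds"] = int(timeout_seconds)
    return job
```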

@oavdeev oavdeev changed the title from "@kubernetes step decorator" to "[DRAFT] @kubernetes step decorator" Apr 27, 2021
@oavdeev oavdeev force-pushed the k8s-executor branch 4 times, most recently from 0742ea7 to 7ff6be8 April 28, 2021 00:48
@savingoyal savingoyal self-requested a review April 28, 2021 02:18
Comment on lines 43 to 92
from kubernetes import client

# Configure the Pod template container
container = client.V1Container(
    name=job_name,
    image=image,
    command=command,
    env=[client.V1EnvVar(name=k, value=v) for k, v in env.items()],
)
# Create and configure the Pod template
template = client.V1PodTemplateSpec(
    metadata=client.V1ObjectMeta(labels={"app": "pi"}),
    spec=client.V1PodSpec(restart_policy="Never", containers=[container]),
)
# Create the Job specification
spec = client.V1JobSpec(
    template=template,
    backoff_limit=1,  # retries handled by Metaflow
    ttl_seconds_after_finished=600,
)

# Instantiate the Job object
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name=job_name, labels=labels),
    spec=spec,
)
Contributor

The k8s Python client also accepts specifications written as plain dictionaries, in which case they look identical to the k8s JSON spec. The job could be created like this:

job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {
        "name": job_name,
        "labels": labels
    },
    "spec": {
        "template": {
            "metadata": {
                "labels": {
                    "app": "pi",
                },
            },
            "spec": {
                "containers": [
                    {
                        "name": job_name,
                        "image": image,
                        "command": command,
                        "env": [{"name": k, "value": v} for k, v in env.items()],
                    },
                ],
                "restartPolicy": "Never"
            }
        },
        "backoffLimit": 1,      # retries handled by Metaflow
        "ttlSecondsAfterFinished": 600
    }
}
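
A dictionary like the one above can be handed to the client as the request body without converting it to model objects first. A minimal sketch, assuming the `kubernetes` package is installed and a local kubeconfig is available; `submit_job` and the `"default"` namespace are illustrative choices, not taken from the PR:

```python
def submit_job(job, namespace="default"):
    """Create a Job from a plain-dict manifest (hypothetical helper)."""
    # Imported lazily so the dependency is only needed when this is called.
    from kubernetes import client, config

    # Use the same credentials kubectl would use.
    config.load_kube_config()
    # The body may be either a V1Job object or a dict in JSON-spec form.
    return client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)
```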

Collaborator Author

Nice, didn't know that. One thing I like about the client.* API is that VSCode/mypy complains when I make a typo in a parameter name. For example, I always forget that "template": {...} bit when writing YAML/JSON.

@oavdeev oavdeev force-pushed the k8s-executor branch 2 times, most recently from 502c6c4 to 9e96c68 May 14, 2021 00:43
@oavdeev oavdeev marked this pull request as ready for review May 14, 2021 02:17
@savingoyal savingoyal linked an issue Aug 3, 2021 that may be closed by this pull request
oavdeev commented Sep 24, 2021

closing in favor of #644

@oavdeev oavdeev closed this Sep 24, 2021
Development

Successfully merging this pull request may close these issues.

Support for Kubernetes (with Argo)