Releases: Netflix/metaflow
2.9.6
Features
AWS Step Functions state machines can now be deleted through the CLI
This release introduces the command step-functions delete for deleting deployed state machines.
For a regular flow
python flow.py step-functions delete
For another user's project branch
Comment out the @project decorator from the flow file, as we do not allow using --name with projects.
python project_flow.py step-functions --name project_a.user.saikonen.ProjectFlow delete
For a production or custom branch flow
python project_flow.py --production step-functions delete
# or
python project_flow.py --branch custom step-functions delete
Add --authorize PRODUCTION_TOKEN to the command if you do not have the correct production token locally.
Improvements
Fixes a bug in the S3 server-side encryption feature with some S3-compliant providers.
This release fixes an issue with the S3 server-side encryption support where some S3-compliant providers do not include the expected encryption method in the response payload. This bug specifically affected regular operation when using MinIO.
Fixes support for --with environment in Airflow
Fixes a bug with the Airflow support for environment variables, where the env values set in the environment decorator could get overwritten.
What's Changed
- [bugfix] support --with environment in Airflow by @valayDave in #1459
- feat: sfn delete workflow (with prod token validation and messaging) by @stevenhoelscher, @saikonen in #1379
- [bugfix]: Optional check for encryption in s3op response by @valayDave in #1460
- Bump version to 2.9.6 by @saikonen in #1461
Full Changelog: 2.9.5...2.9.6
2.9.5
Features
Ability to choose server side encryption method for S3 uploads
You can now choose which server-side encryption method to use for S3 uploads by setting the environment variable METAFLOW_S3_SERVER_SIDE_ENCRYPTION to an appropriate value, for example aws:kms or AES256.
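For illustration, here is a minimal sketch (not part of the release notes) of enabling SSE for uploads made through the metaflow.S3 client; the bucket, prefix, and key are placeholders, and the environment variable is assumed to be set before Metaflow is imported so that it is picked up by Metaflow's configuration:
import os
os.environ["METAFLOW_S3_SERVER_SIDE_ENCRYPTION"] = "aws:kms"  # or "AES256"

from metaflow import S3

# objects uploaded through metaflow.S3 now request the configured encryption method
with S3(s3root="s3://my-bucket/my-prefix/") as s3:  # placeholder bucket/prefix
    s3.put("example-key", "hello world")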
Improvements
Fixes double quotes with Parameters on Argo Workflows
This release fixes an issue where using parameters on Argo Workflows caused the values to be unnecessarily quoted.
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- feat: ability to use ServerSideEncryption for S3 uploads by @zendesk-klross in #1436
- fix quoting issue with argo by @savingoyal in #1456
- Bump version to 2.9.5 by @saikonen in #1457
New Contributors
- @zendesk-klross made their first contribution in #1436
Full Changelog: 2.9.4...2.9.5
2.9.4
Improvements
Fix using email addresses as usernames for Argo Workflows
Using an email address as the username when deploying with a @project decorator to Argo Workflows is now possible. This release fixes an issue with some generated resources containing characters that are not permitted in names of Argo Workflow resources.
The secrets decorator now supports assuming roles
This release adds the capability to assume specific roles when accessing secrets with the @secrets decorator. The role for accessing a secret can be defined in the following ways:
As a global default
Set the METAFLOW_DEFAULT_SECRET_ROLE environment variable; this role will be assumed when accessing any secret specified in the decorator.
As a global option in the decorator
This will assume the role secret-iam-role for accessing all of the secrets in the sources list.
@secrets(
    sources=["first-secret-source", "second-secret-source"],
    role="secret-iam-role"
)
Or on a per-secret basis
You can also assume a different role depending on the secret in question
@secrets(
    sources=[
        {"type": "aws-secrets-manager", "id": "first-secret-source", "role": "first-secret-role"},
        {"type": "aws-secrets-manager", "id": "second-secret-source", "role": "second-secret-role"}
    ]
)
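To put the pieces together, here is a minimal, hypothetical flow sketch; the secret name, role, and the DB_PASSWORD key are placeholders, and it assumes the decorator's default behavior of exposing the fetched secret's keys as environment variables inside the step:
import os
from metaflow import FlowSpec, secrets, step

class SecretRoleFlow(FlowSpec):

    # "first-secret-source" and "secret-iam-role" are placeholders
    @secrets(sources=["first-secret-source"], role="secret-iam-role")
    @step
    def start(self):
        # keys of the fetched secret are available as environment variables
        print("DB_PASSWORD is set:", "DB_PASSWORD" in os.environ)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    SecretRoleFlow()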
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- [OBP] support assuming roles to read secrets by @jackie-ob in #1418
- fix two docstrings that make API docs unhappy by @tuulos in #1441
- Properly validate a config value against the type of its default by @romain-intel in #1426
- Add additional options to @trigger and @trigger_on_finish by @romain-intel in #1398
- Wrap errors importing over the escape hatch as ImportError by @romain-intel in #1446
- Setting default time for files in code package to Dec 3, 2019 by @pjoshi30 in #1445
- Fix issue with handling of exceptions in the escape hatch by @romain-intel in #1444
- fix: support email in argo workflow names by @saikonen in #1448
- fix: email naming support for argo events by @saikonen in #1450
- bump version to 2.9.4 by @saikonen in #1451
Full Changelog: 2.9.3...2.9.4
2.9.3
Improvements
Ignore duplicate Metaflow Extensions packages
Duplicate Metaflow Extensions packages were not properly ignored in all cases. This release fixes this and will allow the loading of extensions even if they are present in duplicate form in your sys.path.
Fix package leaks for the environment escape
In some cases, packages from the outside environment (non Conda) could leak into the Conda environment when using the environment escape functionality. This release addresses this issue and ensures that no spurious packages are imported in the Conda environment.
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- Update README.md by @savingoyal in #1431
- Add labels and fix argo by @dhpollack in #1360
- Update KubernetesDecorator class docstring to include persistent_volume_claims by @tfurmston in #1435
- Properly ignore a duplicate metaflow extension package in sys.path by @romain-intel in #1437
- Fix an issue with the escape hatch that could cause outside packages to "leak" by @romain-intel in #1439
- Bump version to 2.9.3 by @romain-intel in #1440
Full Changelog: 2.9.2...2.9.3
2.9.2
Features
Introduce support for image pull policy for @kubernetes
With this release, Metaflow users can specify image pull policy for their workloads through the @kubernetes decorator for Metaflow tasks.
@kubernetes(image='foo:tag', image_pull_policy='Always')  # Allowed values are Always, IfNotPresent, Never
@step
def train(self):
    ...
    ...
If an image pull policy is not specified and the container image tag is :latest or no tag is specified, the image pull policy is automatically set to Always.
If an image pull policy is not specified and the container image tag is anything other than :latest, the image pull policy is automatically set to IfNotPresent.
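As a quick illustration of the defaults described above, a minimal flow sketch (image names and tags are placeholders):
from metaflow import FlowSpec, kubernetes, step

class PullPolicyFlow(FlowSpec):

    # pinned tag, no explicit policy -> IfNotPresent
    @kubernetes(image='foo:1.2.3')
    @step
    def start(self):
        self.next(self.end)

    # :latest (or no tag), no explicit policy -> Always
    @kubernetes(image='foo:latest')
    @step
    def end(self):
        pass

if __name__ == '__main__':
    PullPolicyFlow()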
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- introduce support for intra-cluster webhook url by @savingoyal in #1417
- add and improve docstrings for event-triggering by @tuulos in #1419
- Update readme by @emattia in #1397
- Update README.md by @savingoyal in #1422
- fix includefile for argo-workflows by @savingoyal in #1428
- feature: support configuring image pull policy for Kubernetes and Argo Workflows by @saikonen in #1427
- fix error message by @savingoyal in #1429
- Update to 2.9.2 by @savingoyal in #1430
Full Changelog: 2.9.1...2.9.2
2.9.1
Features
Introduce Slack notifications support for workflows running on Argo Workflows
With this release, Metaflow users can get notified on Slack when their workflows succeed or fail on Argo Workflows. Using this feature is quite straightforward:
- Follow these instructions on Slack to set up incoming webhooks for your Slack workspace.
- You should now have a webhook URL that Slack provides. Here is an example webhook:
https://hooks.slack.com/services/T0XXXXXXXXX/B0XXXXXXXXX/qZXXXXXX
- To enable notifications on Slack when your Metaflow flow running on Argo Workflows succeeds or fails, deploy it using the --notify-on-error or --notify-on-success flags:
python flow.py argo-workflows create --notify-on-error --notify-on-success --notify-slack-webhook-url <slack-webhook-url>
- You can also set METAFLOW_ARGO_WORKFLOWS_CREATE_NOTIFY_SLACK_WEBHOOK_URL=<slack-webhook-url> in your environment instead of specifying --notify-slack-webhook-url on the CLI every time.
- Next time your workflow succeeds or fails on Argo Workflows, you will get a helpful notification on Slack.
FAQ
I deployed my workflow following the instructions above, but I haven’t received any notifications yet?
This issue may very well happen if you are running Kubernetes v1.24 or newer.
Since v1.24, Kubernetes stopped automatically creating a secret for every serviceAccount. Argo Workflows relies on the existence of these secrets to run lifecycle hooks responsible for the emission of these notifications.
Follow these steps to explicitly create a secret for the service account that is responsible for executing Argo Workflows steps:
- Run the following command, replacing service-account.name with the serviceAccount in your deployment. Also change the name of the secret to correctly reflect the name of the serviceAccount for which this secret is created.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: default-sa-token # change according to the name of the sa
  annotations:
    kubernetes.io/service-account.name: default # replace with your sa
type: kubernetes.io/service-account-token
EOF
- Edit the serviceAccount object to add the name of the above secret to it. You can use kubectl edit for this. The serviceAccount yaml should look like the following:
$ kubectl edit sa default -n mynamespace
...
apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: "2023-05-05T20:58:58Z"
  name: default
  namespace: jobs-default
  resourceVersion: "6739507"
  uid: 4a708eff-d6ba-4dd8-80ee-8fb3c4c1e1c7
secrets:
- name: default-sa-token # should match the secret above
- That’s it! Try executing your workflow again on Argo Workflows. If you are still running into issues, reach out to us!
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- feature: add argo events environment variables to metaflow configure kubernetes by @saikonen in #1405
- handle whitespaces in argo events parameters by @savingoyal in #1408
- Add back comment for argo workflows by @savingoyal in #1409
- Support ArgoEvent object with @kubernetes by @savingoyal in #1410
- Print workflow template location as part of argo-workflows create by @savingoyal in #1411
Full Changelog: 2.9.0...2.9.1
2.9.0
Features
Introduce support for composing multiple interrelated workflows through external events
With this release, Metaflow users can architect sequences of workflows that conduct data across teams, all the way from ETL and data warehouse to final ML outputs. Detailed documentation and a blog post to follow very shortly! Keep watching this space.
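In the meantime, here is a minimal, hedged sketch of the pattern, assuming the @trigger flow-level decorator and the ArgoEvent helper that ship with this release, and an Argo Workflows deployment wired up for Argo Events; the event name data_updated is a placeholder:
from metaflow import FlowSpec, step, trigger

# once deployed to Argo Workflows, this flow starts whenever the named event is published
@trigger(event="data_updated")
class DownstreamFlow(FlowSpec):

    @step
    def start(self):
        print("triggered by an external event")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    DownstreamFlow()
An upstream system, for example the tail end of an ETL job, could then publish the event with something along these lines:
from metaflow.integrations import ArgoEvent

ArgoEvent(name="data_updated").publish(payload={"date": "2023-05-01"})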
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- feature: add argo events environment variables to metaflow configure kubernetes by @saikonen in #1405
- handle whitespaces in argo events parameters by @savingoyal in #1408
- Add back comment for argo workflows by @savingoyal in #1409
- Support ArgoEvent object with @kubernetes by @savingoyal in #1410
- Print workflow template location as part of argo-workflows create by @savingoyal in #1411
Full Changelog: 2.8.6...2.9.0
2.8.6
Features
Introduce support for persistent volume claims for executions on Kubernetes
With this release, Metaflow users can attach existing persistent volume claims to Metaflow tasks running on a Kubernetes cluster.
To use this functionality, simply list your persistent volume claims and mount points using the persistent_volume_claims arg in the @kubernetes decorator, e.g. @kubernetes(persistent_volume_claims={"pvc-claim-name": "mount-point", "another-pvc-claim-name": "another-mount-point"}).
Here is an example:
from metaflow import FlowSpec, step, kubernetes, current
import os

class MountPVCFlow(FlowSpec):

    @kubernetes(persistent_volume_claims={"test-pvc-feature-claim": "/mnt/testvol"})
    @step
    def start(self):
        print('testing PVC')
        mount = "/mnt/testvol"
        file = f"zeros_run_{current.run_id}"
        with open(os.path.join(mount, file), "w+") as f:
            f.write("\0" * 50)
            f.flush()
        print(f"mount folder contents: {os.listdir(mount)}")
        self.next(self.end)

    @step
    def end(self):
        print("finished")

if __name__ == "__main__":
    MountPVCFlow()
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- handle bools properly for argo-workflows task runtime cli by @savingoyal in #1395
- fix: migrate R support to use importlib by @saikonen in #1396
- Add configuration of username from metaflow_config.py by @tfurmston in #1400
- feature: add Kubernetes support for PVC mounts by @saikonen in #1402
- Update version to 2.8.6 by @savingoyal in #1404
Full Changelog: 2.8.5...2.8.6
2.8.5
Improvements
Make pickled Metaflow client objects accessible across namespaces
The previous release inadvertently disabled a sequence of user operations that worked previously:
- Pickle a Metaflow object
- Access this Metaflow object in a different namespace
- Access a child or parent object of this object
This release restores the previous behavior.
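As a concrete, hypothetical sketch of that sequence (the flow name, run id, and namespace are placeholders):
import pickle
from metaflow import Run, namespace

blob = pickle.dumps(Run("HelloFlow/42"))   # 1. pickle a Metaflow client object

namespace("user:someone-else")             # 2. switch to a different namespace
run = pickle.loads(blob)                   #    and unpickle the object there

print(run["start"].task)                   # 3. child objects are accessible again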
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- feature: add sanitization for batch tags by @saikonen in #1376
- fix: make metaflow config aware of profile environment variable by @saikonen in #1391
- Fix an issue introduced in 2.8.4 that prevented pickled MetaflowObjec… by @romain-intel in #1392
- Updating version to 2.8.5 by @pjoshi30 in #1393
Full Changelog: 2.8.4...2.8.5
2.8.4
Features
Introduce support for tmpfs for executions on Kubernetes
It is typical for the user code in a Metaflow step to download assets from an object store, e.g. S3. Examples include serialized models and raw input data, such as unstructured media or structured Parquet files. The amount of data loaded in a task is typically 10-100GB, allowing even terabytes to be handled in a foreach.
To reduce IO bottlenecks in such tasks, we provide an optimized client for S3, metaflow.S3, that makes it possible to download data using all available network bandwidth. Notably, on a modern instance the available network bandwidth can be higher than the local disk bandwidth. Consider: SATA 3.0 provides 6Gbit/s whereas a large instance can have 20Gbit/s network throughput. Even Gen3 NVMe provides just 16Gbit/s. To benefit from the full network bandwidth, local disk IO must be bypassed. The metaflow.S3 client accomplishes this by relying on the page cache: nominally, files are downloaded into a temporary directory on disk, but practically all data stays in the page cache. This assumes that the downloaded data fits in memory, which can be ensured by a high enough @resources(memory=) setting.
The above setup, which can provide excellent IO performance in general, has a small gotcha: the instance needs to have enough local disk space to back all the data, even though no data actually hits the disk. Increasingly, instances may have more memory than local disk space available, so this superfluous requirement becomes a problem. This puts users in a strange situation: the instance has enough RAM to hold all the data in memory, and there are ways to download it quickly from S3, but the lack of local disk space (which is not even needed) makes it impossible to access the data.
Kubernetes supports mounting a tmpfs filesystem on the fly. Using this feature, the user can create a memory-backed file system which can be used as a temporary space for downloaded data. This removes the need to deal with local disks altogether. One can simply use a minimal root filesystem, which greatly simplifies the infrastructure setup.
With this release, we introduce a new config option, METAFLOW_TEMPDIR, which, if defined, is used as the default metaflow.S3(tmproot). If METAFLOW_TEMPDIR is not defined, tmproot='.' as before. In addition, a few new attributes are introduced for the @kubernetes decorator:
Attribute (default) | Default behavior | Override semantics |
---|---|---|
use_tmpfs=False | tmpfs disabled | use_tmpfs=True enables tmpfs |
tmpfs_tempdir=True | sets METAFLOW_TEMPDIR=tmpfs_path | tmpfs_tempdir=False doesn't set METAFLOW_TEMPDIR |
tmpfs_size=None | sets tmpfs size to 50% of @resources(memory) | tmpfs size in megabytes |
tmpfs_path=None | use /metaflow_temp as tmpfs_path | custom mount point |
Examples
Handle large amounts of data in-memory with Kubernetes:
@kubernetes(memory=100000, use_tmpfs=True)
In this case, at most 50GB is available for tmpfs and it is used by S3 by default. Note that tmpfs only consumes the amount of memory corresponding to the data stored, so there is no downside in setting a large size by default.
Increase tmpfs size:
@kubernetes(memory=100000, tmpfs_size=100000)
Let tmpfs use all available memory. Note that use_tmpfs=True doesn’t have to be specified redundantly.
Custom tmpfs use case:
@kubernetes(memory=100000, tmpfs_size=10000, tmpfs_path='/data', tmpfs_tempdir=False)
Full control over settings - metaflow.S3 doesn’t use the tmpfs volume in this case.
Besides metaflow.S3, the user may want to use the tmpfs volume for their own use cases. In particular, many modern ML libraries require a local cache. To support these use cases, tmpfs_path is exposed through the current object, as current.tempdir.
This allows the user to leverage the volume straightforwardly:
# e.g. loading a Hugging Face model, using the tmpfs volume as the download cache
from transformers import AutoModelForSeq2SeqLM
from metaflow import current

AutoModelForSeq2SeqLM.from_pretrained(
    model_path,  # path or name of the pretrained model
    cache_dir=current.tempdir,
    device_map='auto',
    load_in_8bit=True,
)
Introduce current.run and current.task in the current singleton
With this release, you can access current.run and current.task within a running flow, allowing for use cases like
from metaflow import current
# add tags from inside a run
current.run.add_tag('foobar')
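current.task can be used similarly; a small sketch, assuming it returns the Metaflow client Task object for the task that is currently executing:
from metaflow import current

# inspect the currently executing task, e.g. its pathspec
print(current.task.pathspec)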
Improvements
Make metaflow client objects backward compatible
The previous release broke backward compatibility in cases where the metaflow client object is deserialized from an older version of Metaflow. This release preserves the functionality and provides explicit compatibility guarantees going forward.
In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.
What's Changed
- Fix: Check all steps for MetaflowCode and return if any by @bsridatta in #1338
- chore: comment on run.code by @saikonen in #1357
- Add kubernetes labels by @dhpollack in #1236
- Revert "Add kubernetes labels" by @savingoyal in #1359
- fix: batch tmpfs enabling logic by @saikonen in #1365
- feature: tmpfs for kubernetes and argo by @saikonen in #1361
- Fix: Validate pathspec argument for MetaflowObject by @bsridatta in #1350
- Fix: METAFLOW_S3_ENDPOINT_URL as a part of airflow by @valayDave in #1368
- Introduce support for event-triggered workflows by @savingoyal in #1271
- feature: remove pylint dependency by @saikonen in #1378
- Fixing a MetaflowObject backward compatibility issue by @pjoshi30 in #1363
- added missing return statement by @felipeGarciaDiaz in #1383
- fix: batch decorator missing metadata handling by @saikonen in #1385
- mute argo event emmission by @savingoyal in #1386
- Update current object adding run and task object. by @romain-intel in #1384
- release 2.8.4 by @savingoyal in #1388
New Contributors
- @dhpollack made their first contribution in #1236
- @felipeGarciaDiaz made their first contribution in #1383
Full Changelog: 2.8.3...2.8.4