Support for AWS Step functions #2
Comments
I just arrived at Metaflow and I'm thrilled to give it a try at my company.
Hi @gonzalodiaz
Is your team familiar with https://github.com/argoproj/argo? In theory you could compile your Flows down into Argo's workflow spec format (JSON/YAML) and then Argo could take care of execution.
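To illustrate what "compiling a Flow down" might look like, here is a minimal sketch that turns a step dependency map into an Argo `Workflow` manifest with a `dag` template. The function name `to_argo_workflow`, the flow name, the step names, the container image, and the placeholder `echo` command are all hypothetical; this is not Metaflow's or Argo's own tooling, only a plain dictionary builder that follows the shape of the Argo workflow spec.

```python
import json


def to_argo_workflow(name, steps, image):
    """Build an Argo Workflow manifest (as a dict) from a step DAG.

    `steps` maps each step name to the list of steps it depends on.
    All names here are illustrative assumptions, not a real Metaflow API.
    """
    tasks = []
    for step, deps in steps.items():
        task = {
            "name": step,
            "template": "run-step",
            "arguments": {"parameters": [{"name": "step", "value": step}]},
        }
        if deps:
            # Argo DAG tasks list their upstream tasks under "dependencies".
            task["dependencies"] = sorted(deps)
        tasks.append(task)

    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": name + "-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "dag": {"tasks": tasks}},
                {
                    "name": "run-step",
                    "inputs": {"parameters": [{"name": "step"}]},
                    # Placeholder command: a real integration would run the
                    # flow's step code here instead of echoing its name.
                    "container": {
                        "image": image,
                        "command": ["echo", "{{inputs.parameters.step}}"],
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    workflow = to_argo_workflow(
        "helloflow",
        {"start": [], "train": ["start"], "end": ["train"]},
        "python:3.8",
    )
    print(json.dumps(workflow, indent=2))
```

The JSON printed by the example could be submitted to Argo as a manifest; how each step would actually invoke the flow's code is left open here.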
Thanks for the link. Yes I am familiar with argo but haven’t looked at it in depth.
Given that Metaflow is evidently seriously lacking a scheduler, either Step Functions or better yet an open source component of Metaflow itself can probably fill in the gap. Without a scheduler, indeed it seems to be an incomplete solution.
For step function integration, is it possible to incorporate https://github.com/aws/aws-step-functions-data-science-sdk-python?
As an observer, I don't see any need for AWS Step Functions integration since Metaflow should be able to manage workflow steps directly. Why pay extra for Step Functions?
AWS Step Functions need to be scheduled through CloudWatch. They do not have a built-in scheduler. However, CloudWatch has a direct integration with Step Functions. It might be better to look into how CloudWatch + Lambda may be leveraged to act as a scheduler for Metaflow, separate from Step Functions.
Metaflow is an orchestrator itself, so I think the only missing piece is to figure out the scheduling aspect. Using Step Functions as an orchestrator just because we need it to schedule Metaflow workflows is overkill, IMO.
Metaflow could in principle then manage those CloudWatch Events and Lambdas too, using a single combined job+schedule definition. This would be the simplest scheduler integration, assuming one cannot be built in or integrated into Metaflow directly. I would still prefer integrating an open source scheduler into Metaflow, though, to avoid the reliance on CloudWatch Events and Lambdas.
Also maybe check out Glue Workflows which are a little more DAG-like compared to the Step Functions model https://docs.aws.amazon.com/glue/latest/dg/workflows_overview.html
I'm confused about this statement:
Is this saying that there isn't yet a way to schedule a flow to run in production, or that there's no DAG scheduler/executor to actually run a flow in a production setting? Thank you.
@kylejmcintyre Internally we export Metaflow flows to a DAG scheduler. A similar integration with AWS Step Functions is in the works.
Thanks for your reply @savingoyal. Is what you do internally available to me as an open-source consumer? If so, is it considered a hidden/internal implementation detail currently that runs on my provisioned compute resources? Or is executing flows in a production setting not yet supported for folks outside of Netflix?
@kylejmcintyre Given that the DAG scheduler (Meson) we use internally is not an open-source project, we are working on an equivalent integration with AWS Step Functions to offer similar capabilities in metaflow OSS as we speak.
Why is AWS Step Functions even needed then? It's just going to increase the bill by doing something that open source software can do for free. The real hardware which is needed is provided by AWS Batch/EC2/ECS and similar services.
@impredicative There are not very many production-grade DAG schedulers (no SPOF, HA, scalable) with good adoption in the open-source community. AWS Step Functions offers the guarantees that we seek from a production-grade scheduler and our integration can serve as a reference implementation for integrations with other schedulers.
I am interested in using this project but the obvious blocker is the scheduler. @savingoyal: do you have a rough timeline when it could be available?
As has been noted in this issue before, Step Functions use CloudWatch Events for scheduling. Has this changed? If not, why is Step Functions being referred to as a scheduler? Does Metaflow then really still need Step Functions integration, or is it CloudWatch Events integration that it needs?
@impredicative AWS Step Functions is a scheduler as it schedules tasks on AWS Batch. The state machine by itself can be triggered by using CloudWatch Events. Metaflow is not meant to be a replacement for a production-grade scheduler and through our integrations, we advocate that users publish their production workflows onto a production scheduler.
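The split described above (a state machine that submits tasks to AWS Batch, plus a separate time-based trigger) can be sketched as follows. This is a rough illustration only, not the integration the maintainers are building: the function name `to_state_machine`, the step names, the job queue, and the job definition are hypothetical, and the flow is assumed to be a simple linear chain. The `arn:aws:states:::batch:submitJob.sync` resource is the Step Functions service integration that waits for a Batch job to finish, and `rate(1 day)` is a CloudWatch Events schedule expression.

```python
import json


def to_state_machine(steps, job_queue, job_definition):
    """Build an Amazon States Language definition for a linear chain of steps.

    Each step becomes a Task state that submits an AWS Batch job and waits
    for it to complete. Queue and job definition names are assumptions.
    """
    states = {}
    for index, step in enumerate(steps):
        state = {
            "Type": "Task",
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": step,
                "JobQueue": job_queue,
                "JobDefinition": job_definition,
            },
        }
        if index + 1 < len(steps):
            state["Next"] = steps[index + 1]  # hand off to the following step
        else:
            state["End"] = True  # last step terminates the execution
        states[step] = state
    return {"StartAt": steps[0], "States": states}


# The trigger lives outside the state machine: a CloudWatch Events rule with
# a schedule expression such as this one would start an execution daily.
SCHEDULE_EXPRESSION = "rate(1 day)"

if __name__ == "__main__":
    definition = to_state_machine(
        ["start", "train", "end"], "my-job-queue", "my-job-definition"
    )
    print(json.dumps(definition, indent=2))
```

Flows with branches or foreach-style fan-out would need `Parallel` or `Map` states, which this sketch leaves out.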
add EFA support for step-functions
Metaflow on AWS currently requires a human in the loop to execute a flow and cannot be scheduled automatically. Metaflow could be made to work with AWS Step Functions so that the orchestration of Metaflow steps is handled by AWS.