Expand @schedule to trigger based on external events, such as changes to AWS S3 bucket #468
Comments
In addition to expanding @rapuckett Can you also elaborate a bit on your specific use case so that we can make sure it's covered in any enhancements we make to
My specific use case involves triggering an SFN-based run when new data is deposited into some location in an S3 bucket (e.g., s3://my-data-bucket/ocr.csv). To accomplish this, I've created an S3 event notification on the
The idea is that the Lambda and S3 event trigger (or EventBridge rule) would be created automatically when an SFN flow gets created, so the ML engineer no longer has to create these resources by hand each time. I'm not sure how the details might change for, say, GCP or Azure, but an example
Where URI would be 's3://my-data-bucket/ocr.csv' in my case, and Hope this makes sense.
Makes sense. I will follow up with a design proposal.
Related ticket: #280
This is awesome! It'd be great if the trigger is "meta" and can work for other orchestrators such as Argo and KFP. Related: #245
Hello! Is this feature being worked on? It sounds like a very good feature.
#1271 introduces the basics to support this feature.
Right now, it's possible to trigger a Metaflow SFN run by manually creating a Lambda triggered by an EventBridge rule (in my current use case, this happens when new data is uploaded to an S3 bucket). This process is manual and can involve a lot of boilerplate code, so a FlowSpec-level @schedule decorator would be great for setting this up for each flow that gets deployed to production.
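For reference, the manual Lambda described above might look roughly like the sketch below. It is not taken from Metaflow; it assumes the EventBridge rule forwards S3 "Object Created" events, that the state machine ARN is supplied through a hypothetical `STATE_MACHINE_ARN` environment variable, and that the execution input is just the bucket and key. The input shape a deployed Metaflow state machine actually expects is not covered here and would need to be adapted.

```python
import json
import os


def build_execution_input(event):
    """Extract the bucket and key from an EventBridge S3 'Object Created' event."""
    detail = event["detail"]
    return {
        "bucket": detail["bucket"]["name"],
        "key": detail["object"]["key"],
    }


def handler(event, context, sfn_client=None):
    """Lambda entry point: start one state machine execution per incoming event.

    `sfn_client` is an optional hook (an assumption of this sketch) so the
    handler can be exercised locally with a stand-in client.
    """
    if sfn_client is None:
        import boto3  # imported lazily so the module loads without boto3 installed

        sfn_client = boto3.client("stepfunctions")

    response = sfn_client.start_execution(
        # STATE_MACHINE_ARN is a hypothetical variable name chosen for this sketch.
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],
        input=json.dumps(build_execution_input(event)),
    )
    return {"executionArn": response["executionArn"]}
```

Every flow deployed this way needs its own copy of the handler, the rule, and the IAM wiring, which is the boilerplate a decorator could generate.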
The decorator would ideally be abstract enough to work with a variety of resource events (e.g., S3, RDS) and not be bound to AWS specifically, likely by leveraging plugins.
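To make the plugin idea a bit more concrete, here is a minimal sketch of one way it could be structured: trigger plugins register themselves by URI scheme, and the decorator would look up the right one from the resource URI. All names here (`register_trigger`, `S3ObjectTrigger`, `resolve_trigger`) are hypothetical and only illustrate the shape; none of this is an existing Metaflow API.

```python
from urllib.parse import urlparse

# Hypothetical registry mapping a URI scheme (e.g. "s3") to a trigger plugin class.
_TRIGGER_PLUGINS = {}


def register_trigger(scheme):
    """Class decorator that registers a trigger plugin for a URI scheme."""
    def wrap(cls):
        _TRIGGER_PLUGINS[scheme] = cls
        return cls
    return wrap


@register_trigger("s3")
class S3ObjectTrigger:
    """Illustrative plugin for S3 object events."""

    def __init__(self, uri):
        parsed = urlparse(uri)
        self.bucket = parsed.netloc
        self.key = parsed.path.lstrip("/")

    def describe(self):
        # A real plugin would provision the event rule here;
        # this sketch only reports what it would watch.
        return {"provider": "aws", "bucket": self.bucket, "key": self.key}


def resolve_trigger(uri):
    """Pick the plugin registered for the URI's scheme and instantiate it."""
    scheme = urlparse(uri).scheme
    if scheme not in _TRIGGER_PLUGINS:
        raise ValueError(f"no trigger plugin registered for scheme {scheme!r}")
    return _TRIGGER_PLUGINS[scheme](uri)
```

With this shape, supporting another cloud or resource type would mean registering another class rather than changing the decorator itself.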