Concurrent task iteration support #422

Open

joshbeckman opened this issue May 26, 2021 · 16 comments

joshbeckman commented May 26, 2021

Over at https://github.com/shopify/flow we have been trying to adopt the maintenance task framework and have enjoyed the benefits for our small data migrations, but our main hangup is the long runtime of tasks that need to operate on large datasets (e.g. all records in one table: tens of thousands now, and many more in the future). When we recently tried running a data migration as a maintenance task, the total time to execute would have been months.

As such, our main desire with this library would be declarative concurrency support. Is #325 (comment) still the recommendation for concurrency in the future of this library?

No immediate need for action on this - we just wanted to provide feedback on our adoption!

Member

etiennebarrie commented May 27, 2021

Do you think batch enumerators could help? (see #409)
Depending on your tasks, being able to update 100 or 1,000 records at a time could substantially speed them up.
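
To illustrate why batching can help, here is a minimal plain-Ruby sketch rather than the gem's actual API. `FakeTable`, `mark_migrated`, `migrate_one_by_one` and `migrate_in_batches` are hypothetical names invented for this example, and the in-memory hash only stands in for a database where each call costs one round trip.

```ruby
# Hypothetical in-memory stand-in for a database table.
# It counts "round trips" so the two strategies can be compared.
class FakeTable
  attr_reader :round_trips

  def initialize(size)
    @rows = (1..size).to_h { |id| [id, { migrated: false }] }
    @round_trips = 0
  end

  def ids
    @rows.keys
  end

  # One round trip per call, no matter how many ids are passed in.
  def mark_migrated(ids)
    @round_trips += 1
    ids.each { |id| @rows.fetch(id)[:migrated] = true }
  end

  def all_migrated?
    @rows.values.all? { |row| row[:migrated] }
  end
end

# One write per record.
def migrate_one_by_one(table)
  table.ids.each { |id| table.mark_migrated([id]) }
end

# One write per slice of up to `batch_size` records.
def migrate_in_batches(table, batch_size: 1000)
  table.ids.each_slice(batch_size) { |batch| table.mark_migrated(batch) }
end
```

With 2,500 rows, the one-by-one version makes 2,500 round trips while the batched version with a batch size of 1,000 makes 3. The gain only applies when the work can be expressed as a single bulk operation per batch.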

Regarding actual parallelism when running tasks, it's something we're thinking about, but we haven't made any formal plans, so we can't make any promises. We can keep this issue open to continue thinking about it, start fleshing out an API and behaviour, and figure out the edge cases (e.g. it will require special handling for custom enumerators, which may not be able to start from an arbitrary cursor position and can only give out one item at a time).

etiennebarrie mentioned this issue

Support Tasks with custom parameters #413

Merged

1 task

Author

joshbeckman commented May 27, 2021

Batches could help with some of our task types, yes!

But we have other types of tasks that require, for example, calling an external API with an individual record and then saving the returned value to our database. Batching would remove some of the overhead of the job queue itself, but wouldn't give us the speedup we would get from concurrency.
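
As a rough sketch of the kind of concurrency described here, the plain-Ruby example below fans a list of record ids out to a small pool of threads. It is not something the library provides: `fetch_remote_value` and `process_concurrently` are hypothetical names, and the `sleep` only simulates the latency of an external API call.

```ruby
# Hypothetical stand-in for an external API call.
# The sleep simulates network latency; the return value is arbitrary.
def fetch_remote_value(record_id, latency: 0.01)
  sleep(latency)
  record_id * 2
end

# Processes ids with up to `workers` threads pulling from a shared queue.
# Returns a hash of id => fetched value.
def process_concurrently(ids, workers: 8)
  queue = Queue.new
  ids.each { |id| queue << id }
  queue.close # pop returns nil once the closed queue is drained

  results = {}
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      while (id = queue.pop)
        value = fetch_remote_value(id)
        # Guard the shared hash so writes from different threads don't interleave.
        lock.synchronize { results[id] = value }
      end
    end
  end

  threads.each(&:join)
  results
end
```

Calling `process_concurrently((1..40).to_a, workers: 8)` returns one value per id. Because the per-record work is mostly waiting on I/O, threads can overlap those waits. A real task would still need to consider API rate limits, database connection pool size, and how a concurrent run would be paused and resumed, none of which this sketch addresses.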

adrianna-chang-shopify added the enhancement label

sle-c commented Mar 29, 2023

I recently ran a migration on flow which mainly involves making GraphQL requests to core for certain things. Processing 874k rows would take about 7 days to complete. I think allowing parallelism would really help in these cases.

github-actions bot commented Jan 27, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

Author

joshbeckman commented Jan 29, 2024

We would still like this!

github-actions bot removed the stale label

github-actions bot commented Mar 31, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

Author

joshbeckman commented Apr 1, 2024

We would still really like this

github-actions bot removed the stale label

segiddins commented Apr 16, 2024

This would be incredibly useful.

github-actions bot commented Jun 16, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

segiddins commented Jun 16, 2024

Still valid

github-actions bot removed the stale label

github-actions bot commented Aug 16, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

segiddins commented Aug 16, 2024

still relevant

github-actions bot removed the stale label

github-actions bot commented Oct 16, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

Author

joshbeckman commented Oct 16, 2024

Still, I would love this.

github-actions bot removed the stale label

github-actions bot commented Dec 16, 2024

This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

github-actions bot added the stale label

Author

joshbeckman commented Dec 16, 2024

Still want this.

github-actions bot removed the stale label
