
Smart Resource Management #6

Open
averagehat opened this issue Aug 14, 2019 · 2 comments

Comments

@averagehat (Contributor)

It would be useful to define the expected resource usage of a given stage, i.e. memory, cores, etc., so that a pipeline does not attempt to use more than what is available (this is particularly helpful when tools have high memory usage).

There is some support for this in Shake already:
https://hackage.haskell.org/package/shake-0.18.3/docs/Development-Shake.html#t:Resource

And here is another example:
https://github.com/fulcrumgenomics/dagr#resource-management
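Shake's Resource (linked above) behaves like a counting semaphore over abstract units, which maps directly onto the memory case. A minimal sketch of how it might look, assuming a hypothetical 64-unit "memory" pool and a made-up `big-aligner` command (the rule pattern and unit sizes are illustrative, not from this project; `newResource`/`withResource` are Shake's actual API):

```haskell
import Development.Shake

main :: IO ()
main = shakeArgs shakeOptions{shakeThreads = 8} $ do
    -- A pool of 64 abstract units (read: GB of RAM on this machine).
    memory <- newResource "memory" 64
    "out/*.sorted.bam" %> \out -> do
        -- The memory-hungry step claims 60 of the 64 units, so at most
        -- one instance runs at a time; cheaper rules elsewhere still
        -- proceed in parallel under shakeThreads.
        withResource memory 60 $
            cmd_ "big-aligner" "-o" out
```

The pool size and per-step claim are just integers to Shake, so the same mechanism could cover cores or disk by declaring a separate resource per constraint.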

@jbedo (Member)

jbedo commented Aug 15, 2019 via email

@averagehat (Contributor, Author)

Mainly RAM. For example, we have a step of our pipeline that takes 60GB+ RAM, and then RAM usage contracts greatly. We want to run many instances of the pipeline in parallel on a single machine, which is only possible if the memory usage is staggered.

It would also be useful to consider time (skip optional steps if there is not enough time remaining) and disk space (in our case, if the pipeline expects to run out of space, we can write to a slower networked drive instead). I'd suggest these latter two be optional, user-provided data for now, perhaps with some generic functionality for handling them.
