Introduce a class to model parts of an operation's memory usage #288

Open
Tracked by #339
tomwhite opened this issue Aug 1, 2023 · 0 comments
tomwhite commented Aug 1, 2023

Consider a simple blockwise operation with one input, where each task carries out the following steps:

  1. read compressed Zarr chunk
  2. decompress Zarr chunk to produce the input array
  3. apply the operation to produce the output array (which may or may not be a different array instance from the one produced in step 2)
  4. write compressed Zarr chunk

We currently model this as using four times the size of the chunk (see explanation here), which is borne out by profiling the memory usage with tools like Fil.
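
A minimal sketch of what such a class could look like, enumerating those four components explicitly (sizes in bytes). `MemoryModel` and its field names are hypothetical, chosen here for illustration rather than taken from existing API:

```python
from dataclasses import dataclass


@dataclass
class MemoryModel:
    """The parts of one task's memory usage for a single-input blockwise operation."""

    compressed_input: int    # 1. read compressed Zarr chunk
    decompressed_input: int  # 2. decompress to produce the input array
    output: int              # 3. apply the operation to produce the output array
    compressed_output: int   # 4. write compressed Zarr chunk

    def total(self) -> int:
        return (
            self.compressed_input
            + self.decompressed_input
            + self.output
            + self.compressed_output
        )


# Assuming (conservatively) that a compressed chunk is no larger than the
# uncompressed chunk, this reproduces the current "four times the chunk size"
# estimate:
chunk_bytes = 100_000_000
model = MemoryModel(chunk_bytes, chunk_bytes, chunk_bytes, chunk_bytes)
assert model.total() == 4 * chunk_bytes
```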

If there are multiple inputs then things get more complicated, depending on how the operation allocates memory and on whether earlier inputs are freed before later ones are read. But there are a fairly small number of categories of operation, so it should be possible to model what each of them does.
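
One hedged guess at how the sketch above might generalize to multiple inputs: track a component per input, and let the operation's category decide whether all inputs are held at once or each is freed before the next is read. Every name here is again hypothetical, and the compressed-chunk components are omitted for brevity:

```python
from dataclasses import dataclass


@dataclass
class MultiInputMemoryModel:
    """Rough multi-input variant; compressed-chunk components omitted for brevity."""

    inputs: list[int]                      # decompressed size of each input chunk
    output: int                            # size of the output array
    inputs_held_concurrently: bool = True  # depends on the category of operation

    def peak(self) -> int:
        if self.inputs_held_concurrently:
            held = sum(self.inputs)  # e.g. an op that needs all inputs at once
        else:
            held = max(self.inputs)  # e.g. an accumulating op that frees each input early
        return held + self.output
```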

The point of modelling this would be to work out the projected memory usage for operations that have been fused. For example, fusing two operations means that the intermediate Zarr file is not written, so those parts of the memory usage can be dropped from the projection.
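
Continuing the `MemoryModel` sketch above, the projection for a fused pair of single-input operations might look something like this (a hypothetical helper, assuming the first operation's output array is handed directly to the second operation):

```python
def fused_total(first: MemoryModel, second: MemoryModel) -> int:
    """Projected memory usage for two fused single-input operations.

    The intermediate Zarr chunk is never compressed, written, read back, or
    decompressed, so first.compressed_output, second.compressed_input and
    second.decompressed_input all drop out of the projection.
    """
    return (
        first.compressed_input
        + first.decompressed_input
        + first.output            # doubles as the second operation's input array
        + second.output
        + second.compressed_output
    )
```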
