Introduce a class to model parts of an operation's memory usage #288

Open
Tracked by #339
tomwhite opened this issue Aug 1, 2023 · 0 comments
tomwhite commented Aug 1, 2023

Consider a simple blockwise operation with one input, where each task carries out the following steps:

  1. read compressed Zarr chunk
  2. decompress Zarr chunk to produce the input array
  3. apply the operation to produce the output array (which may or may not be a different array instance from the one produced in step 2)
  4. write compressed Zarr chunk

We currently model this as using four times the size of the chunk (see explanation here), which is borne out by profiling the memory usage with tools like Fil.
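
A minimal sketch of what such a class could look like, enumerating those four components explicitly (sizes in bytes). `MemoryModel` and its field names are hypothetical, chosen here for illustration rather than taken from existing API:

```python
from dataclasses import dataclass


@dataclass
class MemoryModel:
    """The parts of one task's memory usage for a single-input blockwise operation."""

    compressed_input: int    # 1. read compressed Zarr chunk
    decompressed_input: int  # 2. decompress to produce the input array
    output: int              # 3. apply the operation to produce the output array
    compressed_output: int   # 4. write compressed Zarr chunk

    def total(self) -> int:
        return (
            self.compressed_input
            + self.decompressed_input
            + self.output
            + self.compressed_output
        )


# Assuming (conservatively) that a compressed chunk is no larger than the
# uncompressed chunk, this reproduces the current "four times the chunk size"
# estimate:
chunk_bytes = 100_000_000
model = MemoryModel(chunk_bytes, chunk_bytes, chunk_bytes, chunk_bytes)
assert model.total() == 4 * chunk_bytes
```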

If there are multiple inputs then things get more complicated, depending on how the operation allocates memory and on whether earlier inputs are freed before later ones are read. But there are a fairly small number of categories of operation, so it should be possible to model what each of them does.
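
One hedged guess at how the sketch above might generalize to multiple inputs: track a component per input, and let the operation's category decide whether all inputs are held at once or each is freed before the next is read. Every name here is again hypothetical, and the compressed-chunk components are omitted for brevity:

```python
from dataclasses import dataclass


@dataclass
class MultiInputMemoryModel:
    """Rough multi-input variant; compressed-chunk components omitted for brevity."""

    inputs: list[int]                      # decompressed size of each input chunk
    output: int                            # size of the output array
    inputs_held_concurrently: bool = True  # depends on the category of operation

    def peak(self) -> int:
        if self.inputs_held_concurrently:
            held = sum(self.inputs)  # e.g. an op that needs all inputs at once
        else:
            held = max(self.inputs)  # e.g. an accumulating op that frees each input early
        return held + self.output
```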

The point of modelling this would be to work out the projected memory usage for operations that have been fused. For example, fusing two operations means that the intermediate Zarr file is not written, so those parts of the memory usage can be dropped from the projection.
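
Continuing the `MemoryModel` sketch above, the projection for a fused pair of single-input operations might look something like this (a hypothetical helper, assuming the first operation's output array is handed directly to the second operation):

```python
def fused_total(first: MemoryModel, second: MemoryModel) -> int:
    """Projected memory usage for two fused single-input operations.

    The intermediate Zarr chunk is never compressed, written, read back, or
    decompressed, so first.compressed_output, second.compressed_input and
    second.decompressed_input all drop out of the projection.
    """
    return (
        first.compressed_input
        + first.decompressed_input
        + first.output            # doubles as the second operation's input array
        + second.output
        + second.compressed_output
    )
```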
