Document map_selection core operation (#617)
* Document `map_selection` core operation.

* Rework dependency tree diagram for core and primitive ops to include `map_selection`

* Update five-layer design diagram

* Add rechunk apidoc

* Add note about `general_blockwise`

* Fix double backticks
tomwhite authored Nov 16, 2024
1 parent c75556f commit b3396a9
Showing 8 changed files with 175 additions and 154 deletions.
34 changes: 34 additions & 0 deletions cubed/core/ops.py
@@ -775,6 +775,28 @@ def map_selection(
max_num_input_blocks,
**kwargs,
) -> "Array":
+    """
+    Apply a function to selected subsets of an input array using standard NumPy indexing notation.
+
+    Parameters
+    ----------
+    func : callable
+        Function to apply to every block to produce the output array.
+        Must accept ``block_id`` as a keyword argument (with same meaning as for ``map_blocks``).
+    selection_function : callable
+        A function that maps an output chunk key to one or more selections on the input array.
+    x: Array
+        The input array.
+    shape : tuple
+        Shape of the output array.
+    dtype : np.dtype
+        The ``dtype`` of the output array.
+    chunks : tuple
+        Chunk shape of blocks in the output array.
+    max_num_input_blocks : int
+        The maximum number of input blocks read from the input array.
+    """
+
def key_function(out_key):
# compute the selection on x required to get the relevant chunk for out_key
in_sel = selection_function(out_key)
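
To make the role of `selection_function` and `max_num_input_blocks` concrete, here is a minimal pure-Python sketch. It is not Cubed code: the helper names (`make_selection_function`, `input_blocks_touched`) are hypothetical, and the key layout `(name, block_index)` is an assumption made for illustration. It shows how an output chunk key can be turned into a slice on a 1-D input, and how many input blocks that slice touches when the selection is not aligned to the input chunks.

```python
def make_selection_function(start, stop, out_chunk_size):
    """Build a selection function for the 1-D selection x[start:stop].

    Hypothetical helper, for illustration only.
    """

    def selection_function(out_key):
        # Assumption: out_key is (array_name, block_index) for a 1-D output.
        block_index = out_key[1]
        lo = start + block_index * out_chunk_size
        hi = min(lo + out_chunk_size, stop)
        # One slice per input dimension, in NumPy indexing notation.
        return (slice(lo, hi),)

    return selection_function


def input_blocks_touched(selection, in_chunk_size):
    """List the input block indexes that a 1-D slice overlaps."""
    sl = selection[0]
    first = sl.start // in_chunk_size
    last = (sl.stop - 1) // in_chunk_size
    return list(range(first, last + 1))


sel_fn = make_selection_function(start=3, stop=17, out_chunk_size=4)
selection = sel_fn(("out", 1))  # selection for the second output block
blocks = input_blocks_touched(selection, in_chunk_size=4)
```

In this sketch the second output block needs elements 7 to 10 of the input, which straddle input blocks 1 and 2, so a bound of 2 on the number of input blocks read per output block would be appropriate here.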
@@ -1009,6 +1031,18 @@ def wrap(*a, block_id=None, **kw):


def rechunk(x, chunks, target_store=None):
+    """Change the chunking of an array without changing its shape or data.
+
+    Parameters
+    ----------
+    chunks : tuple
+        The desired chunks of the array after rechunking.
+
+    Returns
+    -------
+    cubed.Array
+        An array with the desired chunks.
+    """
if isinstance(chunks, dict):
chunks = {validate_axis(c, x.ndim): v for c, v in chunks.items()}
for i in range(x.ndim):
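
The contract stated in the `rechunk` docstring (new chunking, same shape and data) can be illustrated with a toy 1-D example. This is only a sketch on plain Python lists with hypothetical helper names; it materializes everything in memory and says nothing about how Cubed actually performs the operation on stored arrays.

```python
def split_into_chunks(data, chunk_size):
    """Split a flat list into consecutive chunks of at most chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def rechunk_1d(chunked, new_chunk_size):
    """Regroup a chunked list into chunks of a different size (toy version)."""
    flat = [item for chunk in chunked for item in chunk]
    return split_into_chunks(flat, new_chunk_size)


def flatten(chunked):
    """Concatenate chunks back into a flat list."""
    return [item for chunk in chunked for item in chunk]


data = list(range(10))
old_chunks = split_into_chunks(data, 4)  # chunk sizes 4, 4, 2
new_chunks = rechunk_1d(old_chunks, 5)   # chunk sizes 5, 5
```

Flattening `old_chunks` and `new_chunks` yields the same ten values in the same order; only the chunk boundaries differ.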
7 changes: 3 additions & 4 deletions docs/design.md
@@ -4,7 +4,7 @@ Cubed is composed of five layers: from the storage layer at the bottom, to the A

![Five layer diagram](images/design.svg)

-Blue blocks are implemented in Cubed, green in Rechunker, and red in other projects like Zarr and Beam.
+Blue blocks are implemented in Cubed; red blocks in other projects like Zarr and Lithops.

Let's go through the layers from the bottom:

@@ -14,7 +14,7 @@ Every _array_ in Cubed is backed by a Zarr array. This means that the array type

## Runtime

-Cubed uses external runtimes for computation. It follows the Rechunker model (and uses its algorithm) to delegate tasks to stateless executors, which include Python (in-process), Lithops, Modal, and Apache Beam.
+Cubed uses external runtimes for computation, delegating tasks to stateless executors, which include Python (in-process), Lithops, Modal, and Apache Beam.
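
As a rough illustration of what delegating tasks to stateless executors can mean, the sketch below runs the same per-block task with two interchangeable runners from the Python standard library. The function names are hypothetical and this is not Cubed's executor interface; the point is only that a task whose result depends solely on its input block can be handed to any runner.

```python
from concurrent.futures import ThreadPoolExecutor


def square_block(block):
    """A stateless task: the result depends only on the input block."""
    return [value * value for value in block]


def run_in_process(task, blocks):
    """Run every task sequentially in the current process."""
    return [task(block) for block in blocks]


def run_with_thread_pool(task, blocks, max_workers=2):
    """Run the same tasks on a thread pool; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, blocks))


blocks = [[1, 2], [3, 4], [5]]
serial = run_in_process(square_block, blocks)
pooled = run_with_thread_pool(square_block, blocks)
```

Because the task keeps no state between calls, both runners produce the same result, which is what makes swapping one executor for another possible.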


## Primitive operations
@@ -45,8 +45,7 @@ These are built on top of the primitive operations, and provide functions that a
elemwise
map_blocks
-map_direct
-index
+map_selection
reduction
arg_reduction
```
2 changes: 1 addition & 1 deletion docs/images/design.svg
1 change: 1 addition & 0 deletions docs/images/map_selection.svg
17 changes: 7 additions & 10 deletions docs/images/ops.dot
@@ -11,21 +11,20 @@ digraph {
// core
elemwise [style="filled"; fillcolor="#ffd8b1";];
map_blocks [style="filled"; fillcolor="#ffd8b1";];
-map_direct [style="filled"; fillcolor="#ffd8b1";];
+map_selection [style="filled"; fillcolor="#ffd8b1";];
reduction [style="filled"; fillcolor="#ffd8b1";];
arg_reduction [style="filled"; fillcolor="#ffd8b1";];

elemwise -> blockwise;
map_blocks -> blockwise;
-map_direct -> map_blocks;
+map_selection -> blockwise;
reduction -> blockwise;
reduction -> rechunk;
arg_reduction -> reduction;

// array API

// array object
-__getitem__ -> map_direct
+__getitem__ -> map_selection

// elementwise
add -> elemwise
@@ -34,12 +33,11 @@ digraph {
// linear algebra
matmul -> blockwise;
matmul -> reduction;
-outer -> blockwise;

// manipulation
-concat -> map_direct;
+concat -> blockwise;
reshape -> rechunk;
-reshape -> map_direct;
+reshape -> blockwise;
squeeze -> map_blocks;

// searching
@@ -51,18 +49,17 @@ digraph {
// utility
all -> reduction;


{
rank = min;

// fix horizontal placing with invisible edges
edge[style=invis];
-add -> negative -> outer -> matmul -> __getitem__ -> concat -> reshape -> squeeze -> argmax -> sum -> all;
+add -> negative -> squeeze -> __getitem__ -> concat -> matmul -> sum -> all -> argmax -> reshape;
rankdir = LR;
}
{
rank = same;
-elemwise; map_blocks; reduction;
+elemwise; map_blocks; map_selection; reduction;
}
{
rank = max;
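
The reworked dependency tree can also be read programmatically. The sketch below copies a subset of the edges that appear on the new side of this `ops.dot` diff into a plain dictionary and walks it to find which leaf operations an array API function ultimately relies on. The dictionary is deliberately partial, and the `primitive_ops` helper is hypothetical, written only for illustration.

```python
# Subset of edges taken from the new side of the ops.dot diff (not the full graph).
EDGES = {
    "__getitem__": ["map_selection"],
    "map_selection": ["blockwise"],
    "concat": ["blockwise"],
    "reshape": ["rechunk", "blockwise"],
    "squeeze": ["map_blocks"],
    "map_blocks": ["blockwise"],
    "add": ["elemwise"],
    "elemwise": ["blockwise"],
}


def primitive_ops(op, edges=EDGES):
    """Return the set of leaf operations reachable from op."""
    deps = edges.get(op)
    if not deps:
        # No outgoing edges: op is itself a leaf of the graph.
        return {op}
    leaves = set()
    for dep in deps:
        leaves |= primitive_ops(dep, edges)
    return leaves


getitem_prims = primitive_ops("__getitem__")
reshape_prims = primitive_ops("reshape")
```

With these edges, `__getitem__` resolves to `blockwise` alone (via `map_selection`), while `reshape` resolves to both `rechunk` and `blockwise`.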