RFC: add materialize to materialize lazy arrays #839

Open · lucascolley opened this issue on Aug 31, 2024 · 27 comments
Labels
  • API extension - Adds new functions or objects to the API.
  • Needs Discussion - Needs further discussion.
  • RFC - Request for comments. Feature requests and proposed changes.
  • topic: Lazy/Graph - Lazy and graph-based array implementations.

Comments

@lucascolley (Contributor) commented Aug 31, 2024

Preface

I do not think that I am the best person to champion this effort, as I am far from the most informed person here on lazy arrays. I'm probably missing important things, but I would like to start this discussion as I think it is an important topic.

The problem

The problem of mixing computation requiring data-dependent properties with lazy execution is discussed in detail elsewhere:

A possible solution

Add the function materialize(x: Array) to the top level of the API. Behaviour (a rough sketch follows the list below):

  • for eagerly-executed arrays, this would be a no-op
  • for lazy arrays, this would force computation such that the data is available in the returned array (which is of the same array type?)
  • for "100% lazy" arrays (Handling materialization of lazy arrays #748 (comment)), this would raise an exception

Prior art

Concerns

  • I think the main concern is whether eager-only libraries will agree to adding a no-op into the API. There is precedent for that type of change (e.g. device kwargs in NumPy), but perhaps this is too obtrusive?
  • As far as I can tell there isn't a standard way to do this across lazy libraries. Does JAX just do this automatically when it would be needed? Do other libraries have this capability?

Alternatives

  1. Do nothing. The easy option, but it leaves us unable to support lazy arrays when data-dependent properties are used in computation (maybe that is okay?)
  2. An alternative API. Maybe spelled like compute* or a method on the array object. Maybe with options for partial materialization (if that's a thing)?

cc @TomNicholas @hameerabbasi @rgommers

@kgryte added the RFC, API extension, Needs Discussion, and topic: Lazy/Graph labels on Sep 3, 2024
@asmeurer (Member) commented Sep 3, 2024

A question is whether it's appropriate for an array API consuming library to materialize a lazy graph "behind the user's back", as it were. Such an operation could be quite expensive in general, and a user might be surprised to find that a seemingly innocuous function from something like scipy is doing this.

On the other hand, if an algorithm fundamentally depends on array values in a loop, there's no way it can be implemented without something like this. So maybe the answer is that we should provide guidance that functions which use materialize clearly document that they do, or even disable it unless the user passes some sort of explicit flag.

@asmeurer (Member) commented Sep 3, 2024

If I understand https://data-apis.org/array-api/draft/design_topics/lazy_eager.html correctly, the primary APIs that require materialization are __bool__, __int__, __float__, etc.

So another option here would be to add a compute flag to a potential item API #815.

@hameerabbasi (Contributor) commented:

A couple of thoughts:

  1. Regarding partial materialization, I don't think a separate API is necessary for that. Given there are already methods to select the elements one wants from an array (indexing and so on), one can call materialize on the result of those (see the snippet after this list). Whether or not that backfills the data of the indexed array is entirely the library's choice.
  2. Regarding the "do nothing" option, it's kind of tricky -- sometimes an explicit barrier to compute the data is needed to "help" optimization of the compute graph.
  3. Adding compute to the item API is something I'm -0.7 on if it's the only API. Ideally, you would compute only the part of the array that's needed, and doing it element by element in item would be excruciatingly slow in many cases.
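
Illustrating point 1 with the hypothetical materialize from the proposal (whether the library also backfills data into x is left to the implementation):

# Partial materialization: materialize a selection rather than the whole array.
first_rows = materialize(x[:10])
col_means = materialize(x.mean(axis=0))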

@TomNicholas commented:

Thank you for starting this discussion @lucascolley, and thanks for tagging me!

function materialize(x: Array)

Note that the signature here should probably be more like materialize(*arrs: Array, **kwargs), as both dask.compute and cubed.compute can compute multiple lazy arrays at once. You often want to do that in order to handle common intermediate arrays efficiently. The **kwargs is also important because different parallel execution frameworks will require specifying different configuration options at runtime.
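
For reference, the existing multi-array form in Dask looks roughly like this (Cubed mirrors the same API):

import dask
import dask.array as da

x = da.random.random((4_000, 4_000), chunks=(1_000, 1_000))
# Both results reuse the shared graph of x, so common intermediates are computed once;
# execution options go in as keyword arguments, e.g. scheduler="threads".
mean, std = dask.compute(x.mean(), x.std(), scheduler="threads")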

Prior art

Adding some more here:

  • Cubed:
    • Cubed (cc @tomwhite) deliberately copies the lazy API of dask.array. It too has a compute function for multiple arrays (as well as a .compute method on single arrays). An intermediate Plan object is created which is analogous to dask's task graph, but consists entirely of lazy arrays (as opposed to arbitrary functions that can go in a dask task graph). Normally computed results are either plotted or serialized to Zarr on-disk.
  • Xarray's "Lazy Indexing Classes"
    • Internally Xarray has an implementation of lazy indexing, which it uses to make sure that only data that is necessary for the result is loaded from on-disk files. Note this is completely separate from (and predates) dask's integration with xarray. It's not a fully-fledged lazy array API implementation, but there have been discussions about how it could become one (Lazy indexing arrays as a stand-alone package pydata/xarray#5081).
  • JAX:
    • I'm not super familiar with JAX, but if I understand correctly JAX is effectively lazy but distributed over "local XLA devices", which act a little bit like dask chunks. jax.pmap and jax.vmap can be used to distribute computation over the devices, and there are methods which trigger array computation, e.g. .block_until_ready().

If I understand https://data-apis.org/array-api/draft/design_topics/lazy_eager.html correctly, the primary APIs that require materialization are __bool__, __int__, __float__, etc.

I don't think this is the primary API at all; it's just an interesting special case where the return type is out of our hands. The primary API is, as @lucascolley says, a materialize/compute function. For example, in Xarray we basically compute if:

  • the user asks to see numeric .values,
  • the user tries to .plot their array,
  • the user wants to save their array to disk (e.g. as netCDF/Zarr).

  1. Adding compute to the item API is something I'm -0.7 on if it's the only API. Ideally, you would compute only the part of the array that's needed, and doing it element by element in item would be excruciatingly slow in many cases.

I agree - in Xarray we very rarely use .item; instead we are normally computing whole arrays (which have often been lazily indexed beforehand, so we are just computing a subarray of the original array).

I also agree with the other 2 points @hameerabbasi just made.

Xarray has a new abstraction over dask, cubed (and maybe soon JAX) called a "ChunkManager" (though I think we will rename it to a ComputeManager to better reflect its responsibilities). This can be understood as a way to create lazy arrays (.from_array), a way to distribute computation over them (.apply_gufunc), and a way to trigger computation of them (.compute).
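
Roughly, and this is only a sketch of the shape of that abstraction rather than Xarray's actual class, it could be described as a protocol like:

from typing import Any, Protocol

class ComputeManager(Protocol):
    # Sketch of a ChunkManager/ComputeManager-style abstraction; the method names
    # follow the description above, everything else is an assumption.
    def from_array(self, data: Any, chunks: Any) -> Any:
        """Wrap an in-memory array as a lazy/chunked array."""
    def apply_gufunc(self, func: Any, signature: str, *args: Any, **kwargs: Any) -> Any:
        """Distribute a generalized ufunc over lazy arrays."""
    def compute(self, *arrays: Any, **kwargs: Any) -> tuple[Any, ...]:
        """Trigger computation and return in-memory results."""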

@lucascolley (Contributor, Author) commented:

Thanks all for the comments! Just tagging @jakevdp also who can maybe shed some light on JAX.

@jakevdp commented Sep 5, 2024

I don't think materialize would be useful for JAX. In that case, it would either be a no-op (during eager execution) or an unconditional error (during traced execution). When JAX is tracing a function, there is no concrete array to materialize!

@hameerabbasi (Contributor) commented Sep 5, 2024

Note that the signature here should probably be more like materialize(*arrs: Array, **kwargs), as both dask.compute and cubed.compute can compute multiple lazy arrays at once. You often want to do that in order to handle common intermediate arrays efficiently.

I'm +1 on an API that allows simultaneous materialisation of multiple arrays, although I'd spell it slightly differently.

  1. To preserve type safety and avoid special cases that require wrapping/unwrapping of tuples, I'd make the type of the first argument Array | Iterable[Array].
  2. The supplied kwargs are likely to be implementation-specific. Since the standard is a least common denominator, I feel this won't make it in. This, of course, doesn't prevent implementations from adding their own versions.

With this in mind, the signature I'd propose is materialize(x: Array | Iterable[Array]) -> Array | tuple[Array, ...]
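
In stub form (Array here is just a placeholder for the standard's array type), that signature could be expressed with overloads so callers keep precise return types:

from collections.abc import Iterable
from typing import overload

class Array: ...  # placeholder for the standard's array type

@overload
def materialize(x: Array) -> Array: ...
@overload
def materialize(x: Iterable[Array]) -> tuple[Array, ...]: ...
def materialize(x):
    # Hypothetical: eager arrays pass through, hybrid-lazy ones are computed,
    # fully lazy/traced ones raise.
    raise NotImplementedError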

@jakevdp commented Sep 5, 2024

One thing I'm unclear on: what is the difference between materialized and non-materialized arrays in terms of the array API? What array API operations can you do on one, but not on the other?

@hameerabbasi (Contributor) commented:

One thing I'm unclear on: what is the difference between materialized and non-materialized arrays in terms of the array API? What array API operations can you do on one, but not on the other?

I feel this is more driven by use-cases and performance. Some of these are outlined in #748 (comment) and #728.

@lucascolley (Contributor, Author) commented:

What array API operations can you do on one, but not on the other?

One we bumped into in SciPy is xp.unique_values (#834)

@asmeurer (Member) commented Sep 7, 2024

One thing I'm unclear on: what is the difference between materialized and non-materialized arrays in terms of the array API? What array API operations can you do on one, but not on the other?

As I mentioned above, __bool__, __int__, __float__, __complex__, and __index__ will not work on unmaterialized arrays, unless the library is willing to automatically materialize them.

Additionally, the APIs that have data-dependent output shapes are unique_all(), unique_counts(), unique_inverse(), unique_values(), nonzero(), and repeat() when the repeats argument is an array, as well as boolean array indexing. But a lazy library would not necessarily disallow these: the shape of the result cannot be computed statically when one of these operations is used, but the array API allows None in shapes, which is what Dask uses, for instance. The issue in #834 is not so much from using unique_values but from trying to call int on the resulting None dimension.
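
To make the Dask case concrete (behaviour of current dask.array, to the best of my understanding):

import numpy as np
import dask.array as da

x = da.from_array(np.array([3, 1, 3, 2, 1]), chunks=2)
u = da.unique(x)           # stays lazy; u.shape is (nan,) because it is data-dependent
# int(u.shape[0]) raises at this point -- that is essentially the #834 failure.
u.compute_chunk_sizes()    # computes just enough to resolve the unknown sizes
n = int(u.shape[0])        # now 3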

@jakevdp commented Sep 9, 2024

OK, thanks for the clarification. In that case, materialize is unimplementable in JAX, because materializing an array involves exiting a transformation context, and that can't be done by anything except exiting the context of the transformed function.

@rgommers (Member) commented Sep 18, 2024

I'd like to add a different perspective, based on execution models. I think we have fundamentally three kinds:

  1. Eager execution model
  2. A fully lazy execution (or graph export) model
  3. A hybrid lazy/eager execution model

(1) Eager execution model

Examples of implementations:

  • NumPy
  • CuPy
  • PyTorch (eager mode)
  • JAX (eager mode)
  • dpctl (it's async eager with SYCL queueing primitives being available to the end user IIUC)

Any "execute or materialize now" API would be a no-op.

(2) Fully lazy execution model

Examples of implementations:

  • JAX (JIT mode, e.g. jax.jit and other transforms)
  • PyTorch (model export or graph modes, e.g. with torch.export and AOTInductor)
  • ndonnx

Any "execute or materialize now" API would need to raise an exception.

(3) Hybrid lazy/eager execution model

Examples of implementations:

  • PyTorch (in modes where it can "graph break", e.g. torch.compile)
  • Dask
  • Xarray
  • Cubed
  • MLX

This is the only mode where an "execute or materialize now" API may be needed. This is not a given though, which is clear from PyTorch not having any such .compute() or .materialize() API. Xarray and Cubed copy the Dask model, so I'll continue discussing only Dask and PyTorch.

As pointed out by @asmeurer above, there are only a few APIs that cannot be kept lazy (__int__, __bool__ & co., because the Python language semantics force evaluating and returning an actual int/bool etc., rather than a duck type). For everything else, like .values, things can be kept lazy in principle.

For PyTorch, the way things work in hybrid mode is that if actual values are needed, the computation is done automatically. No syntax is needed for this. And there doesn't seem to be much of a downside to this. EDIT: see https://pytorch.org/docs/stable/export.html#existing-frameworks for a short summary of various PyTorch execution models.

MLX is in the middle: it does have syntax to trigger evaluation (.eval()), but it auto-computes when needed (see https://ml-explore.github.io/mlx/build/html/usage/lazy_evaluation.html#when-to-evaluate). Its documentation says that .eval() may be useful to explicitly insert graph breaks to avoid the graph becoming too large, which may be costly.
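
For illustration, the MLX pattern looks roughly like this (per its lazy-evaluation docs):

import mlx.core as mx

a = mx.random.normal((1024, 1024))
b = a @ a.T + 1.0   # only builds the graph; nothing is computed yet
mx.eval(b)          # explicit evaluation, also a natural place for a graph break
print(b[0, 0])      # inspecting values would have triggered evaluation anyway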

For Dask, the choice is to require .compute() to trigger execution, and if that's not added by the user it raises an exception instead of auto-executing. I think the main rationale for that is that execution is expensive, so the exception is a reminder to the user to think about whether they can rewrite their code to stay fully lazy. Forcing the user to add .compute() is the user saying "I thought about it, and yes this is what I really want".

There is another important difference between PyTorch (and fully lazy libraries like JAX/ndonnx as well) vs. Dask I think:

  • PyTorch & co all have full context on what they're keeping lazy by turning it into a graph and what outputs are needed later on, and once something executes it stays in memory. If a model is too large, one will run out of memory.
  • Dask on the other hand has less information and may discard intermediate results. Discarding intermediate results is based on heuristics, and has the benefit of allowing scaling to larger data sets, at the cost of sometimes having to re-compute variables that were discarded. It offers more execution-related syntax like .persist() and chunking to allow the user to give more hints about what to keep in memory or discard.

My current assessment is:

  • A .materialize()/.compute() API is not a good idea for the Array API Standard, since:
    • It doesn't do anything useful for fully eager or fully lazy implementations,
    • It doesn't do anything useful for PyTorch and potentially other libraries with a hybrid mode with enough context either.
    • @jakevdp already assessed it as not implementable for JAX.
    • I would also much prefer not to see this in NumPy.
  • The Dask choice is historical and pretty unlikely to change, but I'd argue that the "trigger the user to think" rationale isn't a very good one. This could fairly easily be implemented as a diagnostic mode instead (a global setting to change the default behavior, or something like cython -a to highlight where execution is triggered); a sketch of one such mode follows this list.
  • The Dask model is also fairly ad-hoc and non-deterministic, and:
    • For situations where execution must happen (like for bool()), there is little value from forcing users to add .materialize() to the code.
    • For situations where things can be kept lazy in principle but the library doesn't currently support that, the use of .materialize() would be library-specific and making one library happy would unnecessarily break laziness for another library. This is obviously a bad thing.
  • There is also a pragmatic naming issue if we'd consider adding something to the standard: Dask has .compute(), MLX has .eval(), others have nothing (and outside of array/tensor land, Polars chose .collect()).
    • These methods are fine to have for any library, but in my opinion belong outside of the standard.
  • For the SciPy example in gh-834 there isn't much to gain here and it's not about unique_values - I commented on that in gh-834#comment.
  • I haven't put PyData Sparse in the above list, since it's very much in flux right now. @hameerabbasi I'd strongly consider using a JAX/PyTorch-like model though, where you have an explicit function like jax.jit/torch.compile which scopes what is being compiled because it applies to a function. I think that completely removes the need for any syntax in the standard. gh-748 doesn't actually explain what the problem is with that - let's discuss there if needed.
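
As a sketch of the diagnostic-mode idea mentioned above (not an existing Dask option; it uses the local-scheduler callback hook, so it would not cover the distributed scheduler, and run_user_code is a stand-in for the user's pipeline):

import warnings

from dask.callbacks import Callback

class WarnOnCompute(Callback):
    # Diagnostic sketch: warn whenever a graph actually executes, so a user can
    # see where laziness is being broken without changing the default behaviour.
    def _start(self, dsk):
        warnings.warn(f"dask is computing a graph with {len(dsk)} tasks", stacklevel=2)

with WarnOnCompute():
    run_user_code()  # hypothetical stand-in for the user's lazy pipeline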

Now we obviously do have an issue with Dask/Xarray/Cubed that we need to understand better and find a solution for. It's a hard puzzle. That seems to require more thought, and perhaps a higher-bandwidth conversation soon. The ad-hoc-ness is (as far as I understand it - I could well be missing something of course) going to remain a fundamental problem for any attempt at standardization. I'd be curious to hear from @TomNicholas or anyone else with more knowledge about Dask why something like a user opt-in to auto-trigger compute whenever possible isn't a good solution.

@lucascolley (Contributor, Author) commented:

I'd be curious to hear from @TomNicholas or anyone else with more knowledge about Dask why something like a user opt-in to auto-trigger compute whenever possible isn't a good solution.

@lithomas1 asked this in dask/dask#11356 and the response from @phofl was

I would rather not add any option like this and then have users stumble into it.

@fjetter said in dask/dask#11298 (comment)

in our experience these implicit compute patterns are strong anti patterns [1] that are surprising people and are actively harmful in many cases. We are preferring a non surprising and easy to use API and are willing to sacrifice compatibility for this if necessary.

[1] Think about an array of a couple TB in a remote storage that is being loaded entirely just to allow simple indexing as shown in this example. If we called compute_chunk_sizes ourselves, all of this would happen (and cost money) just because a user did arr[[0, 1]]

@rgommers (Member) commented:

Thanks for the pointers @lucascolley. So that seems to be a fairly conclusive "we have some experience and won't do that" - which is fair enough. A few thoughts on those discussions:

  • The answer seems to be motivated primarily by the behavior for distributed arrays, and throws lazy and distributed into the same bucket. Lazy and distributed execution are very different though, and the latter is significantly more complex and requires more primitives.
  • From the standard's perspective, we've tried to avoid including functions that are known to be problematic for distributed libraries (e.g., we have mean but not median for that reason - although we added a few super-common ones like sort), but there are certainly other things one needs that we haven't given any thought at all so far (e.g., chunking/sharding). So when we talk about lazy execution here, it's about single-node lazy execution.
  • Dask shines mostly for distributed use cases, so if the rationale is something like: "this is too much of a footgun for distributed usage, and we want to keep the single node and distributed APIs as similar as possible to make it easy to scale up, hence we don't want auto-compute" that is certainly a reasonable position I think.
  • Not having any auto-compute is not a problem from the standard's perspective. Right now Dask still does have some auto-compute, but if that were all removed it would put Dask more or less in the same position as jax.jit and ndonnx for how well it could run standard-compliant code without any modifications.
    • And then to move it back to a hybrid mode, a user or library author could choose to special-case Dask if they really wanted that, with something like:
def compute(x):
    if is_dask_array(x):
        # .compute() returns the result; it does not modify x in place
        x = x.compute()
    return x

def some_func(x):
    if compute(x).shape[0] > 5:
        # we couldn't avoid the `if` conditional in this logic
        ...

@lucascolley (Contributor, Author) commented:

Thanks Ralf, that makes sense. I'm pretty convinced that we don't want to add materialize now, so I'll close this issue.

As @asmeurer mentioned previously, we still need to decide in SciPy whether we are comfortable with doing int(np.asarray(xp.unique_values(...))) (perhaps with DLPack in the future) to force this materialisation, or whether we should let it error. But we can cross that bridge when the Dask PR is ready, which won't be before a new Dask release and an array-api-compat release with 2023.12 support.

@lucascolley closed this as not planned on Sep 18, 2024
@rgommers (Member) commented:

Thanks Lucas. I'll reopen this for now to signal we're not done with this discussion. I've given my input, but at least @hameerabbasi and @TomNicholas seem to have needs that perhaps aren't met yet. We may also want to improve the documentation around this topic.

@rgommers reopened this on Sep 18, 2024
@asmeurer (Member) commented:

in our experience these implicit compute patterns are strong anti patterns [1] that are surprising people and are actively harmful in many cases. We are preferring a non surprising and easy to use API and are willing to sacrifice compatibility for this if necessary.

This aligns with the point I was trying to make above (#839 (comment)), which is that a library like scipy calling compute() would effectively be automatic compute from the point of view of the end-user. I think of the array API and especially libraries like scipy as "extending" an array library's array toolbox. So to an end-user, calling dask.array.mean(x) is no different from calling scipy.stats.hmean(x). One just happens to come from a different library. But semantically one implicitly calling compute() would have the same pitfalls as the other.

So I think that if scipy encounters this situation in one of its functions, it should either do nothing, i.e., require the user to materialize the array themselves before calling the function, or register the function itself as a lazy function (but how that would work would be array library dependent).

@lucascolley (Contributor, Author) commented:

I think it should be possible to use the introspection API to add different modes, where we raise errors by default but a user can opt-in to allowing us to force computation. The same can be said for device transfers via DLPack.
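
A sketch of what that could look like on the consumer side, using the standard's introspection API (the allow_materialize flag and the SciPy-side plumbing are hypothetical):

def _supports_data_dependent_shapes(xp) -> bool:
    # Introspection API from the 2023.12 standard; assume True if it is absent.
    try:
        caps = xp.__array_namespace_info__().capabilities()
    except AttributeError:
        return True
    return bool(caps.get("data-dependent shapes", True))

def some_scipy_func(x, xp, *, allow_materialize=False):
    if not _supports_data_dependent_shapes(xp) and not allow_materialize:
        raise TypeError(
            "this function needs concrete values; pass allow_materialize=True "
            "to opt in to forced computation (and possibly a device transfer)"
        )
    ...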

@hameerabbasi (Contributor) commented Dec 3, 2024

It seems to me that the only library that cannot implement this is JAX, and there's a fairly straightforward workaround for that: Instead of a method, make this a decorator. JAX already has a decorator, so does PyTorch, and Dask could easily add one. Something like:

import functools

import dask

def materialize(f):
    @functools.wraps(f)
    def wrapped(*a, **kw):
        # dask.compute always returns a tuple; unwrap the single result
        (result,) = dask.compute(f(*a, **kw))
        return result
    return wrapped

So the problem becomes one of aliasing.

@fjetter said in dask/dask#11298 (comment)

in our experience these implicit compute patterns are strong anti patterns [1] that are surprising people and are actively harmful in many cases. We are preferring a non surprising and easy to use API and are willing to sacrifice compatibility for this if necessary.
[1] Think about an array of a couple TB in a remote storage that is being loaded entirely just to allow simple indexing as shown in this example. If we called compute_chunk_sizes ourselves, all of this would happen (and cost money) just because a user did arr[[0, 1]]

I'd agree strongly with this. Maybe for dense arrays it is possible to find an optimal solution; but for sparse arrays the scheduling problem becomes intractable if the graph gets too large as one needs to actually look at the positions of the stored elements. (please correct me if I'm wrong, @willow-ahrens or @kylebd99).

In addition, one very obvious case where I can see scientific libraries needing it (rather than end users who want to, e.g., plot something) is iterative algorithms. In this case, it'd be highly recommended to materialize at least at the end of every iteration of the algorithm, and detecting that one is running an iterative algorithm from a graph alone could be a significant lift, or the "graph break" may occur in sub-optimal places.

@materialize
def one_iteration(*some_state):
    ...

@materialize
def terminating_condition(*some_state) -> bool:
    ...

for i in range(iterations):
    if terminating_condition(*some_state):
        break
    some_state = one_iteration(*some_state)

@rgommers (Member) commented Dec 5, 2024

for sparse arrays the scheduling problem becomes intractable if the graph gets too large as one needs to actually look at the positions of the stored elements.

I'll note that this seems again very implementation-specific. Manually breaking up a graph by inserting materialize calls may make things better for PyData Sparse, and worse for other libraries. The better solution seems to be some heuristic in the scheduler - break it up automatically if the graph becomes too large.

In addition, one very obvious case I can see scientific libraries needing it (rather than end-users who wanted to, e.g. plot something), is iterative algorithms.

The if terminating_condition(*some_state) calls __bool__ and there is no other choice for any library aside from materializing or raising. This is true for any if xxx condition. Are you really going to not support things like if x[0] > 1: in sparse, but force the user to always write if materialize(x[0]) > 1:?

@hameerabbasi (Contributor) commented:

break it up automatically if the graph becomes too large

Right; that may be sub-optimal in many cases, as there's a big-O difference depending on where one breaks it up. The big-O difference is more severe for sparse arrays but also exists for dense arrays.

Are you really going to not support things like if x[0] > 1: in sparse, but force the user to always write if materialize(x[0]) > 1:?

To be clear -- calls to __int__, __bool__, __float__ and friends won't require explicit materialization under my proposal; and I sense that this is where the difference in opinion comes from.

Therefore I'm not proposing a call to materialize be mandatory in such a case. It is true that in this example, terminating_condition doesn't need to be decorated for this reason, but the point stands -- graph breaks inserted between iterations will very likely be sub-optimal; and detecting the end of an iteration will be challenging without some kind of hint.

Another example I can think of is incremental materialization:

# `x` is a 1-D integer array
l = []

# Only way I can think of to materialize the whole thing right now
for i in range(x.shape[0]):
    l.append(int(x[i]))

Different algorithms may be selected depending on how much of the array needs to be materialized; and it is hard to know before the code runs how much will be materialized; or in what pattern. Such a pattern may be sub-optimal unless one does group materialization of some sort.

As to the practical use of such a pattern: I can think of plotting libraries, which might require e.g. a materialized dense array of some sort.

Another use-case: __dlpack__, which will simply raise without something that's materialized; or otherwise incur an implicit materialization.

I'll note that this seems again very implementation-specific.

It's true that where graph breaks are optimal will depend on the exact compilation pipeline, and this may seem like a case of trying to "beat the compiler", which most humans haven't historically been excellent at. But I'd consider some hints to be better than none in a field as intractable as this one, aside from library needs for arrays that exist in memory.

@rgommers (Member) commented Dec 5, 2024

graph breaks inserted between iterations will very likely be sub-optimal; and detecting the end of an iteration will be challenging without some kind of hint.

I'm not sure what you're trying to say here with "between iterations". The end of an iteration isn't challenging to detect: iterations are invariably a loop conditioned on a bool_expr (an if bool_expr: ... body, or if bool_expr: continue/break), and given that you will support execution on __bool__, this will always insert a graph break where you want it.

Another use-case: __dlpack__, which will simply raise without something that's materialized; or otherwise incur an implicit materialization.

This is one of the very few other cases similar to __bool__: you have no choice here anyway. The data must be in memory, so your only choices are to not support from_dlpack or to trigger execution. from_dlpack(materialize(x)) is wholly redundant.

As to the practical use of such a pattern: I can think of plotting libraries, which might require e.g. a materialized dense array of some sort.

They'll need from_dlpack (or asarray), nothing more. They're going to work with a single dense library for plotting purposes. So it's the same answer as for __dlpack__.

It's true that where graph breaks are optimal will depend on the exact ... I'd consider some hints to be better than none

Adding implementation-specific heuristics cannot be the right design approach for code written against a standard. It's just going to be library-specific.

Basically there are two places where one must materialize:

  1. Where the Python language prescribes it (__bool__ et al)
  2. When you're crossing library boundaries through a generic in-memory function (from_dlpack, asarray, buffer protocol)

Everything else seems to be a case of "it may be kept lazy, or it may benefit from library-specific hints/directives".
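
Spelling those two cases out (x is a stand-in here; imagine a lazy standard-compliant array, with NumPy as the consuming in-memory library):

import numpy as np

x = np.arange(-2, 4)        # stand-in array

# 1. Python semantics force a concrete value: the if statement invokes __bool__.
if bool((x > 0).any()):
    print("has positive entries")

# 2. Crossing a library boundary through a generic in-memory protocol.
x_np = np.from_dlpack(x)    # the producer must have (or produce) the data in memory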

@hameerabbasi (Contributor) commented:

this will always insert a graph break where you want it

This may materialize the bool resulting from those arrays, but not the arrays themselves; perhaps parts of the arrays may be optimised into another form such that they cannot be back-filled by the compilation process.

2. When you're crossing library boundaries through a generic in-memory function (from_dlpack, asarray, buffer protocol)

In this case, maybe xp.from_dlpack(x) is sufficient; in addition to documentation that __dlpack__ must materialize when needed and possible? That may raise performance concerns about it not being O(1), but those aren't necessarily concerns for me.

@rgommers (Member) commented Dec 5, 2024

This may materialize the bool resulting from those arrays, but not the arrays themselves; perhaps parts of the arrays may be optimised into another form such that they cannot be back-filled by the compilation process.

Ah okay, that's the same concern as for Dask. My understanding is that it may indeed be suboptimal and that there is no way to make it more optimal that is general across libraries - or even different versions of the same library, since your scheduler may become smarter over time.

I think the right answer here from an API standard design perspective is to live with that scenario being potentially suboptimal, and if there are severe cases that turn up in practice, the code author can reach for the library-specific primitive (e.g., sparse.materialize, dask.persist) to improve performance.

@willow-ahrens commented:

Is there perhaps a compile-time/runtime reason to want materialization? i.e. the user would like to avoid an expensive computation later in e.g. an interactive section of the program?

Manually breaking up a graph by inserting materialize calls may make things better for PyData Sparse, and worse for other libraries. The better solution seems to be some heuristic in the scheduler - break it up automatically if the graph becomes too large.

Hameer is correct: the PyData Sparse JIT runs long on long inputs. That's not to say we can't make improvements to make the system scale, or heuristically break things up into smaller chunks, but users can help the compiler out a lot by giving good hints.

@rgommers (Member) commented:

Is there perhaps a compile-time/runtime reason to want materialization? i.e. the user would like to avoid an expensive computation later in e.g. an interactive section of the program?

There may be perhaps, but if such cases arise it's again a matter of that being array library specific. Something that may be helpful or even a hard necessity for usage with one library may be detrimental for usage with another library.

PyData Sparse should add a materialize-type function if that makes sense for it (I'd pick a different name, matching the behavior of some other array library - compile, eval, collect, compute, and persist are all already floating around).
Other than that, this seems pretty clearly not actionable at the standard level. I'll try to open a PR with a proposed documentation improvement to address some of the discussion here.
