Description
Idea
The adjacency matrix aggregation currently has a default limit of 100 filters (index.max_adjacency_matrix_filters). This is a sensible value considering the quadratic memory complexity of the operation, but for some use cases it is quite restrictive. It is of course possible to raise the setting, but the resulting memory pressure quickly becomes infeasible.
I talked with Mark Harwood and we came to the conclusion that this limit could probably be increased if sparse data structures were used for storing intermediate results in cases of large matrices (> 100 filters), because the memory usage wouldn't grow as fast as it would with dense data structures. This would probably result in a space/time trade-off, but allow for new use cases.
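To make the trade-off concrete, here is a minimal sketch (class and method names are illustrative, not the actual Elasticsearch implementation) contrasting a dense pair-count layout, whose memory grows quadratically with the filter count, with a sparse one that only pays for pairs that actually co-occur:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: illustrates dense vs. sparse storage of intermediate
// pair counts; not the real aggregation code.
class PairCounts {

    // Dense layout: one counter for every filter pair (upper triangle
    // including the diagonal), i.e. n * (n + 1) / 2 longs. Memory is
    // allocated up front even if most pairs never co-occur.
    static long[] dense(int numFilters) {
        return new long[numFilters * (numFilters + 1) / 2];
    }

    // Sparse layout: only observed pairs consume memory, at the cost
    // of a hash lookup per increment -- the space/time trade-off.
    static final Map<Long, Long> sparse = new HashMap<>();

    // Encode an ordered pair (i <= j) as a single map key.
    static long key(int i, int j) {
        return ((long) i << 32) | (j & 0xffffffffL);
    }

    static void increment(int i, int j) {
        sparse.merge(key(i, j), 1L, Long::sum);
    }
}
```

With 100 filters the dense layout already holds 5050 counters per bucket; with 1000 filters it would be 500500, whereas the sparse map stays proportional to the number of pairs actually seen.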
API proposal
One option is to not expose this in the API at all and instead apply a heuristic to choose the implementation (e.g. <= 100 filters uses dense data structures, > 100 filters uses sparse ones). The advantage of this approach is that the user doesn't have to make a decision, but it could result in unexpected performance cliffs.
An alternative to that is to add a separate parameter to the request object (e.g. useSparseMatrix) to leave the decision to the user. In that case there should be a separate setting, e.g. index.max_sparse_adjacency_matrix_filters.
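For illustration, the opt-in variant might look like this in the request body (the field name `use_sparse_matrix` is a hypothetical snake_case rendering of the proposed parameter, and the filters are placeholders):

```json
{
  "aggs": {
    "interactions": {
      "adjacency_matrix": {
        "filters": {
          "group_a": { "term": { "team": "a" } },
          "group_b": { "term": { "team": "b" } }
        },
        "use_sparse_matrix": true
      }
    }
  }
}
```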
A cleaner way would be to have a completely new aggregation, sparse_adjacency_matrix, to emphasize the distinction even more.
cc @markharwood