Unroll sum and / or dot manually so that it autovectorizes #280

@bluss

Edit: Oops! This library doesn't have arbitrary-length vectors; only nalgebra has those. But they are not as responsive.

See this pattern for how to write a float reduce loop so that it can autovectorize:

Implementation in ndarray: unroll_sum

Same pattern for dot product: user forum post

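It's not super pretty, and it hardcodes the number of lanes used, but it's still much faster for floats than the naive sum. LLVM cannot vectorize the naive loop because float addition is not associative, so the compiler is not allowed to reorder the additions on its own. (rust-lang/rust/issues/21690 would be needed).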
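
For reference, here is a minimal sketch of the pattern, not the actual ndarray code. The names `unrolled_sum`, `unrolled_dot` and `LANES`, the choice of 8 lanes, and the use of `f32` are all assumptions made for illustration. The idea is to keep several independent accumulators so that each lane's additions happen in a fixed order that the compiler can map onto SIMD registers, then combine the lanes and handle the leftover tail at the end.

```rust
/// Number of independent accumulators (hypothetical choice; it is hardcoded,
/// which is the drawback mentioned above).
const LANES: usize = 8;

/// Sum a slice using LANES independent partial sums.
pub fn unrolled_sum(xs: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];
    let chunks = xs.chunks_exact(LANES);
    // Elements that do not fill a whole chunk are handled separately.
    let tail = chunks.remainder();
    for chunk in chunks {
        // Each lane only ever adds into its own accumulator, so no
        // reordering of float additions is required to vectorize this.
        for i in 0..LANES {
            acc[i] += chunk[i];
        }
    }
    // Combine the lanes, then add the tail elements one by one.
    let mut sum: f32 = acc.iter().sum();
    for &x in tail {
        sum += x;
    }
    sum
}

/// Dot product with the same pattern. If the slices differ in length,
/// this sketch only uses the common prefix.
pub fn unrolled_dot(xs: &[f32], ys: &[f32]) -> f32 {
    let len = xs.len().min(ys.len());
    let (xs, ys) = (&xs[..len], &ys[..len]);
    let mut acc = [0.0f32; LANES];
    let xc = xs.chunks_exact(LANES);
    let yc = ys.chunks_exact(LANES);
    let (xt, yt) = (xc.remainder(), yc.remainder());
    for (x, y) in xc.zip(yc) {
        for i in 0..LANES {
            acc[i] += x[i] * y[i];
        }
    }
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in xt.iter().zip(yt) {
        sum += x * y;
    }
    sum
}
```

Note that the result can differ slightly from a plain left-to-right sum, because the additions are grouped differently. Whether the loop actually vectorizes depends on the target and optimization level, so it is worth checking the generated assembly or benchmarking.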
