perf: add Vector::Add, Sub and ScalarMul assembly (and purego) implementations #536

Merged
merged 9 commits into from
Sep 12, 2024

Conversation

gbotrel
Collaborator

@gbotrel gbotrel commented Sep 9, 2024

Description

Adds

// Add adds two vectors element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) Add(a, b Vector)

// Sub subtracts two vectors element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) Sub(a, b Vector)

// ScalarMul multiplies a vector by a scalar element-wise and stores the result in self.
// It panics if the vectors don't have the same length.
func (vector *Vector) ScalarMul(a Vector, b *Element)
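For illustration only, here is a minimal pure-Go sketch of what the three methods do semantically. It is not the code generated in this PR: `Element` is a hypothetical single-word stand-in reduced modulo a toy prime `q`, whereas the real `Element` is a multi-word Montgomery-form value. The length checks mirror the documented panic behavior, under the assumption that the receiver must also match the operand length.

```go
package main

import "fmt"

// Element is a hypothetical stand-in for a field element (assumption:
// a single word reduced modulo q, not the multi-limb type used in the PR).
type Element uint64

// q is a toy modulus chosen for readability only.
const q = 97

// Vector is a slice of field elements.
type Vector []Element

// Add sets vector[i] = a[i] + b[i] mod q. Inputs are assumed already reduced.
func (vector *Vector) Add(a, b Vector) {
	if len(a) != len(b) || len(a) != len(*vector) {
		panic("vector.Add: vectors don't have the same length")
	}
	for i := range a {
		(*vector)[i] = (a[i] + b[i]) % q
	}
}

// Sub sets vector[i] = a[i] - b[i] mod q, adding q first to avoid underflow.
func (vector *Vector) Sub(a, b Vector) {
	if len(a) != len(b) || len(a) != len(*vector) {
		panic("vector.Sub: vectors don't have the same length")
	}
	for i := range a {
		(*vector)[i] = (a[i] + q - b[i]) % q
	}
}

// ScalarMul sets vector[i] = a[i] * b mod q.
func (vector *Vector) ScalarMul(a Vector, b *Element) {
	if len(a) != len(*vector) {
		panic("vector.ScalarMul: vectors don't have the same length")
	}
	for i := range a {
		(*vector)[i] = (a[i] * *b) % q
	}
}

func main() {
	a := Vector{1, 2, 96}
	b := Vector{5, 96, 96}
	res := make(Vector, len(a))

	res.Add(a, b)
	fmt.Println(res) // [6 1 95]

	res.Sub(a, b)
	fmt.Println(res) // [93 3 0]

	s := Element(3)
	res.ScalarMul(a, &s)
	fmt.Println(res) // [3 6 94]
}
```

The result is written into a preallocated receiver, so a caller can reuse one buffer across many operations instead of allocating per call.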

Assembly (amd64 target only) is generated for modulus < 256 bits.

Benchmarks on r7i.xlarge, x86 assembly vs. pure Go path (performance gains are less interesting on some AMD chips, such as hpc6a instances):

benchmark                              old ns/op     new ns/op     delta
BenchmarkElementVecOps/Add-4           2779          1762          -36.60%
BenchmarkElementVecOps/Add-4           2787          1768          -36.56%
BenchmarkElementVecOps/Sub-4           2767          1706          -38.34%
BenchmarkElementVecOps/Sub-4           2779          1715          -38.29%
BenchmarkElementVecOps/ScalarMul-4     18014         13515         -24.98%
BenchmarkElementVecOps/ScalarMul-4     18030         13533         -24.94%

This is a good starting point for comparing against something using AVX like this.

Collaborator

@AlexandreBelling AlexandreBelling left a comment


LGTM, but do you have the branch on top of v0.10.1-0.20240904184047-9db0eff0e5d3?

@gbotrel gbotrel merged commit df40d22 into master Sep 12, 2024
6 checks passed
@gbotrel gbotrel deleted the experiment/vecops branch September 12, 2024 15:10