BLAS nrm1 (aka asum) returns a different value than TPLs for complex input #914

Closed
brian-kelley opened this issue Mar 18, 2021 · 1 comment
Contributor

brian-kelley commented Mar 18, 2021

Our KokkosBlas::nrm1 (and wrapper KokkosBlas::asum) return the sum of magnitudes of the elements:

sum(sqrt(real_i * real_i + imag_i * imag_i)) for i = 1...n

but BLAS TPLs (MKL, CUBLAS, netlib) return the sum of the absolute values of the real and imaginary parts:

sum(abs(real_i) + abs(imag_i)) for i = 1...n

There is no difference between the two for real inputs (because imag_i is 0 for all i). I will fix this in our implementation.
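
To make the difference concrete, here is a minimal standalone sketch of the two definitions using `std::complex`. It is not the KokkosBlas implementation; the function names `sum_of_magnitudes` and `blas_style_asum` are hypothetical and chosen only for illustration.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical helper: sum of magnitudes,
// sum_i sqrt(real_i * real_i + imag_i * imag_i).
double sum_of_magnitudes(const std::vector<std::complex<double>>& x) {
  double sum = 0.0;
  for (const auto& v : x) {
    sum += std::abs(v);  // std::abs on a complex value returns its magnitude
  }
  return sum;
}

// Hypothetical helper: BLAS-style asum for complex input,
// sum_i (abs(real_i) + abs(imag_i)).
double blas_style_asum(const std::vector<std::complex<double>>& x) {
  double sum = 0.0;
  for (const auto& v : x) {
    sum += std::abs(v.real()) + std::abs(v.imag());
  }
  return sum;
}
```

For the input {3 + 4i, -1 + 0i}, the sum of magnitudes is 5 + 1 = 6, while the BLAS-style result is (3 + 4) + (1 + 0) = 8. For purely real input the two functions agree.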

@brian-kelley brian-kelley self-assigned this Mar 18, 2021
brian-kelley added a commit to brian-kelley/kokkos-kernels that referenced this issue Mar 19, 2021
- Made nrm1 compute the sum of all absolute real and imaginary parts
  to match BLAS/MKL/CUBLAS behavior, rather than sum of magnitudes.
- Improved unit test coverage
  - verify each output element, not just dotprod of output with itself
  - for complex, create randomized inputs with nonzero imaginary parts
  - enable conj-trans mode testing for gemv
brian-kelley added a commit that referenced this issue Mar 22, 2021
Fixed nrm1 (#914), removed cublas nrminf, improved blas tests
@brian-kelley
Contributor Author

Fixed with #915.

lucbv pushed a commit to lucbv/kokkos-kernels that referenced this issue May 10, 2021
- Made nrm1 compute the sum of all absolute real and imaginary parts
  to match BLAS/MKL/CUBLAS behavior, rather than sum of magnitudes.
- Improved unit test coverage
  - verify each output element, not just dotprod of output with itself
  - for complex, create randomized inputs with nonzero imaginary parts
  - enable conj-trans mode testing for gemv