I have read the code in axpy_sse.S around line 801.
To use the aligned-access instruction movaps, OpenBLAS loads four single-precision values into a 128-bit register at line 801, which may read past the end of the buffer. However, I don't think it uses the out-of-bounds values.
That may be fine for axpy, but dot uses horizontal operations, so I'm not so sure the out-of-bounds values go unused (e.g. via NaN * 0 = NaN).
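To make the NaN concern concrete, here is a minimal C sketch. It is not the OpenBLAS kernel: `dot_padded` is a hypothetical scalar stand-in for a 4-lane SSE dot product that always processes whole groups of four and finishes with a horizontal add. It assumes the caller has padded both buffers to a multiple of four elements. The point is only that multiplying a stray lane by zero does not neutralise it if the stray value happens to be NaN.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for a 4-lane vectorised dot product.
 * padded_n must be a multiple of 4; lanes beyond the logical
 * length are still multiplied and accumulated. */
float dot_padded(const float *x, const float *y, size_t padded_n)
{
    assert(padded_n % 4 == 0);

    float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    for (size_t i = 0; i < padded_n; i += 4) {
        for (size_t k = 0; k < 4; ++k) {
            /* One "SIMD" multiply-add per lane. */
            lanes[k] += x[i + k] * y[i + k];
        }
    }

    /* Horizontal add: every lane contributes to the result,
     * so a NaN in any lane poisons the whole sum. */
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
}
```

With x = {1, 2, 3, 5} and y = {1, 1, 1, 0} the fourth lane contributes 5 * 0 = 0, so the stray value is harmless. If the fourth element of x is NaN instead, the lane holds NaN * 0 = NaN and the horizontal add returns NaN. Whether the real kernel masks or discards such lanes before the reduction would need checking against dot_sse.S itself.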
It is quite likely that this is the cause of gh-189.
I am getting a lot of Valgrind errors with OpenBLAS right now. They appear to relate to reading past the end of allocated memory, e.g.:
==18705== Invalid read of size 8
==18705== at 0x52754C4: saxpy_k (axpy_sse.S:801)
This is reading past the end of a block whose size is not a multiple of 8, e.g.:
==18705== Address 0x7768a48 is 5,096 bytes inside a block of size 5,100 alloc'd
Is it expected that I should allocate memory in multiples of 8 or 16 bytes? I do already align it to 16 byte boundaries.
The same kind of error also appears in:
==18705== at 0x5276FC1: sdot_k (dot_sse.S:735)
==18705== at 0x5275FE8: scopy_k (copy_sse.S:411)
I am on x86_64 architecture.
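For reference, this is a sketch of what allocating in multiples of 16 bytes would look like on my side. It is only a possible workaround to keep the reads inside the allocation, not something OpenBLAS is known to require. `round_up` and `alloc_padded_floats` are hypothetical helper names, and the code assumes a C11 library that provides `aligned_alloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round size up to the next multiple of align (a power of two). */
size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Allocate room for n floats, 16-byte aligned, with the byte size
 * padded to a multiple of 16 so that a full 128-bit load of the
 * last partial group stays inside the block. The padding is zeroed
 * so that any stray lane holds 0.0f rather than garbage. */
float *alloc_padded_floats(size_t n)
{
    size_t bytes = round_up(n * sizeof(float), 16);
    if (bytes == 0) {
        bytes = 16; /* avoid a zero-size request */
    }

    /* C11 aligned_alloc: the size must be a multiple of the alignment. */
    float *p = aligned_alloc(16, bytes);
    if (p != NULL) {
        memset(p, 0, bytes);
    }
    return p;
}
```

For the block in the report, 1275 floats occupy 5,100 bytes and `round_up(5100, 16)` gives 5,104, so a 16-byte load starting at offset 5,088 would end exactly at the end of the allocation. The returned pointer is released with `free` as usual.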