dtbmv in libopenblas_haswellp-r0.2.18.dylib on macOS Sierra (10.12.3) fails? #1089
Comments
valgrind on Linux KabyLake (Haswell kernel) says
but then suggests rerunning with a stack size larger than the 8388608 it had used.
Thank you for your rapid reply. How does stack size affect this problem?
I am not sure yet, I just wanted to put that observation here in case it gives an idea. Maybe it just serves to move "other things" out of the way so that whatever collateral damage gets done by dscal_kernel_8_zero() goes unnoticed. I see now that the same function is/was also implicated in #730 (though only perceived as a performance bottleneck). That issue thread is a somewhat frustrating read, but what you could do is
Ok, thank you for your reply. According to
Just for the record, replacing dscal_kernel_8_zero with its C equivalent from the proposed patch in #730 does not fix this. It still crashes on what appears to be the first write to x in this function. It also fails on "Nehalem" class hardware, but works when run with at most two threads (on a dual-core machine at least).
On the Nehalem system, the problem is reproducible with snapshots going back at least to version 0.2.9 (making it likely that this problem is inherited from libgoto2).
Why is a read-only argument being modified? What result should be returned when the purported input gets sprayed with partial results as the computation goes ahead?
@brada4 mind being less aggressive but more verbose? Where do you see input getting overwritten? Do you think the test case is wrong, or did you find an error in the implementation of dtbmv?
Just that A gets rewritten with the result X, so the input gets nibbled away as BLAS processes it.
I see nothing wrong with the example, and in fact introducing a separate array q for the X parameter of dtbmv does not change anything. (In any case, what you are suggesting would not lead to the segmentation fault that this issue is about.)
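For reference, the documented contract of dtbmv is x := A*x (or A^T*x): the band matrix A is only read, and the vector x is overwritten in place with the result. Below is a minimal pure-Python sketch of the upper-triangular, non-transposed, non-unit-diagonal case. It assumes the standard BLAS band layout translated to 0-based indices; the name tbmv_upper is hypothetical, and this is an illustration of the semantics, not the OpenBLAS kernel.

```python
def tbmv_upper(n, k, ab, x):
    """Compute x := A @ x in place for an upper triangular band matrix A.

    A is n x n with k superdiagonals, stored in band form so that
    ab[k + i - j][j] holds A[i][j] for max(0, j - k) <= i <= j.
    ab is only read; x is the in/out argument.
    """
    for i in range(n):
        acc = 0.0
        # Row i of an upper band matrix only touches columns i .. i + k.
        for j in range(i, min(n, i + k + 1)):
            acc += ab[k + i - j][j] * x[j]
        # Safe to overwrite now: later rows only read x[j] with j > i.
        x[i] = acc
    return x


# A = [[1, 2, 0],
#      [0, 3, 4],
#      [0, 0, 5]]  with k = 1 superdiagonal
ab = [[0.0, 2.0, 4.0],   # superdiagonal (first slot unused)
      [1.0, 3.0, 5.0]]   # main diagonal
x = [1.0, 1.0, 1.0]
tbmv_upper(3, 1, ab, x)
print(x)  # [3.0, 7.0, 5.0]
```

Processing rows in ascending order is what makes the in-place update valid for the upper-triangular case, since row i never needs an already overwritten entry of x.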
weird heap corruption every 5-6 runs:
and at other occurrences:
@hiro4bbh can you retest? It does not fail anymore for me (you just need to add +1 in one file and re-run the last 'make'; just make sure you build and run OpenBLAS on the same Mac).
@brada4 unfortunately, at least the behaviour under valgrind is unchanged with your patch.
I ran it a hundred times; it should have crashed a few times at least.
It seems the funky bitmask logic in the blas_quickdivide branch at the end of tbmv_thread.c is what is broken: given the chance, it will happily produce a range_n argument for the last batch that exceeds the actual n of the range it was tasked to divide among the threads. So for the original poster's test case I get matrix elements 1048569 to 2097168 when the last is actually 2097137, and even the built-in test cases have a maximum range_n of 80 where the data only goes to 63. A simple
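To illustrate the kind of overrun described above, here is a small sketch of splitting a range among threads with chunk widths rounded up by a bitmask. It is a simplified model under stated assumptions, not the code from tbmv_thread.c: the function name split_range, the even-share width formula, and the align parameter are all hypothetical. It only shows how rounding the last chunk up can push its end past n unless the width is clamped to what is left.

```python
def split_range(n, num_threads, align=16, clamp=True):
    """Split [0, n) into at most num_threads contiguous (start, end) chunks.

    Each chunk width is rounded up to a multiple of `align` (a power of
    two) with the bitmask trick (w + mask) & ~mask. Without the clamp,
    the rounding can make the last chunk end beyond n.
    """
    mask = align - 1
    ranges = []
    start = 0
    remaining_threads = num_threads
    while start < n and remaining_threads > 0:
        # Even share of what is left (ceiling division).
        width = -(-(n - start) // remaining_threads)
        # Round the width up to the next multiple of `align`.
        width = (width + mask) & ~mask
        if clamp:
            # Guard: never hand out more elements than actually remain.
            width = min(width, n - start)
        ranges.append((start, start + width))
        start += width
        remaining_threads -= 1
    return ranges


print(split_range(63, 4, clamp=False)[-1])  # (48, 64): ends past n = 63
print(split_range(63, 4, clamp=True)[-1])   # (48, 63)
```

With the clamp in place, the chunks still tile the whole range, because every width is at least the even share of the remaining elements.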
Mine did not try to be correct either; it just pads the structure so that it does not overflow...
I had to revert my patch for now as it introduces accuracy errors. I still believe the fundamental idea behind it is essentially correct, but I may have introduced a range limit too many or in the wrong place.
Just for the record, the corrected patch for this went in as #1262 in August, so this is believed to be long fixed on the develop branch.
dtbmv in libopenblas_haswellp-r0.2.18.dylib on macOS Sierra (10.12.3) may fail at large scale, as in the following code. I only have a MacBook running macOS Sierra, so I have no memory debugger like valgrind. However, I quote the console output from running with lldb. This library was installed with brew install openblas --build-from-source.