Test failure with 4.4.0-rc5 on arm #159
Comments
Full logs: https://kojipkgs.fedoraproject.org//work/tasks/6563/11926563/build.log
|
Thanks for the bug report. I'm curious how I'll debug this; I've got a raspberry pi in the office I can try on if physical hardware is required. Or perhaps simply cross-compiling for 32 bit arm. I'll see what I can sort out. It will be about a week before I can address it in detail as I'll be out of the office next week traveling; also, I need to split my time preparing for AGU. Thanks again for the bug report! |
I have access to a machine if you have any pointers for debugging this. |
So with test_put.c:3604:
The problem appears to be that allInExtRange is 0 but err = 0 rather than NC_ERANGE? Adding some more debug output:
|
And a little more for those failures:
So inRange3() is failing? |
Okay, I think I found the underlying cause here - http://blog.cdleary.com/2012/11/arm-chars-are-unsigned-by-default/ |
So I think this is because the test incorrectly assumes you can convert schar to char, which doesn't work if char == unsigned char. |
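As a rough illustration of the diagnosis above: whether plain char is signed is implementation-defined, and it can be read off CHAR_MIN from limits.h. The helper names below are made up for this sketch and are not part of the netCDF test suite.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this platform, 0 if it is unsigned
 * (as it is by default with gcc on ARM Linux). */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: round-trips a signed char through plain char and
 * reports whether the numeric value survived. Negative values are lost
 * when plain char is unsigned. */
static int schar_survives_plain_char(signed char value)
{
    char as_plain = (char)value;
    return (int)as_plain == (int)value;
}
```

On a platform with unsigned char, schar_survives_plain_char(-128) would be expected to return 0, which matches the kind of range mismatch the test reports.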
Sorry for the silence this week, I've been getting caught up on the week off and also preparing for AGU. That's a good catch, thanks for your help with debugging and sorting out the issue! I've got ARM hardware at my disposal (in the form of a Raspberry PI), I just need to dust it off and get it up and running so that I can implement a fix for this. Thanks again for the reference and your investigation; it wouldn't have been the first thing I considered when trying to diagnose this :). |
Ok, I got the raspberry pi up and running, got dependencies installed so that I could run the tests and fix this issue and... I'm unable to duplicate the problem, naturally. I'm working on a raspberry pi 2 model B, which has a 32-bit 900MHz quad-core ARM Cortex-A7 CPU. I'm working in the latest
I have an older raspberry pi (using an ARM A6, I believe?) that I will try for the sake of thoroughness, but it should probably wait a bit until I at least get a draft of my AGU talk finished.
@opoplawski can you provide some information regarding the environment where you're seeing this issue? I don't doubt your diagnosis, I just don't want to try to fix it without being able to check my work. |
I withdraw my previous comment, but I didn't want to silently delete it. I have duplicated the issue. |
great. Just for the record my machine is an ARMv7 Calxeda Highbank processor (armv7l) running Fedora 23 with gcc 5.1.1 20150618 (Red Hat 5.1.1-4) |
Ok, I have a fix in place for cmake-based builds; we check during configuration time whether char is signed or unsigned. I just need to wire a similar check into the autoconf build and I'll merge the changes into master, hopefully early next week. |
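One way code could consume such a configuration-time result is sketched below. This is only an assumption about the general shape, not the actual netCDF change: the macro name HYPOTHETICAL_CHAR_IS_UNSIGNED and the helper are invented for illustration, and the fallback derives the answer from limits.h when the build system has not defined the macro.

```c
#include <limits.h>

/* HYPOTHETICAL_CHAR_IS_UNSIGNED stands in for whatever macro the build
 * system would define; the name is an assumption made for this sketch. */
#ifndef HYPOTHETICAL_CHAR_IS_UNSIGNED
#  if CHAR_MIN == 0
#    define HYPOTHETICAL_CHAR_IS_UNSIGNED 1
#  else
#    define HYPOTHETICAL_CHAR_IS_UNSIGNED 0
#  endif
#endif

/* Hypothetical helper: interprets a plain char as a signed byte value
 * in the range -128..127, whichever way plain char behaves. */
static int text_byte_as_signed(char c)
{
#if HYPOTHETICAL_CHAR_IS_UNSIGNED
    /* Map 128..255 down to -128..-1 without relying on a narrowing cast. */
    return (c > SCHAR_MAX) ? (int)c - (UCHAR_MAX + 1) : (int)c;
#else
    return (int)c;
#endif
}
```

For autoconf-based builds, the stock AC_C_CHAR_UNSIGNED macro performs a comparable check and defines __CHAR_UNSIGNED__ when plain char is unsigned.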
Ok, the fix I made was not comprehensive so, as always, work continues. |
This issue is proving more troublesome than expected; there is an element of it which is related to chars being unsigned by default, however other issues arise when this is fixed. You can see this for yourself by setting
In a nutshell, the cdf5 test file created by
So, debugging this issue continues but may be put on the back burner for a little while to catch up on other issues that are accruing. |
There were a number of issues in
We are still seeing failures (e.g. expected -128, got 0, etc) when running the tests on ARM. These failures are happening even when I force signed chars via |
The root of the issue has been identified, finally, and a fix has been tested. The issue was in
ncx_putn_uchar_double(void **xpp, size_t nelems, const double *tp)
{
    int status = ENOERR;
    uchar *xp = (uchar *) *xpp;
    while(nelems-- != 0)
    {
        if(*tp > X_UCHAR_MAX || *tp < 0)
            status = NC_ERANGE;
        *xp++ = (uchar)*tp++; //<- THIS IS THE PROBLEM.
    }
    *xpp = (void *)xp;
    return status;
}
The specific issue is that casting a negative
The fix for this is to explicitly cast the source data type to a signed type before casting it to uchar:
    *xp++ = (uchar)(signed)*tp++;
This results in the proper value being preserved.
TODO
To fix this issue, I need to clean up my working branch, remove stray files, clean up the tests I've added, etc. Once that is done and I'm convinced this works and has not introduced additional issues, I will merge it into master and close this issue out. |
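The conversion discussed above can be sketched in isolation as follows. This is a simplified stand-in, not the generated ncx.c code: the function name, SKETCH_UCHAR_MAX and SKETCH_ERANGE are placeholders for illustration. It assumes every input fits in an int once truncated, since a double outside that range would still be undefined when cast.

```c
#include <stddef.h>

typedef unsigned char uchar;

#define SKETCH_UCHAR_MAX 255    /* placeholder for X_UCHAR_MAX */
#define SKETCH_ERANGE    (-60)  /* placeholder for NC_ERANGE */

/* Converts nelems doubles to unsigned bytes. Out-of-range inputs set the
 * status to SKETCH_ERANGE but are still stored, as in the original loop. */
static int sketch_putn_uchar_double(uchar *xp, size_t nelems, const double *tp)
{
    int status = 0;
    while (nelems-- != 0) {
        if (*tp > SKETCH_UCHAR_MAX || *tp < 0)
            status = SKETCH_ERANGE;
        /* double -> int is well defined while the truncated value fits in
         * an int; int -> uchar then wraps modulo 256 on every platform.
         * A direct (uchar) cast of a value <= -1.0 is undefined, and
         * compilers for different CPUs may produce different bytes. */
        *xp++ = (uchar)(int)*tp++;
    }
    return status;
}
```

With the intermediate signed cast, an input of -128.0 is stored as the byte 128 (0x80) regardless of architecture, which is the bit pattern a signed reader interprets as -128.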
…ow.com/questions/10541200/is-the-behaviour-of-casting-a-negative-double-to-unsigned-int-defined-in-the-c-s for some background info that led me to the fix.
… overly broad; I will refine it once I've had a chance to read up on m4.
Nice catch! |
Thank you. In testing the fix I've uncovered a couple further issues that weren't cropping up before, but once again they seem tied to the |
After several more trips through the "Fix bug/run tests/identify new bug" cycle, it appears that everything is now working properly across various platforms. This may be premature as I still have to test Windows, but that aside, I'm happy with the fix. The issues turned out to be a combination of:
The fix is a mix of
The latter issue turned out to be much more difficult to track down, as many of the assignments where this happened weren't obvious; they were usually of the form
I'm going through and cleaning up a lot of the detritus from the debugging process, and then I will document the fix/issue fully, as well as providing information re: expected behavior, etc. Once this is finished I will finally address the stack of other issues that have opened up, starting with pull request @dmh recently opened and that should be pretty straightforward. I will close out this issue when the fix is merged into
@opoplawski : I only have access to the Raspberry Pi mentioned above; if this fix does not address the issue on your platform, please let me know. |
Closing issue; resolved in merge #180 unless I find out otherwise. |
I'm afraid we're back to having problems with 4.4.0 on arm. First off, it looks like libsrc/ncx.c in the 4.4.0 tarball was not generated with the latest ncx.m4 source, as it's missing the #ifdef arm conditionals around the ncx_*n_text() functions. After fixing that I get:
|
No need to capture my heavy sigh in the issue notes, so I'll simply go fix this. It will be in the upcoming 4.4.1, which will be a maintenance release in the short-term future for this and other small things that have cropped up. |
In the meantime you should be able to regenerate the ncx.c manually and have it work. |
Actually, I'm going to close this back out, as the fix is in the release process not the code itself. I've updated my (internal) release documentation to emphasize ensuring that the latest generated files are present. |
I'm seeing a test failure on the Fedora arm builder:
FAIL: nc_test
re-running now to try to get more information...