Add access() calls to ensure fopen(...,"r") doesn't segfault #4335
Can one of the admins verify this patch? |
One of the invariants of the fopen() function called with the "r" option is that the file exists before the function is called. If this is not the case, behavior is undefined (in the case of glibc on some Crays and Red Hat machines, this means you get a segfault in fileno_unlocked()). This commit is a first attempt at ensuring this invariant holds when calls to fopen() are made.
Signed-off-by: Noah Evans <nevans@sandia.gov>
This code needs some pretty thorough review, both for style and to make sure I haven't forgotten anything. Also the calls to |
It's also worth noting that I've left hwloc alone. How is hwloc merged from its individual repo to its main tree? |
We generally take hwloc release tarballs. However, at the moment, we have an early copy of HWLOC v2.0 from their nightly tarball. Your best bet would be to fork the hwloc repo and generate a pull request against their master. |
BTW: are you sure about that existence requirement for fopen? I've been scanning the man pages, and I don't see that - indeed, there is a clear statement that the call will fail with ENOENT set in errno. Where are you seeing this requirement? |
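The documented behavior referred to here can be checked with a short C sketch. The helper name `open_for_read_errno` and the path in the usage note are hypothetical, chosen only for illustration. On a conforming libc, opening a missing file with "r" is expected to return NULL and set errno to ENOENT rather than crash.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper (not part of Open MPI): try to open `path` for
 * reading and report the errno value seen on failure, or 0 on success. */
static int open_for_read_errno(const char *path)
{
    errno = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* fopen() failed; errno says why (ENOENT for a missing file). */
        return errno;
    }
    fclose(fp);
    return 0;
}
```

Calling `open_for_read_errno("/no-such-dir-example/mca-params.conf")` should yield `ENOENT` on a system that follows the man page; a crash at this point would indicate a problem outside the documented contract.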
@rhc54 You're right if GNU followed the POSIX standard. From fopen
The GNU documentation, however, makes this assumption:
This means that any time OpenMPI does an I've seen segfaults all over the place because of this. In particular when trying to open configuration or help files that don't exist and when specialized parts of the I'm certainly open to a better solution if you can think of one. In the meantime I'll fix whatever bugs pop up in the automated testing (it looks like, at minimum, I broke an NFS test). |
Huh; I've never seen I wonder if we should do what we've done in a few other portability scenarios:
That way, we don't have to add calls to |
Me too - I've never heard of nor encountered this before, which is why I'm suspicious. I like the suggestion from @jsquyres - not only is it cleaner, but it means we only have one place that static code analyzers are going to complain about. Both Klocwork and Coverity will object that this PR creates a race condition between checking for file existence and then accessing it, as there is no guarantee that the file hasn't been removed in the interim. This is why we studiously cleansed the code base of calls to access() (or stat()) prior to calling a file access function - we were getting peppered with defect reports, and some of our companies (ahem, mine) will not allow release of code that crosses certain defect density boundaries. I'm a little bothered that we are now going back to reinsert those calls - but if we do the wrapper, then it will only count as one defect, so we might get away with it. |
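A minimal sketch of what such a wrapper might look like, assuming a single helper that performs the access() pre-check before fopen(). The name `example_fopen_read` is hypothetical and is not an existing Open MPI symbol. The check-then-open race discussed above is still present; the wrapper only confines it to one place.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical wrapper (illustrative name only): open `path` for reading,
 * checking readability first so every caller shares one pre-check. */
static FILE *example_fopen_read(const char *path)
{
    if (path == NULL) {
        errno = EINVAL;
        return NULL;
    }
    /* Pre-check. This is the single spot a static analyzer would flag:
     * the file can still disappear between access() and fopen(). */
    if (access(path, R_OK) != 0) {
        return NULL;            /* errno was set by access() */
    }
    /* fopen() can still fail, so callers must check for NULL anyway. */
    return fopen(path, "r");
}
```

Because fopen() may fail even after a successful access(), callers of this sketch still need the usual NULL check on the returned stream; the pre-check narrows the window but does not remove it.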
@rhc54 @jsquyres Here's a relevant crash if I move my mca-params.conf out of my home directory:
So fileno is trying to dereference a null pointer, which I believe is the indirection on the int
__fileno (_IO_FILE *fp)
{
  CHECK_FILE (fp, EOF);
  if (!(fp->_flags & _IO_IS_FILEBUF) || _IO_fileno (fp) < 0)
    {
      __set_errno (EBADF);
      return -1;
    }
  return _IO_fileno (fp);
} |
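If the crash does come from a NULL stream reaching fileno(), a caller-side guard is one way to avoid it. The sketch below assumes that diagnosis, which the thread has not yet confirmed; the name `checked_fileno` is hypothetical.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical guard (illustrative name): return the descriptor behind
 * `fp`, or -1 with errno set to EBADF when `fp` is NULL, so a failed
 * fopen() result is never passed on to fileno(). */
static int checked_fileno(FILE *fp)
{
    if (fp == NULL) {
        errno = EBADF;
        return -1;
    }
    return fileno(fp);
}
```

With a guard like this, the result of a failed fopen() yields -1 instead of a dereference of `fp->_flags`.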
I'm compiling a personal glibc to see if I can figure out what the bug is exactly. |
Weird - understand, very few users even have a local mca-params.conf in their home directory. I never have had one myself. In 13+ years of production, I have never heard anyone report a segfault because the file didn't exist. As I said, maybe you have found some weird configuration that results in problems. We can certainly add the proposed protection (though I'd follow the suggestion from @jsquyres) - we just need to understand that it introduces a race condition that will generate (perfectly valid) complaints from the code checkers. Might even want to add a configure option to turn this off to avoid the complaints - like I said, nobody has ever reported such a thing, so maybe it only needs to be there for some configurations. |
@rhc54 Let me dig a little deeper here. I'll close this pull request for the moment (You guys' solution is better anyway) and see if I can really understand what its problem is. I can use this code to test the environment code now which will let me make sure the earlier pull request works. |