Fix off-by-one in NextFreeFileDescriptor #202
Comments
Hm yes good point. That explains the comment... The bug was that other libraries in the process were using file descriptors. So this fix was wrong. But I fixed it another way by not using those libraries (
And that was the "hardest bug" I mentioned in the appendix here: http://www.oilshell.org/blog/2018/10/11.html I already hit that bug twice!
If you can add a test case (or put one here) that repros the file descriptor bug, it will make it easier to fix.
I don't think this issue is observable right now. If nobody else is opening file descriptors in the process, there's no conflict. But I'll just leave this open since it's probably a good cleanup.
It might not get fixed properly without a repro or a way of checking that it's fixed, even if this line of code is fixed. It's easy enough to change this seemingly wrong line, but it's unknown whether that was causing issues in the first place.
Oh man I'm in the middle of debugging a problem here and the root cause was in this code. The observable behavior was that somehow stdout was closed when running OSH unit tests under OSH, but not when running OSH unit tests under bash! I think there are probably other conditions where it could trigger, but OSH inheriting its own file descriptors seems to be one of them. I have made the bug go away by fixing this line, but it also pointed out to me the fact that I'm using the syscalls somewhat wrong. If you look at Good eye on this one! In retrospect, it may or may not have made sense to fix earlier... it would have saved some time, but I probably wouldn't have understood the syscall issue as deeply. Sometimes you just have to bang your head into things to understand ...
The symptom was that running 'test/unit.sh all' under OSH failed. In core/completion_test.py, we run the line: echo "$@" >&2 under test_lib.EvalCode(), and the two OSH interpreters conflicted somehow, resulting in stdout being permanently closed. Then any remaining print() calls failed with EBADF. I got rid of the self.next_fd counter since it's not correct. Instead we do a linear search from fd #10. Now the tests pass. However, now I realize I should be using fcntl(..., F_DUPFD, ...) and looking at the return value. It does the linear search for you in the kernel! Looking at dash source code clued me into this. Addresses issue #202.
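A minimal sketch of the F_DUPFD idea from the commit message above, in Python. The helper name `save_fd` and the floor of 10 are illustrative assumptions, not the actual Oil code. `fcntl.fcntl(fd, fcntl.F_DUPFD, minimum)` is the standard library call; it returns the lowest free descriptor greater than or equal to `minimum`, so the kernel does the search and reserves the slot in one step.

```python
import fcntl
import os


def save_fd(fd, minimum=10):
    """Duplicate fd onto the lowest free descriptor >= minimum.

    Hypothetical helper for illustration. The kernel picks the slot,
    so there is no user-space counter or probing loop to get wrong.
    """
    return fcntl.fcntl(fd, fcntl.F_DUPFD, minimum)


if __name__ == '__main__':
    r, w = os.pipe()
    saved = save_fd(w)          # a copy of w at some fd >= 10
    os.write(saved, b'hi')      # writes go to the same pipe as w
    print(saved >= 10, os.read(r, 2))
    for d in (r, w, saved):
        os.close(d)
```

Because the return value is the descriptor that was actually allocated, the caller never has to guess which slot is free.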
I'm glad something came of this!
actually, i'm unsure of the above. if dup2 is returning the error, that would mean that the call to dup2 is what reserves the FD slot.
If there were two threads doing that at the same time, it would definitely lead to a race condition. But almost all shells are single-threaded, since they date back to the time before threads! There is one area where I might want to use threads (to make completion responsive against slow user plugins), so that is something to watch out for.
closing in favor of #223
https://github.com/oilshell/oil/blob/master/core/process.py#L54
If I'm understanding this code correctly, the intended behavior is: loop over file descriptors until we find one that throws an error when we try to access it. When we find it, we return that file descriptor.
The actual behavior: return that file descriptor + 1
This could come up as a problem if we open 2 file descriptors and then close the first one. The implementation would find the first one as available and then try to dup over the 2nd one.
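The scenario above can be sketched in Python. This is a hypothetical reconstruction for illustration, not the code in core/process.py: the function names, the `os.fstat` probe, and the starting descriptor of 100 in `demo` are all assumptions.

```python
import os


def is_open(fd):
    """Return True if fd refers to an open file descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def next_free_fd_buggy(start=10):
    """Finds the first closed fd, then returns the one AFTER it."""
    fd = start
    while is_open(fd):
        fd += 1
    return fd + 1  # off by one


def next_free_fd_fixed(start=10):
    """Returns the first closed fd at or above start."""
    fd = start
    while is_open(fd):
        fd += 1
    return fd


def demo(start=100):
    """Open two adjacent fds, close the first, and compare the results."""
    r, w = os.pipe()
    base = start
    while is_open(base) or is_open(base + 1):  # find two free adjacent slots
        base += 1
    os.dup2(w, base)
    os.dup2(w, base + 1)
    os.close(base)                             # first slot is free again
    free_fd = next_free_fd_fixed(base)         # the freed slot
    buggy_fd = next_free_fd_buggy(base)        # the slot after it
    clobbers = is_open(buggy_fd)               # True: dup2 here would close it
    for d in (base + 1, r, w):
        os.close(d)
    return free_fd, buggy_fd, clobbers
```

A later `dup2(some_fd, buggy_fd)` would silently close whatever was open at that descriptor, which is the conflict described above.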
Feel free to close if not an issue