Don't issue redundant stdin detection warning when is in place. #1303
This patch fixes a race between two I/O-bound operations on sys.stdin: read(2) and select(2).
select(2) is a function that takes files and tells you when they are ready to be processed (read, written, etc.). It is typically used for dealing with multiple sockets concurrently (e.g. you have multiple open connections, and with select(2) you can check which of those connections have data to be processed). In HTTPie, however, we used it to detect whether there is any incoming data on sys.stdin, so that we could issue a warning when there is none.

The problem is that the initial version of this code unknowingly created a race between the select.select call: https://github.com/httpie/httpie/blob/30cd862fc0e173698fc17487c4b96d8f64b701ea/httpie/uploads.py#L98-L110
and the file.read() call: https://github.com/httpie/httpie/blob/30cd862fc0e173698fc17487c4b96d8f64b701ea/httpie/uploads.py#L124-L132
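For readers unfamiliar with this use of select, here is a minimal sketch of a readiness check with a timeout. It is not the code from uploads.py: the helper name has_incoming_data and the 0.1 second timeout are assumptions made for illustration, and an OS pipe stands in for sys.stdin so the example is self-contained. It assumes a Unix-like system, since select.select only accepts sockets on Windows.

```python
import os
import select


def has_incoming_data(fileobj, timeout=0.1):
    """Return True if fileobj becomes readable within `timeout` seconds.

    Hypothetical helper for illustration; not HTTPie's actual function.
    """
    readable, _, _ = select.select([fileobj], [], [], timeout)
    return bool(readable)


# A pipe stands in for sys.stdin so the sketch does not depend on the terminal.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as reader, \
        os.fdopen(write_fd, "wb", buffering=0) as writer:
    # Nothing has been written yet, so select times out.
    result_when_empty = has_incoming_data(reader)
    writer.write(b"hello")
    # Now there are bytes waiting in the pipe.
    result_when_ready = has_incoming_data(reader)
```

The check only answers "is there something to read right now?"; it says nothing reliable once another thread is already consuming the same file, which is where the race comes from.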
Both of these functions are blocking, and we ran them concurrently to get around that. But since Python has a GIL, when you run multiple threads together the interpreter switches between them at certain intervals, depending on CPU-related activity and some other details. (Green arrows indicate the interpreter giving execution to the main thread; red arrows indicate it being given to the observer thread.)
So to put it simply, we used to run read() and select() at the same time, and whichever was chosen to execute first would win. If it was select(), we'd get the correct result (whether there is any data or not), and if it was read(), we'd always get the wrong result (no incoming data). This caused intermittent, random failures. From what I understand, the problem happens randomly (and in an OS-dependent way, because of the underlying thread scheduler), and there is no fair way of testing it (except manually).

The race has been solved by moving the select(2) out of the thread (as a guard) and using threading.Event objects as indicators of whether we've seen any data or not.
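The shape of this fix can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patch itself: read_with_warning, warn, threshold, and the demo harness are hypothetical names, a pipe stands in for sys.stdin, and a Unix-like system is assumed. The key point is that only the reading thread touches the file; the observer thread waits on a threading.Event and nothing else.

```python
import os
import select
import threading


def read_with_warning(fileobj, warn, threshold):
    """Read fileobj fully; call warn() if no data shows up within threshold."""
    data_seen = threading.Event()

    def observer():
        # The observer never touches fileobj, so it cannot race with read().
        if not data_seen.wait(timeout=threshold):
            warn("no incoming data detected yet")

    thread = threading.Thread(target=observer, daemon=True)
    thread.start()

    # Guard: select() runs in the same thread as read(), right before it.
    # It blocks until the file is readable (EOF also counts as readable).
    select.select([fileobj], [], [])
    data_seen.set()
    data = fileobj.read()
    thread.join()
    return data


def demo(slow_producer, threshold):
    """Hypothetical harness: feed b'hello' through a pipe, early or late."""
    read_fd, write_fd = os.pipe()
    warnings = []
    warned = threading.Event()

    def warn(message):
        warnings.append(message)
        warned.set()

    def produce():
        if slow_producer:
            warned.wait()  # write only after the warning has fired
        with os.fdopen(write_fd, "wb") as writer:
            writer.write(b"hello")

    if slow_producer:
        producer = threading.Thread(target=produce)
        producer.start()
    else:
        produce()  # data is already in the pipe before reading starts

    with os.fdopen(read_fd, "rb") as reader:
        data = read_with_warning(reader, warn, threshold)
    if slow_producer:
        producer.join()
    return data, warnings


fast_data, fast_warnings = demo(slow_producer=False, threshold=5.0)
slow_data, slow_warnings = demo(slow_producer=True, threshold=0.05)
```

In the fast case the event is set before the observer's wait can expire, so no warning is issued; in the slow case the observer times out and warns, and the read still completes once data arrives.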