Is this a threading issue? #56
Comments
Ahhh this old chestnut :-) So - Luke and I lost a day to this as well. It turns out that writing directly to os.Std{out,err} from a goroutine does very strange things with locking the process - we came across this when trying to close devstack after a test had run.
Fixed this issue by removing
It would be nice to know exactly why writing directly to os.Std{err,out} causes these deadlocks, but let's bump that to a "deep dive to understand go concurrent stream writing" issue.
So - it turns out that there is a more fundamental problem here: golang/go#23019 In summary:
This will cause the deadlock because the grand-child still thinks it's connected to the stdout/stderr pipes.
The solution, it seems, is to always write to temporary files instead of os.Std{out,err} - this does not trigger the blocking code path.
So - let's update
This is how the Android team worked around the problem: https://go-review.googlesource.com/c/go/+/42271/3/misc/android/go_android_exec.go
We are doing something similar.
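For anyone landing here later, a minimal sketch of the temporary-file approach described above. This is not the code from the fix commit: `runWithTempFiles` is a hypothetical helper, and the example assumes a Unix-like system with `sh` on the PATH. It relies on documented `os/exec` behaviour: when `Cmd.Stdout` or `Cmd.Stderr` is an `*os.File`, the child is connected to that file directly, so `Wait` has no copying goroutine to wait on.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runWithTempFiles is a hypothetical helper for illustration only.
// It runs a command with stdout and stderr redirected to temporary
// files, then reads the files back and returns their contents.
func runWithTempFiles(name string, args ...string) (string, string, error) {
	outFile, err := os.CreateTemp("", "cmd-stdout-*")
	if err != nil {
		return "", "", err
	}
	defer os.Remove(outFile.Name())
	defer outFile.Close()

	errFile, err := os.CreateTemp("", "cmd-stderr-*")
	if err != nil {
		return "", "", err
	}
	defer os.Remove(errFile.Name())
	defer errFile.Close()

	cmd := exec.Command(name, args...)
	// Both fields are *os.File values, so os/exec passes the file
	// descriptors straight to the child instead of creating pipes
	// and copying goroutines that Wait would have to drain.
	cmd.Stdout = outFile
	cmd.Stderr = errFile
	runErr := cmd.Run()

	// Read the captured output back by path once the command has exited.
	outBytes, readErr := os.ReadFile(outFile.Name())
	if readErr != nil {
		return "", "", readErr
	}
	errBytes, readErr := os.ReadFile(errFile.Name())
	if readErr != nil {
		return "", "", readErr
	}
	return string(outBytes), string(errBytes), runErr
}

func main() {
	out, errOut, err := runWithTempFiles("sh", "-c", "echo hello; echo oops 1>&2")
	fmt.Printf("stdout=%q stderr=%q err=%v\n", out, errOut, err)
}
```

The trade-off is that output only becomes visible after the command exits. Streaming it live would need something that tails the files.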
Fixed with this - a60f080
When I execute the following command:
Only the first test to run passes, and which one that is varies non-deterministically - it could be any of the three.
Here's the log output:
I SUSPECT that once the job finishes, the system shuts down the nodes.
For example, midway through the log, you can see:
Which is confirmed here - https://github.com/filecoin-project/bacalhau/blob/reformatting_logs_for_multi_threading/internal/compute_node.go#L176
However, by the time it gets to here: https://github.com/filecoin-project/bacalhau/blob/d0aef41ca20deb14741a614eced28d92e3ed0f78/internal/runtime/runtime.go#L245
The directory is gone. A bit stumped - my suspicion:
(I think the last one is the most likely, since it's the newest)
vm
is being run.