Running builder (content generator) functions in subprocesses on Windows #17595
CC @garyo for review.
compat.py (outdated)
@@ -62,3 +70,60 @@ def escape_string(s):
        result += chr(c)
    return result


def run_in_subprocess(builder_function):
I think the compat module was meant only for Python 2/3 compatibility definitions. This should maybe go in methods?
This is more a matter of platform compatibility. Maybe a new module would work best, something like plat_compat.py or so. Could you please suggest a module name? Then I can separate it.
I don't know, methods.py already has some platform-specific methods. Maybe you could make a new platform_methods.py module and move the relevant methods.py methods there too.
Good idea, but that would be outside the scope of this PR and the ticket behind it.
To keep things clear, I suggest adding platform_methods.py with the new functions introduced here. Then we can move the rest of the functions between the modules in a separate PR.
Fine by me :)
Can you fix the conflicts?
Notes on resolving conflicts with two previously merged PRs:
|
I was directed to try this patch out.
|
Actually, this might not invalidate this patch, it's just that it needs to be completed to also handle the generation of the shader headers in |
And basically all |
Rebased the branch onto the latest master. Moved all remaining source generators to the subprocess-based build mechanism to prevent random build errors. Some generated files are written directly in an SConstruct or SCsub file, before the parallel build starts. They apparently don't need to run in a subprocess, so I left them untouched.
Benchmark |
I can confirm that this seems to fix the issue for me. I tested this on two different Windows machines where this fails on |
This has a conflict that needs to be fixed. Once that is done, I'd be strongly in favor of merging this one.
- Refactored all builder (make_*) functions into separate Python modules alongside the build tree
- Introduced a utility function to wrap all invocations on Windows; it does not change them elsewhere
- Introduced a stub to use the builders module as a standalone script and invoke a selected function
There is a problem with file handles related to writing generated content (*.gen.h and *.gen.cpp) on Windows, which randomly causes a SHARING VIOLATION error for the compiler, resulting in flaky builds. Running all such content generators in a new subprocess instead of directly inside the build script works around the issue.
Yes, I tried the multiprocessing module. It did not work due to a conflict with SCons on cPickle. The suggested workaround did not fully work either.
The run_in_subprocess wrapper is used on the osx and x11 platforms as well for consistency. When cross-compiling on Windows, subprocesses would still be used, but that is unlikely to happen in practice. What counts is the platform the build itself is running on, not the target platform.
Some generated files are written directly in an SConstruct or SCsub file, before the parallel build starts. They apparently don't need to be written in a subprocess, so I left them as is.
This PR is moving most of the generator code from
Could somebody else please also verify that no code was missed in the process?
Awesome work, thanks a ton @viktor-ferenczi!
Thank you. It was a pleasure to help your awesome project.
Trying to build master after this commit causes this: https://puu.sh/B4B6M/2243083766.txt
It works on Python 2.7.
Rather than moving your Python generator functions so that they can be run separately, and then running them via a builder which calls subprocess, wouldn't it be simpler to just have your builders call those Python scripts via normal builders? The issue (if I understand correctly) was that you had blobs of Python logic in your builders in the same Python process, plus GIL issues, and you've resolved that by making the generators independent scripts. So your Action would now be "python <generator_script.py> "? This would run the Python scripts in a separate process.
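To make the suggestion above concrete, here is a minimal sketch of what a standalone generator script could look like. The names make_header, main and generator_script.py, and the content of the generated header, are hypothetical and not taken from this PR. The SCons invocation in the trailing comment is an assumed usage of a plain command-string action, not something this PR implements.

```python
import sys


def make_header(source_path, target_path):
    """Hypothetical generator: record the size of the source file in a C header."""
    with open(source_path, "rb") as src:
        size = len(src.read())
    with open(target_path, "w") as dst:
        dst.write("/* THIS FILE IS GENERATED DO NOT EDIT */\n")
        dst.write("#define SOURCE_SIZE %d\n" % size)


def main(argv):
    """Command-line entry point: generator_script.py <source> <target>."""
    if len(argv) != 3:
        sys.stderr.write("usage: %s <source> <target>\n" % argv[0])
        return 2
    make_header(argv[1], argv[2])
    return 0


# When saved as a standalone script, the file would end with:
#     if __name__ == "__main__":
#         sys.exit(main(sys.argv))
#
# An SConscript could then run it in its own process (assumed usage):
#     env.Command("data.gen.h", "data.bin",
#                 "python generator_script.py $SOURCE $TARGET")
```

With this layout SCons itself spawns the Python process, so no wrapper function is needed, at the cost of one script entry point per generator.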
@moiman100 The basestring type is not available on Python 3.7, hence the error. I'm going to open a new PR to fix Python 3.5+ compatibility. Also, building with Python 3.7 is not tested by CI and I did not test it either (my bad). As a result we did not catch this problem in time.
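For readers unfamiliar with this class of error: basestring exists only on Python 2, so any isinstance check against it raises NameError on Python 3. A common way to handle both versions is sketched below. This is an illustration only, not the code of the actual fix; string_types and is_string are hypothetical names.

```python
# Hypothetical compatibility shim; not the code from the actual fix.
try:
    string_types = (basestring,)  # Python 2: common base class of str and unicode
except NameError:
    string_types = (str,)  # Python 3: basestring no longer exists


def is_string(value):
    """Return True for text values on both Python 2 and Python 3."""
    return isinstance(value, string_types)
```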
@viktor-ferenczi @moiman100 It's fixed by #20544.
Perfect, thanks for the update and sorry for the mistake.
I'm getting a build error on OS X when building with
|
Added ticket #20613 It is because the Possible solutions:
I think 2. would be simpler. It needs a simple platform condition to the |
The fix is in PR #20617.
PR #20617 passed all CI checks and was tested on Mac by @marcelofg55.
PR #20617 has been completed, so the Mac build error caused by this PR has been fixed.
You probably got confirmation already, considering this is merged into master, but I thought I'd follow up. Before this patch a parallel build on Windows would break fairly quickly, and would fail every time. I have just pulled master and tried a clean ( |
Background
There is a problem with file handles related to writing generated content (*.gen.h and *.gen.cpp) on Windows, which randomly causes a SHARING VIOLATION error for the compiler, resulting in flaky builds. Running all such content generators in a new subprocess instead of directly inside the build script works around the issue.
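The workaround described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation from this PR: only the name run_in_subprocess comes from the PR, while subprocess_main, the force parameter, the JSON-on-the-command-line hand-off and the make_version_header builder are hypothetical. It also assumes that targets and sources are plain path strings rather than SCons nodes.

```python
import functools
import json
import subprocess
import sys


def run_in_subprocess(builder_function, force=None):
    """Wrap a builder so that it runs in a fresh Python process on Windows.

    `force` is a hypothetical testing hook that overrides the platform check.
    """
    use_subprocess = sys.platform in ("win32", "cygwin") if force is None else force

    @functools.wraps(builder_function)
    def wrapper(target, source, env=None):
        if not use_subprocess:
            # On other platforms, call the builder directly in this process.
            return builder_function(target, source, env)
        # Re-run the module that defines the builder as a standalone script,
        # passing the function name and its arguments as JSON.
        module_path = sys.modules[builder_function.__module__].__file__
        payload = json.dumps({"fn": builder_function.__name__, "args": [target, source]})
        subprocess.check_call([sys.executable, module_path, payload])

    return wrapper


def subprocess_main(namespace):
    """Stub for a builders module run as a script: dispatch to the named builder."""
    payload = json.loads(sys.argv[1])
    target, source = payload["args"]
    namespace[payload["fn"]](target, source, None)


def make_version_header(target, source, env):
    """Hypothetical content generator writing a *.gen.h file."""
    with open(target[0], "w") as header:
        header.write("#define GENERATED 1\n")
```

In this sketch a builders module would end with a guarded call such as subprocess_main(globals()) under `if __name__ == "__main__":`, and the build script would register run_in_subprocess(make_version_header) as the builder action.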
Q&A
The root cause of the issue is unknown, but it is specific to how file handles are somehow shared with child processes, even if the files are closed before spawning the child. Attempting to disable file handle sharing by setting the HANDLE_FLAG_INHERIT option to zero on all file handles involved did not work out.
Yes, I tried the multiprocessing module. It did not work due to a conflict with SCons on cPickle. The suggested workaround did not fully work either.
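For context, the handle-inheritance attempt mentioned above could look something like the following on Python 3.4 or newer. This is only an assumption about the general shape of such an attempt: the code actually tried in this PR is not shown, and the PR also had to support Python 2.7, where os.set_inheritable does not exist. The function names and the file name are hypothetical. To my understanding, on Windows this call clears the inherit flag on the underlying handle.

```python
import os
import tempfile


def write_without_inheritance(path, text):
    """Write text to path with the descriptor explicitly marked non-inheritable.

    Returns the inheritable flag as reported by the OS after the change.
    """
    with open(path, "w") as generated:
        fd = generated.fileno()
        os.set_inheritable(fd, False)  # Python 3.4+ only
        generated.write(text)
        return os.get_inheritable(fd)


def demo():
    """Write a throwaway generated header and report its inheritable flag."""
    path = os.path.join(tempfile.mkdtemp(), "example.gen.h")
    return write_without_inheritance(path, "#define GENERATED 1\n")
```

As stated above, disabling inheritance did not resolve the sharing violations, which is why the generators were moved into subprocesses instead.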
Why is run_in_subprocess not used as a decorator on the builder functions?
One reason is to make it explicit in the Builder definition that the function will run in a subprocess.
The other reason is that the original builder function needs to be called in the subprocess, which would require saving that function on the wrapper and special-casing it there to call the original function when not running inside the main build script. It would involve adding magic auto-detection code, plus one more symbol to import in the builder modules. I decided against that, even though the decorator syntax would look nicer. I can still implement this on request if you would like it better that way.
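The second reason can be illustrated with a small sketch. Once a builder is decorated, its module-level name refers to the wrapper, so a child process that looks the builder up by name would find the wrapper again unless the undecorated function is stored somewhere. Everything here is hypothetical: in_subprocess, the original attribute and make_header are illustrative names, and the "spawn" tuple merely stands in for launching a real child process.

```python
import functools


def in_subprocess(builder_function):
    """Hypothetical decorator form of the wrapper."""

    @functools.wraps(builder_function)
    def wrapper(target, source, env=None):
        # Stand-in for launching a child process: report what would be run.
        return ("spawn", builder_function.__name__, target, source)

    # The child process would look the builder up by name and find this
    # wrapper, so the undecorated function has to be kept for it.
    wrapper.original = builder_function
    return wrapper


@in_subprocess
def make_header(target, source, env=None):
    """Hypothetical builder; returns a string instead of writing a file."""
    return "generated %s" % target[0]
```

In the parent, make_header(...) goes through the wrapper, while the child would have to call make_header.original(...) and somehow know that it is the child. That is the special-casing the explicit run_in_subprocess(...) call at the Builder definition avoids.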
The run_in_subprocess wrapper is used on the osx and x11 platforms as well for consistency, but it does not actually create subprocesses there. When cross-compiling on Windows, subprocesses would still be used, but that is unlikely to happen in practice. What counts is the platform the build itself is running on, not the target platform.
Fixes #5042