Currently, the architecture of `benchexec` in the `localexecution.py` module is such that it starts a number of worker threads depending on `--numOfThreads`, and each worker thread repeatedly executes runs using `RunExecutor`. In container mode, executing a run involves calling `clone()` to create a subprocess.
This architecture has several problems:
- `clone()` is not safely usable in processes with more than one thread and can produce deadlocks (BenchExec subprocess hangs in __malloc_fork_lock_parent #656). We currently have a workaround for this, but it is not a full solution.
- With a high (3-digit) number of threads we have not seen the expected throughput. No profiling has been performed yet, but it is plausible that the cause is that all pre- and postprocessing of runs (e.g., log analysis, writing results) is performed in Python threads, of which only one can be active at a time due to the GIL.
(Note that all these problems do not affect users of runexec / containerexec, where the run execution is started from a single-threaded process.)
So in the long term it would probably be good to change this architecture. There are at least two potential solutions:
- Switch from worker threads to worker processes, just like the `multiprocessing` module does. Each worker process would be single-threaded.
- Have one designated (single-threaded) subprocess that is created at the beginning and whose sole responsibility is to spawn all further subprocesses on request. (Android uses this and calls it the Zygote process.)
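To make the second option more concrete, here is a minimal sketch of such a spawner process. It is not BenchExec code: `Spawner` and `_spawner_loop` are hypothetical names, `subprocess.run()` stands in for the real container creation via `clone()`, and the sketch assumes the spawner is created while the main process is still single-threaded. For simplicity it handles one request at a time, whereas a real design would have to start runs concurrently and report their completion asynchronously.

```python
import multiprocessing
import subprocess
import sys
import threading


def _spawner_loop(conn):
    """Runs in the single-threaded spawner process and serves spawn requests."""
    while True:
        args = conn.recv()
        if args is None:  # shutdown request from the main process
            break
        # Stand-in for the real container creation (clone() etc.).
        # It is executed here, in a process that has no other threads.
        result = subprocess.run(args, capture_output=True, text=True)
        conn.send((result.returncode, result.stdout))
    conn.close()


class Spawner:
    """Hypothetical handle used by the (possibly multi-threaded) main process."""

    def __init__(self):
        # Fork the spawner early, before any worker threads exist.
        ctx = multiprocessing.get_context("fork")
        self._conn, child_conn = ctx.Pipe()
        self._lock = threading.Lock()
        self._process = ctx.Process(target=_spawner_loop, args=(child_conn,))
        self._process.start()
        child_conn.close()  # the parent only needs its own end of the pipe

    def run(self, args):
        """Ask the spawner to execute args; returns (returncode, stdout)."""
        # The lock keeps each request paired with its reply when several
        # threads share the pipe. It also serializes runs in this sketch.
        with self._lock:
            self._conn.send(args)
            return self._conn.recv()

    def close(self):
        with self._lock:
            self._conn.send(None)
        self._process.join()
```

Worker threads would then call `spawner.run([...])` instead of creating subprocesses themselves, so the multi-threaded process never forks or clones.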
Instead of `clone()`, one could also use `unshare()` and `os.fork()` to create a container, which should be safer. However, because of how `unshare()` works with PID namespaces, this would involve yet another process per run and would probably complicate process handling even more than either of the other alternatives.
Things to consider:
- Whether and how this works in cases where `benchexec` is not called as a command-line tool, but executed as part of a larger Python program (which may have created threads before `benchexec` is even loaded).
- The subprocess that we start for each run needs to be cloned from a process that already has all the required modules loaded, because this process is inside the container for the run and might not have access to the Python interpreter's files on disk.
- How to communicate with the process that hosts the tool-info module if more than one worker process needs to talk to it, or whether each worker process should instead get its own separate process with an instance of the tool-info module.
The fact that preprocessing, actual run execution, and postprocessing are serialized within each worker thread without overlap (i.e., the next run is not started while a previous run is still being postprocessed) is by design. Otherwise we would have to reserve some cores for the postprocessing threads, which would lead to asymmetric core assignments. However, it is not desired that the postprocessing steps of parallel threads compete for the GIL.
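As an illustration of the worker-process option combined with this serialization, the following is a rough sketch, not BenchExec code. `execute_run`, `worker`, and `execute_all` are hypothetical names, and `subprocess.run()` again stands in for the actual run execution. Each worker process handles one run at a time from start to finish, but postprocessing happens in separate interpreters and therefore does not compete for a shared GIL. The sketch uses `SimpleQueue` so that no helper threads are created inside the workers, and it assumes only a small number of queued runs (a real implementation would have to avoid blocking on full pipes).

```python
import multiprocessing
import subprocess
import sys


def execute_run(args):
    """Hypothetical stand-in for one run: execution followed by postprocessing."""
    result = subprocess.run(args, capture_output=True, text=True)
    # Placeholder for log analysis; it runs in this worker's own interpreter.
    return {
        "returncode": result.returncode,
        "lines": len(result.stdout.splitlines()),
    }


def worker(task_queue, result_queue):
    """Single-threaded worker process: one run at a time, no overlap."""
    for run_id, args in iter(task_queue.get, None):
        result_queue.put((run_id, execute_run(args)))


def execute_all(runs, num_workers):
    """Execute all runs with num_workers worker processes; returns {run_id: result}."""
    ctx = multiprocessing.get_context("fork")
    task_queue = ctx.SimpleQueue()
    result_queue = ctx.SimpleQueue()
    workers = [
        ctx.Process(target=worker, args=(task_queue, result_queue))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for item in enumerate(runs):
        task_queue.put(item)
    for _ in workers:
        task_queue.put(None)  # one shutdown marker per worker
    results = dict(result_queue.get() for _ in runs)
    for w in workers:
        w.join()
    return results
```

Because the workers are forked from a process that has already imported everything it needs, this layout would also fit the requirement that required modules are loaded before a run's subprocess is created.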