-
-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Effective (and fast) S09 threading #1462
Effective (and fast) S09 threading #1462
Conversation
holy shit ... I will test this ASAP |
nice catch ...
Probably we have similar issues in other areas ... |
Indeed, this issue may occur in other modules as well, but it might not be as obvious if the spawned processes take longer to execute than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing this beautiful bug. Could you please check my little comment? Afterwards we can merge it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this fix and improvement
"Improvement" or "bug fix", depending how we see it.
With
THREADED=1
, generatingstrings_bins
in S09 is very slow, and get slower and slower as the number of generated files increases (gets slower than single-thread at some point)Cause: one process is spawned per file to generate under
strings_bins
, and all their PIDs are kept in a list and not removed until all strings_bins are generated. Once the PIDs lists is longer than${MAX_MOD_THREADS}*3
, themax_pids_protection
kicks in to make sure there aren't too many processes at once. Problem is, as the PIDs list gets longer,max_pids_protection
takes longer to execute than the last spawned process to extract strings (mostly because of checking so many dead PIDs I guess). Since a new process is not spawned untilmax_pids_protection
returns, what we have is a "main process" doing nothing but checking dead PIDs and short "job processes" running only once in a while.There is also a potential issue (not observed for sure yet) with PID recycling over time. See other information below.
String generation is now "really" multi-threaded and much faster, especially on large firmwares. To do so, the PID list is cleaned up from dead PID at every
MAX_MOD_THREADS
spawned process (in tests, most often all processes were dead and cleaned up, and sometimes there was 1 still running).To do so, existing logic in S09 (in the second part of S09) was adapted and reused.
Here's an anecdotical benchmark performed on a firmware that I can't share.
strings_bins
Generating strings took 4 hours (almost exactly) before fix, and under 6 minutes after the fix. This is only for the string generation part. The remainder of S09 took about 20 minutes to execute in both cases (did not expect any change).
Also, looking at file access times in
strings_bins
folder, I can roughly compute the average throughput (how much files created per seconds) at various points of S09 execution:Of course, extracted strings in
strings_bins
were the same before and after the fix.No.
Cleaning up the list of running PIDs in S09 not only speeds up things, but helps to prevent an issue with recycled PIDs.
When S09, S25 and S115 are running at once, and even when S09 is completed, I noticed on my system (and the above test firmware) that there is about 500,000 processes created per hour. At this rate, PIDs are recycled about every 8 hours (
pid_max
in the docker container on my system is 4 million) . After the fix, S09 doesn't run for very long anymore, but it used to before the fix, and others modules such as S25 and S115 (and others that come after) can still run for a very long time.Keeping lists of spawned (and dead) PIDs for several hours is thus getting in the "danger zone" of monitoring (or even killing) unrelated processes. I haven't observed this yet, but on a very large firmware this could happen. As such, there might be other instances, in other modules, where it might be safer to tidy up PIDs list as it's now done in S09.