-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Moves pysmurf-controller functions that load lots of data to subprocess #669
Conversation
for more information, see https://pre-commit.ci
return threads.blockingCallFromThread( | ||
reactor, _run_func_in_subprocess_reactor, cfg) | ||
else: | ||
return _run_func_in_subprocess_reactor(cfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool... I believe that if you are in reactor context, then you can't block -- i.e you need to return a Deferred here, which would only be possible if _run_func_in_subprocess_reactor(cfg)
returned prot.deferred
or some wrapped thing that applies the Runresult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point... I changed to run_func_in_subprocess_from_thread
and had it only handle the from-thread case
for more information, see https://pre-commit.ci
This has been tested on satp1 and it works! It seems to have fixed the large memory leak. Notes from my testing is here: https://simonsobs.atlassian.net/wiki/spaces/~55705801070cc775254537ab15f2dcf6702b78/pages/412778522/2024-05-02+pysmurf-controller+subprocessing+tests Notably, memory returns to base-level after IVs and bias steps, which it did not previously: I want to note that there does still seem to be a memory leak caused by the functions that have not been moved to subprocesses. I did not move the I think we should probably move all logic involving pysmurf to a subprocess, however I this this PR solves the most pressing issues and largest memory leaks, so I think we should merge this first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad the subprocess route is working! Can you address the unused imports caught by pre-commit: https://results.pre-commit.ci/run/github/186511668/1714677193.715HK9dWRnKVTIpCyOTUcg
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, a few small comments/questions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed comments look good, merge conflicts still need resolving though.
This PR changes pysmurf-controller operations that load TODs such that instead of running in the main process, they call out to a subprocess.
Description
This introduces the
smurf_subprocess_util
module, and a protocol for running functions inside a twisted subprocess in such a way that a general RunCfg can be passed to it, and a general RunResult can be returned. Any operation that involves TOD data loading is modified so that the loading and analysis occurs in a subprocess rather than in the main one.Motivation and Context
This should hopefully solve the memory leak, which I believe is coming from instances where TOD data is not being released to the system after it leaves scope. Moving TOD loading and analysis into subprocesses instead of the main one should enforce that this memory is returned to the OS when the subprocess exits. This discussion has some links that motivate this solution.
How Has This Been Tested?
I have tested the subprocess protocol locally through the use of the
test
process (and the newrun_test_func
operation).I will need to test all other operations with hardware on SATp1 or SATp3.
Types of changes
Checklist: