TimeoutLockless HWP PMX agent #604
Conversation
Hi Kyohei,
Looks good, but there are a few things off about the task structure here that need to be fixed. Can you fix them, and then I will help to resolve any test errors that still exist? Thanks!
socs/agents/hwp_pmx/agent.py (outdated)

    if self.shutdown_mode:
        return False, "Shutdown mode is in effect"

    self.dev.turn_on()
    action = Actions.SetOn(**params)
    self.action_queue.put(action)
Hi Kyohei, in the lockless agents, all tasks should be structured something like this:

    @defer.inlineCallbacks
    def set_on(self, session, params):
        action = Actions.SetOn(**params)
        self.action_queue.put(action)
        session.data = yield action.deferred
        return True, f"Action {action} finished successfully"

I'd move the shutdown-mode check to either the _process_actions function, or to the action's process function, in which case it should be passed in along with the module. The tasks also need to be registered with blocking=False, like:

    agent.register_task('set_on', PMX.set_on, blocking=False)
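The action-queue pattern described in this comment can be sketched without Twisted at all. The sketch below is a simplified, hypothetical stand-in for the real socs agent (a plain result queue plays the role of action.deferred, and FakeAgent is not actual socs code); it shows why the shutdown-mode check belongs in the single consumer, _process_actions, rather than inside each task:

```python
import queue
import threading


class Action:
    """Toy stand-in for the agent's Action classes; the result queue
    plays the role of action.deferred."""

    def __init__(self, name):
        self.name = name
        self.result = queue.Queue(maxsize=1)


class FakeAgent:
    """Hypothetical, simplified agent illustrating the lockless structure."""

    def __init__(self):
        self.action_queue = queue.Queue()
        self.shutdown_mode = False

    def set_on(self):
        # Task body: enqueue the action and block on its result, like
        # `session.data = yield action.deferred` in the real agent.
        action = Action('set_on')
        self.action_queue.put(action)
        return action.result.get(timeout=5)

    def _process_actions(self):
        # The single consumer owns the hardware, so the shutdown-mode
        # check lives here once instead of in every task.
        while True:
            action = self.action_queue.get()
            if action is None:  # sentinel to stop the loop
                return
            if self.shutdown_mode:
                action.result.put((False, "Shutdown mode is in effect"))
                continue
            # ... talk to the hardware module here ...
            action.result.put((True, f"Action {action.name} finished"))
```

Because only _process_actions ever touches the hardware, the tasks never need a TimeoutLock, which is the point of the lockless restructure.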
socs/agents/hwp_pmx/agent.py (outdated)

    try:
        PMX = pmx.PMX(ip=self.ip, port=self.port)
        self.log.info('Connected to PMX Kikusui')
    except BrokenPipeError:
        self.log.error('Could not establish connection to PMX Kikusui')
        reactor.callFromThread(reactor.stop)
If we can test the hardware and failure modes, I'd personally opt for something a bit more robust than this. The LS240 restructure branch has an example of how we can structure this such that on connection failures, it will continue trying to make the connection instead of just killing the reactor thread:

socs/socs/agents/lakeshore240/agent.py, lines 203 to 223 in adf12ff:

    module: Optional[Module] = None
    session.set_status('running')

    # Clear pre-existing actions
    while not self.action_queue.empty():
        action = self.action_queue.get()
        action.deferred.errback(Exception("Action cancelled"))

    pm = Pacemaker(self.f_sample, quantize=False)
    while session.status in ['starting', 'running']:
        if module is None:
            try:
                module = self._init_lakeshore()
            except ConnectionRefusedError:
                self.log.error(
                    "Could not connect to Lakeshore. "
                    "Retrying after 30 sec..."
                )
                time.sleep(30)
                pm.sleep()
                continue
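The reconnect-on-failure loop above boils down to a small pattern that can be sketched framework-free. In the sketch below, run_main_loop and its connect parameter are hypothetical stand-ins for the agent's main process and _init_lakeshore, and the short retry wait stands in for the real 30-second delay:

```python
import time


def run_main_loop(connect, max_cycles=100, retry_wait=0.001):
    """Keep retrying the connection instead of killing the process.

    `connect` stands in for self._init_lakeshore(); on failure we wait
    and loop again, mirroring the LS240 structure above.
    """
    module = None
    for _ in range(max_cycles):
        if module is None:
            try:
                module = connect()
            except ConnectionRefusedError:
                # The real agent logs and retries after 30 sec here.
                time.sleep(retry_wait)
                continue
        # Connected: the real loop would acquire and publish data here.
        return module
    return None  # gave up after max_cycles attempts
```

The key design point is that a connection failure only delays the loop; nothing ever calls reactor.stop(), so a power supply that comes up late is picked up automatically.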
I'm looking at the test failures here, and it seems like the PMX integration tests rely strongly on the old agent structure, and I don't personally see a clear way to modify them to work with the new structure. I think we should remove them for the time being, and maybe add tests later that use the HWP Emulator. @BrianJKoopman what do you think?

@jlashner I made the requested changes and confirmed that they are working in the UTokyo setup. Please review.

Looking good to me! I made one small change (moved the publish / setting of session data into the try-block for the _get_data function), and fixed the integration tests so that they run for me. Hopefully tests pass, and then we can merge!

Tests pass!! And the integration tests were very informative in catching a race condition. I think we can merge now.
socs/agents/hwp_pmx/agent.py (outdated)

        'block_name': 'hwppmx',
        'data': {}
    }
    self._clear_queue()
@jlashner This needs to be put in the reactor too ... I suggest blockingCallFromThread.
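A sketch of what blockingCallFromThread buys here: code running in a worker thread must not mutate reactor-owned objects (such as the queued actions' deferreds) directly, so it ships the call to the reactor thread and blocks until it returns. The MiniReactor below is a toy stand-in for the Twisted reactor, not socs code, and blocking_call_from_thread is a simplified analogue of twisted.internet.threads.blockingCallFromThread:

```python
import queue
import threading


class MiniReactor:
    """Toy single-threaded event loop standing in for the Twisted reactor."""

    def __init__(self):
        self._calls = queue.Queue()

    def call_from_thread(self, fn):
        # Thread-safe: just enqueue; the loop thread runs it later.
        self._calls.put(fn)

    def run(self, n_calls):
        # Execute n_calls queued callables on this (the "reactor") thread.
        for _ in range(n_calls):
            self._calls.get()()


def blocking_call_from_thread(reactor, fn, *args):
    """Run fn(*args) on the reactor thread and block here for the result."""
    done = queue.Queue(maxsize=1)
    reactor.call_from_thread(lambda: done.put(fn(*args)))
    return done.get(timeout=5)
```

In the agent, self._clear_queue() would be the fn, so each pending action's deferred is errback'd on the reactor thread where it was created.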
* draft of new hwp_pmx
* debug in UTokyo setup
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* flake8
* int-test
* fix tasks
* fix shutdown mode
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* more robust connection to PMX
* fix typo
* fix doc
* Fix tests and other small changes
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* flake8
* Use callFromThread on deferred
* clear queue in blockingCallFromThread

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jack Lashner <jlashner@gmail.com>
Description
TimeoutLockless HWP PMX agent
Motivation and Context
In operating the PMX at the site, we see many timeout lock errors. To avoid these errors and make PMX control more robust, this PR modifies the PMX agent in the same way as the HWP PCU agent (#600).
I plan to do the same modification for the HWP PID agent as well.
The way I define actions may not be good. Suggestions for better coding are welcome.
How Has This Been Tested?
Tested with the PMX in the UTokyo setup.