[ORCA-229] Implement monitor_workflow() for Tower #22
Codecov Report
@@ Coverage Diff @@
## bgrande/ORCA-228/refactor-workflow-status   #22   +/- ##
===========================================================================
  Coverage    100.00%   100.00%
===========================================================================
  Files            28        28
  Lines           848       854    +6
  Branches        133       134    +1
===========================================================================
+ Hits            848       854    +6
🚀 LGTM! Going to pre-approve, but I have a couple of comments.
    def __repr__(self) -> str:
        """String representation of a workflow."""
        return f"Workflow(run_name={self.run_name}, id={self.id}, state={self.state})"
Thoughts on adding a __str__ representation too that is just the id?
Dataclasses already have default implementations for __repr__() and __str__(). I'm just overriding this one because the full output is too long for logging, where I want both the human-friendly run name and the computer-friendly ID to appear. I'm aware of the field(..., repr=False) parameter, but I don't want to set that for all but a handful of attributes.
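A minimal sketch of the trade-off described here, using a hypothetical `Workflow` dataclass. Only `run_name`, `id`, and `state` come from the snippet above; the extra fields and the `VerboseWorkflow` class are assumptions for illustration. Overriding `__repr__` once keeps log output short without marking every other field with `field(repr=False)`, and because `str()` falls back to `__repr__` when no `__str__` is defined, both produce the same text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Workflow:
    """Hypothetical stand-in for the workflow dataclass in this PR."""

    id: str
    run_name: str
    state: str
    # Assumed extra attributes that would bloat the generated repr.
    project_name: Optional[str] = None
    params: dict = field(default_factory=dict)

    def __repr__(self) -> str:
        """Short representation for logging (the dataclass keeps this override)."""
        return f"Workflow(run_name={self.run_name}, id={self.id}, state={self.state})"


@dataclass
class VerboseWorkflow:
    """Same fields, trimmed with field(repr=False) on each hidden attribute instead."""

    id: str
    run_name: str
    state: str
    project_name: Optional[str] = field(default=None, repr=False)
    params: dict = field(default_factory=dict, repr=False)


wf = Workflow(id="abc123", run_name="happy_darwin", state="RUNNING")
short_repr = repr(wf)
short_str = str(wf)  # falls back to __repr__ because __str__ is not defined
verbose_repr = repr(
    VerboseWorkflow(id="abc123", run_name="happy_darwin", state="RUNNING")
)
```

With many attributes, the second approach needs one `repr=False` per hidden field, which is the repetition the comment above wants to avoid.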
@@ -261,3 +250,25 @@ def get_latest_previous_workflow(
        # Otherwise, return latest based on submission timestamp
        sorted_runs = sorted(previous_runs, key=lambda x: x.get("submit"))
        return sorted_runs[-1]
How will this work with the Airflow sensor? Or do you envision Airflow just calling:
    workflow = ...get_workflow()
    status = workflow.status
It would look like this (original version):
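    from airflow.decorators import task
    from airflow.sensors.base import PokeReturnValue

    @task.sensor(poke_interval=10, timeout=604800, mode="poke")
    def monitor_workflow(params, workflow_id):
        # `context` is not defined in this function; use the `params` argument
        hook = NextflowTowerHook(params["conn_id"])
        workflow = hook.ops.get_workflow(workflow_id)
        return PokeReturnValue(workflow.is_done, workflow.state)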
Or the class-based equivalent of the above.
This PR migrates the async function that I was using in ntap-add5-scripts to monitor workflow runs.
While working on this, I realized that the refactoring I did in #21 eliminated the need for get_workflow_status(), so I removed that function.
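As a rough illustration of what an async monitoring loop of this kind might look like (this is not the code from the PR), the sketch below polls a workflow until it reaches a terminal state. `FakeOps`, the minimal `Workflow` class, the set of terminal states, and the `wait_time` parameter are all assumptions made for the example; only `get_workflow()`, `is_done`, and `state` come from the discussion above.

```python
import asyncio
from dataclasses import dataclass

# Assumed set of terminal states; the real project may define others.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


@dataclass
class Workflow:
    """Hypothetical minimal workflow record."""

    id: str
    state: str

    @property
    def is_done(self) -> bool:
        return self.state in TERMINAL_STATES


class FakeOps:
    """Hypothetical client that reports a scripted sequence of states."""

    def __init__(self, states):
        self._states = list(states)

    def get_workflow(self, workflow_id: str) -> Workflow:
        # Consume the next scripted state, then keep repeating the last one.
        state = self._states.pop(0) if len(self._states) > 1 else self._states[0]
        return Workflow(id=workflow_id, state=state)


async def monitor_workflow(ops, workflow_id: str, wait_time: float = 0.01) -> str:
    """Poll until the workflow is done, then return its final state."""
    workflow = ops.get_workflow(workflow_id)
    while not workflow.is_done:
        await asyncio.sleep(wait_time)  # yield to other tasks between polls
        workflow = ops.get_workflow(workflow_id)
    return workflow.state


ops = FakeOps(["SUBMITTED", "RUNNING", "RUNNING", "SUCCEEDED"])
final_state = asyncio.run(monitor_workflow(ops, "abc123"))
```

Because the workflow object exposes `is_done` and `state` directly, the loop does not need a separate status lookup, which is consistent with removing `get_workflow_status()`.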