Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow 1 restart per task #82

Merged
merged 4 commits into from
Aug 28, 2024
Merged

Allow 1 restart per task #82

merged 4 commits into from
Aug 28, 2024

Conversation

matthdsm
Copy link
Collaborator

This should fix a concurrency issue with the CSI driver
cfr ceph/ceph-csi#3511 and hashicorp/nomad#15197

This should fix a concurrency issue with the CSI driver
ceph/ceph-csi#3511
hashicorp/nomad#15197
@abhi18av
Copy link
Member

abhi18av commented Aug 28, 2024

@matthdsm , would this trigger a nomad-native retry or is this expecting Nextflow to trigger this?

As of now, we have a model of 1 nextflow task -> 1 nomad job -> group -> task and therefore, error at nomad job level reflects error at nextflow level and the overall behaviour aligns with what Nextflow would expect.

How nomad-native retries would interact with the overall setup needs to be tested.

@abhi18av abhi18av marked this pull request as draft August 28, 2024 13:48
@jagedn
Copy link
Collaborator

jagedn commented Aug 28, 2024

If I'm not wrong, before to have this constants we're using the default value (3) and Nextflow wait correctly for the completion of a failed job, so maybe can be a good idea to use 1 as default

@abhi18av
Copy link
Member

abhi18av commented Aug 28, 2024

If I'm not wrong, before to have this constants we're using the default value (3) and Nextflow wait correctly for the completion of a failed job, so maybe can be a good idea to use 1 as default

It seems that 3 is the default for these attempts #82 (comment)

@jagedn , do you think its worth exposing the closure and including other fields like delay , interval etc?

@abhi18av abhi18av marked this pull request as ready for review August 28, 2024 16:01
@abhi18av
Copy link
Member

Okay, merging this and tagging this realease as 0.2.0-edge3 to release a build

@abhi18av abhi18av merged commit e38b31f into master Aug 28, 2024
1 check passed
@abhi18av abhi18av deleted the fix/task_retry branch August 28, 2024 16:04
@jagedn
Copy link
Collaborator

jagedn commented Aug 28, 2024

@jagedn , do you think its worth exposing the closure and including other fields like delay , interval etc?

maybe we can implement all of them when required

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants