Disable AWS spot retry #5215

Merged
pditommaso merged 2 commits into master from disable-aws-spot-retry
Aug 12, 2024

pditommaso
Member

@pditommaso pditommaso commented Aug 9, 2024

This PR disables the automatic retry made by AWS Batch when a spot instance is reclaimed.

The main reasons to disable this capability are:

  • The same task can be retried multiple times, significantly increasing spending without the user being aware of it.
  • The AWS automatic retry re-executes a task in the same working directory because the retry is not directly managed by Nextflow. This can introduce nasty side effects from partial or corrupted data left by a previous execution.
  • There is no log or visual feedback during the pipeline execution, because the retry is managed directly by AWS Batch.
  • It keeps the behavior aligned with the same feature in Google Batch, which is disabled by default.

Users can still enable this capability by setting the following option:

aws.batch.maxSpotAttempts = n

where n is an integer > 0
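
As a minimal sketch of how this might look in a `nextflow.config` file: the value `3` below is an arbitrary example, not a recommendation from this PR. The commented-out alternative uses the standard Nextflow `errorStrategy` and `maxRetries` process settings and is an assumption about how one might get retries that Nextflow tracks itself; it is not part of this change.

```groovy
// nextflow.config (illustrative values only)

// Opt back in to the AWS Batch automatic retry on spot reclamation.
// Any integer > 0 enables it; 3 is just an example.
aws.batch.maxSpotAttempts = 3

// Alternative (assumption, not from this PR): let Nextflow manage retries.
// Note that this retries on any task failure, not only spot reclamation.
// process.errorStrategy = 'retry'
// process.maxRetries    = 2
```

Leaving `aws.batch.maxSpotAttempts` unset keeps the AWS-side retry disabled, which is the behavior this PR introduces.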

pditommaso and others added 2 commits August 9, 2024 14:56
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>

netlify bot commented Aug 9, 2024

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 263280f
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/66b6495669c6b60008f2688e
😎 Deploy Preview https://deploy-preview-5215--nextflow-docs-staging.netlify.app

@bentsherman
Member

Need to also update the docs for aws.batch.maxSpotAttempts with the new default value and a callout that the default value was changed.

@bentsherman bentsherman requested a review from a team as a code owner August 9, 2024 16:52
@bentsherman
Member

Do you think we should do the same for google.batch.maxSpotAttempts?

@pditommaso
Member Author

Good point, yes. For some reason I was convinced that Google was defaulting to zero.

@pditommaso
Member Author

Opened #5223 to handle the Google one.

@pditommaso pditommaso merged commit f28fcb2 into master Aug 12, 2024
5 checks passed
@pditommaso pditommaso deleted the disable-aws-spot-retry branch August 12, 2024 16:16
pditommaso added a commit that referenced this pull request Sep 4, 2024
This commit disables the AWS Batch spot auto-retries. 

The main reasons to disable this capability are:

* The same task can be retried multiple times, significantly increasing spending without the user being aware of it.
* The AWS automatic retry re-executes a task in the same working directory because the retry is not directly managed by Nextflow. This can introduce nasty side effects from partial or corrupted data left by a previous execution.
* There is no log or visual feedback during the pipeline execution, because the retry is managed directly by AWS Batch.

Users can still enable this capability by setting the following option:

```
aws.batch.maxSpotAttempts = n
```

where n is an integer > 0


Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Co-authored-by: Ben Sherman <bentshermann@gmail.com>
nschan pushed a commit to nschan/nextflow that referenced this pull request Sep 12, 2024
Signed-off-by: Niklas Schandry <niklas@bio.lmu.de>