Automatically clean up AWS Batch temporary folder #1450

sminot · 2020-01-09T17:02:15Z

New feature

When using the awsbatch executor, there is no reason to keep files in the temporary directory after a process has finished execution. I suggest that Nextflow automatically delete all of the files in the ephemeral temporary directory after execution.

The accumulation of large files in the temporary directory of AWS Batch workers can cause big problems when they fill up the partition, effectively blocking an entire workflow. While it is possible to write a workflow which fixes this with afterScript "rm -r *", such a solution is entirely incompatible with any executor which uses a shared filesystem (local, SLURM, etc.).

Usage scenario

The main use case is a user running the awsbatch executor. The desired behavior is that, when they run a workflow, usage of the scratch partition is kept to a minimum, storing only those files which are being used by actively running tasks. Currently, long-running workflows accumulate files in the scratch partition, eventually filling it up and completely locking up those workers.

For the developer, this improvement would also mean that I can take out the afterScript "rm -r *" command in my processes, which will make them easily compatible with local execution modes, and which will also protect me against running out of space in the scratch partition on AWS Batch.

Suggest implementation

I regret to say that I do not understand the Nextflow codebase enough to suggest the most efficient implementation of this idea.

pditommaso · 2020-01-13T14:31:02Z

This looks like a duplicate of #452, for which there's no quick solution, though the plan is to tackle it in a more general manner at some point.

Have you considered using an S3 lifecycle policy to clean up the bucket? It works beautifully.

https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html

sminot · 2020-01-13T14:34:53Z

My apologies, I was hoping to explain how this is a really different issue and concern.

There is a group of files staged to the working instance in an ephemeral scratch directory which is distinct from the S3 bucket used for the working directory. That's the set of files I want to target in this issue.

Does that help explain?

sminot · 2020-01-13T14:42:52Z

If it were possible to write this code in the process, it would constitute a reasonable solution:

if executor == 'awsbatch':
    afterScript "rm -r *"

Note that this would not impact any files in the work directory
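
One way to approximate this conditional without Nextflow-level support is to have the cleanup command itself check where it is running. The following is a minimal sketch, not an established Nextflow pattern. It assumes that the `AWS_BATCH_JOB_ID` environment variable, which AWS Batch sets inside job containers, is visible wherever the cleanup runs, and that the cleanup is invoked from the task's scratch directory. The function name `cleanup_if_batch` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the contents of a directory only when
# running inside an AWS Batch job.
# Assumption: AWS_BATCH_JOB_ID is exported in the task environment.
cleanup_if_batch() {
    target_dir="${1:-.}"    # default to the current directory
    if [ -n "${AWS_BATCH_JOB_ID:-}" ]; then
        # ${target_dir:?} aborts rather than expanding to "/*" if the
        # variable is ever empty. Dotfiles are not matched by the glob.
        rm -rf -- "${target_dir:?}"/*
    fi
}
```

On an executor with a shared filesystem the variable is unset, so the function does nothing and the work directory is left untouched.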

pditommaso · 2020-01-13T14:45:47Z

I see. But that's already done: if you look at the command launcher, you will see:

on_exit() {
    exit_status=${nxf_main_ret:=$?}
    printf $exit_status | /home/ec2-user/miniconda/bin/aws --region eu-west-1 s3 cp --only-show-errors - s3://nf-course/work/ad/84c59e22b4b0d4dd038b35e9885a05/.exitcode || true
    set +u
    [[ "$tee1" ]] && kill $tee1 2>/dev/null
    [[ "$tee2" ]] && kill $tee2 2>/dev/null
    [[ "$ctmp" ]] && rm -rf $ctmp || true
    rm -rf $NXF_SCRATCH || true
    exit $exit_status
}

sminot · 2020-01-13T14:50:19Z

Oh, that's great to know! Does it only get invoked on job success? Or whenever a process finishes for any reason?

pditommaso · 2020-01-13T14:51:05Z

It should be invoked in all cases.
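
The snippet quoted earlier does not show how `on_exit` is registered. The sketch below assumes it is hooked up with `trap on_exit EXIT`, and illustrates why such a handler runs on both success and failure. `run_task` is a hypothetical stand-in for the launcher, and its first argument plays the role of `$NXF_SCRATCH`.

```shell
#!/usr/bin/env bash
# Sketch: a handler registered with `trap ... EXIT` runs whether the
# wrapped command succeeds or fails.
# run_task is hypothetical; usage: run_task <scratch_dir> <command> [args...]
run_task() {
    (
        scratch=$1          # stand-in for $NXF_SCRATCH
        shift
        on_exit() {
            exit_status=$?              # status of the wrapped command
            rm -rf "$scratch" || true   # clean up the scratch directory
            exit "$exit_status"         # propagate the original status
        }
        trap on_exit EXIT   # assumption: the launcher registers it this way
        cd "$scratch" && "$@"
        exit $?
    )
}
```

Calling `run_task "$(mktemp -d)" false` returns status 1 and the scratch directory is still removed. An EXIT trap cannot run if the shell is killed with SIGKILL or the instance itself is terminated, so those cases fall outside this sketch.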

sminot · 2020-01-13T14:58:00Z

Ok, in that case I'm all set!

Thank you for your quick and helpful response!

sminot closed this as completed Jan 13, 2020
