aws-batch: support Snakemake --report
#373
Comments
Hmm. Downloading the Snakemake state locally may fix this problem, but it can/will cause other problems. I don't know if it'd be ok if we scope it down to not all of What's the effect of the warnings? Is there useful information missing from the report? Or is it just wanting to suppress the noise from Snakemake?
The generated report does not include any runtime info:
Ah, looking more closely at the contents of
We should do this, per above. I'll open a PR.
We could also do this as well, but it requires a little more consideration about how/where/when. Would you open it as a separate issue if you'd like to see it?
This will also need a new docker-base image, as the same exclusions of
Snakemake stores state information per-input/output here and uses it to determine if it needs to re-run rules or not. It seems akin to the file mtimes which we already take care to preserve on upload/download. Additionally, the metadata recorded is used in Snakemake's report generation and is generally useful for looking at workflow statistics. Continue to not upload all of .snakemake/ en masse because it can potentially contain files that interfere with local usage and/or are large and unnecessary. Related-to: <nextstrain/cli#373>
Snakemake stores state information per input/output here and uses it to determine if it needs to re-run rules or not. It seems akin to the file mtimes which we already take care to preserve on upload/download. Additionally, the metadata recorded is used in Snakemake's report generation and is generally useful for looking at workflow statistics. Continue to not download all of .snakemake/ en masse because it can potentially contain files that interfere with local usage and/or are large and unnecessary. Resolves: <#373> Related-to: <nextstrain/docker-base#220>
Snakemake stores state information per input/output here and uses it to determine if it needs to re-run rules or not. It seems akin to the file mtimes which we already take care to preserve on upload/download. Additionally, the metadata recorded is used in Snakemake's report generation and is generally useful for looking at workflow statistics. Continue to not upload all of .snakemake/ en masse because it can potentially contain files that interfere with local usage and/or are large and unnecessary. Resolves: <nextstrain/cli#373> Related-to: <nextstrain/cli#374>
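For anyone who wants to look at the workflow statistics these commit messages mention, here's a rough sketch that sums runtime per rule from the downloaded metadata records. It assumes each file under .snakemake/metadata is a JSON object with "rule", "starttime", and "endtime" keys (Unix timestamps). That layout is an assumption, not something confirmed in this thread, and `runtime_by_rule` is a hypothetical helper, not part of Snakemake or the Nextstrain CLI.

```python
import json
from collections import defaultdict
from pathlib import Path


def runtime_by_rule(metadata_dir: str) -> dict:
    """
    Sum wall-clock seconds per rule from JSON records in `metadata_dir`.

    Assumption (not confirmed by this thread): each record is a JSON object
    with "rule", "starttime", and "endtime" keys, the times being Unix
    timestamps.  Records that don't fit that shape are skipped.
    """
    totals = defaultdict(float)

    for record_path in Path(metadata_dir).rglob("*"):
        if not record_path.is_file():
            continue
        try:
            record = json.loads(record_path.read_text())
        except (OSError, ValueError):
            continue  # unreadable file or not valid JSON

        if not isinstance(record, dict):
            continue

        rule = record.get("rule")
        start = record.get("starttime")
        end = record.get("endtime")
        if rule is None:
            continue
        if not isinstance(start, (int, float)) or not isinstance(end, (int, float)):
            continue  # job may be incomplete or the field names may differ

        totals[rule] += end - start

    return dict(totals)
```

Calling `runtime_by_rule(".snakemake/metadata")` after downloading a finished build would give a dict of rule name to total seconds, provided the field-name assumption holds for your Snakemake version.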
Hmm, maybe this doesn't need to be built into the Nextstrain CLI. It could just be a separate step in the
Totally.
Context
Snakemake has removed the --stats option in v8, so I'm looking into the --report option for long term workflow stats.

The Snakemake report must be generated after the workflow has finished. I thought this would be as simple as attaching/downloading an old AWS Batch job then running nextstrain build . --report.

When I did this for ncov-ingest, I saw a bunch of warnings along the lines of:
I then realized we are explicitly excluding Snakemake state in the downloads from AWS Batch:
cli/nextstrain/cli/runner/aws_batch/s3.py
Lines 113 to 124 in 8ed779c
Possible solutions
.snakemake/metadata in the downloads from AWS Batch so that users can generate the Snakemake report locally.

[2] definitely seems like the nicer option and maybe should be applied across all runtimes for nextstrain build?
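To make the scoped-down idea concrete (keep excluding .snakemake/ as a whole, but let .snakemake/metadata through), here's a minimal sketch of such a path filter. This is not the code in s3.py; `should_download`, `EXCLUDED_DIRS`, and `INCLUDED_SUBDIRS` are hypothetical names made up for illustration, and the real implementation may be structured quite differently.

```python
from pathlib import PurePosixPath

# Hypothetical names for illustration; not the Nextstrain CLI's actual API.
EXCLUDED_DIRS = (".snakemake",)
INCLUDED_SUBDIRS = (".snakemake/metadata",)


def _is_within(path: PurePosixPath, parent: str) -> bool:
    """True if `path` is `parent` itself or lives somewhere beneath it."""
    parent_path = PurePosixPath(parent)
    return path == parent_path or parent_path in path.parents


def should_download(relative_path: str) -> bool:
    """
    Decide whether a file from the remote build directory should be
    downloaded.  Explicit inclusions win over exclusions.
    """
    path = PurePosixPath(relative_path)

    if any(_is_within(path, d) for d in INCLUDED_SUBDIRS):
        return True
    if any(_is_within(path, d) for d in EXCLUDED_DIRS):
        return False
    return True
```

With this, `should_download(".snakemake/metadata/abc")` and `should_download("results/tree.nwk")` are true, while `should_download(".snakemake/log/run.log")` is false. Checking inclusions first keeps the rule easy to reason about: anything under .snakemake/ stays out unless it is explicitly listed.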