-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrangler in sif #54
Wrangler in sif #54
Conversation
@@ -1,6 +1,6 @@ | |||
configfile: 'variant_calling.yaml' | |||
output_folder='/opt/analysis' | |||
log_folder=config['output_directory']+'/run_settings' | |||
log_folder='/opt/analysis/run_settings' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense - since snakemake is now inside the sif, run settings are now getting sent to a sif file internal location
user_scripts/variant_calling.sh
Outdated
@@ -43,6 +43,7 @@ singularity_bindings="-B $project_resources:/opt/project_resources | |||
-B $species_resources:/opt/species_resources | |||
-B $wrangler_directory:/opt/data | |||
-B $output_directory:/opt/analysis | |||
-B /home/charlie/projects/MIPTools_wrangler_in_sif/snakemake:/opt/snakemake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This location probably shouldn't be hard coded. Maybe it gets changed in a later commit?
@@ -1,3 +1,9 @@ | |||
######################################################## | |||
# README | |||
# this file uses variant_calling.yaml for its parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping variant_calling and check_run_stats with the same yaml file makes sense - variant_calling depends on the outputs of check_run_stats to only perform variant calling on the haplotypes that mapped to correct locations of genome
@@ -0,0 +1,61 @@ | |||
################################################# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this is a subset of the variant_calling.sh script (only doing the stats part, for people who don't want to commit to a week and a half of variant calling but still want to get the basic run stats that variant calling performs at the beginning)
user_scripts/wrangler_by_sample.yaml
Outdated
input_sample_sheet_directory: /home/charlie/projects/miptools_data/miptools_test-data/test_data | ||
|
||
input_sample_sheet_name: sample_list.tsv | ||
input_sample_sheet: /home/charlie/projects/miptools_data/miptools_test-data/test_data/sample_list.tsv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice fix - decreases the number of variables a user has to enter into the yaml file
user_scripts/wrangler_by_sample.sh
Outdated
@@ -43,6 +55,7 @@ singularity_bindings="-B $project_resources:/opt/project_resources | |||
-B $output_folder:/opt/analysis | |||
-B $input_sample_sheet_directory:/opt/input_sample_sheet_directory | |||
-B $fastq_dir:/opt/data | |||
-B /home/charlie/projects/MIPTools_wrangler_in_sif/snakemake:/opt/snakemake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hoping this path isn't hardcoded into the final shell script. Will look for a future commit that removes it
eval $(parse_yaml wrangler_by_sample.yaml) | ||
|
||
function parse_sample_sheet_directory { | ||
readarray -d "/" -t strarr <<< "$input_sample_sheet" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cool Unix function for splitting up a string into an array. I'll have to try and remember this, thanks for sharing it
user_scripts/wrangler_by_sample.sh
Outdated
@@ -43,7 +43,6 @@ singularity_bindings="-B $project_resources:/opt/project_resources | |||
-B $output_folder:/opt/analysis | |||
-B $input_sample_sheet_directory:/opt/input_sample_sheet_directory | |||
-B $fastq_dir:/opt/data | |||
-B /home/charlie/projects/MIPTools_wrangler_in_sif/snakemake:/opt/snakemake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent - I see this is where the hardcoded shell script bindings are getting taken back out. I also get why you added them in the first place now - allowed you to test changes to snakemake without having to build a new sif
|
||
rule copy_files: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following why the rule "copy_files" is not needed anymore. I think you (Charlie) explained this to me but I can't remember it. Is it because the setup run snakemake script already takes care of the copying?
setup_snakefile='setup_run.smk', | ||
finish_snakefile='finish_run.smk', | ||
input_configfile='wrangler_by_sample.yaml', | ||
in_scripts='scripts' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the rule 'extract_by_arm" not need any input, or just not need these old copy_files inputs? github parsed the changes to look like no input is needed
script for this. Input is an arms file and a sample sheet. Output is an arms | ||
file with rearranged columns and a two column file with names of all mips | ||
and names of all samples (with no pairing between columns of any given row). | ||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes - this rule probably isn't needed if it got moved to the setup_run.smk script
mem_mb=40000, | ||
time_min=1440, | ||
nodes=20 | ||
threads: 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where the copy_files rule got moved to, I see now. It looks like github swapped the file contents of the setup and finish_run steps somehow, and showed the changes that would convert one file into the other as if they had been made line by line in each file (instead of swapping out the entire contents of both files)
Incorporating Charlie's edits to put miptools wrangler snakemake in a sif file