Wrangler in sif #54

Merged: 12 commits, Jan 16, 2024
2 changes: 1 addition & 1 deletion snakemake/wrangler_by_sample_setup.smk
@@ -44,7 +44,7 @@ rule generate_mip_files:
'''
input:
arms_file='/opt/project_resources/mip_ids/mip_arms.txt',
- sample_sheet='/opt/input_sample_sheet_directory/'+config['input_sample_sheet_name'],
+ sample_sheet='/opt/input_sample_sheet_directory/'+config['input_sample_sheet'].split('/')[-1],
fastq_folder='/opt/data'
params:
sample_set=config['sample_set_used'],
13 changes: 13 additions & 0 deletions user_scripts/wrangler_by_sample.sh
@@ -29,8 +29,20 @@ function parse_yaml {
}'
}


eval $(parse_yaml wrangler_by_sample.yaml)

+function parse_sample_sheet_directory {
+  readarray -d "/" -t strarr <<< "$input_sample_sheet"

Review comment (Contributor, Author): `readarray` is a handy Bash builtin for splitting a string into an array. I'll have to remember this; thanks for sharing it.

+  for (( n=1; n < ${#strarr[*]}-1; n++))
+  do
+    input_sample_sheet_directory+="/${strarr[n]}"
+  done
+  echo $input_sample_sheet_directory
+}
+input_sample_sheet_directory=$(parse_sample_sheet_directory)
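
Note: the directory/file split performed by `parse_sample_sheet_directory` (and by `.split('/')[-1]` in the Snakemake rule) can also be sketched with `dirname` and `basename`. This is a minimal illustration, not the code merged in this PR: the helper name `split_sample_sheet_path`, the variables `sheet_dir` and `sheet_name`, and the example path are all hypothetical. The `readarray -d` form is shown alongside for comparison; it needs Bash 4.4 or newer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): split a full path into its
# directory and file name using standard utilities.
split_sample_sheet_path() {
    local path="$1"
    sheet_dir=$(dirname -- "$path")    # everything before the last "/"
    sheet_name=$(basename -- "$path")  # the final path component
}

# Illustrative path only; the real value comes from wrangler_by_sample.yaml.
example_path="/data/run1/sample_list.tsv"

# The readarray technique from the PR, applied to the example path.
# parts[0] is empty (text before the leading "/"), and the last element
# holds the file name plus the newline that the here-string appends.
readarray -d "/" -t parts <<< "$example_path"
rebuilt=""
for (( n = 1; n < ${#parts[@]} - 1; n++ )); do
    rebuilt+="/${parts[n]}"
done

split_sample_sheet_path "$example_path"
echo "$rebuilt"     # directory rebuilt from the array
echo "$sheet_dir"   # directory from dirname
echo "$sheet_name"  # file name from basename
```

Both approaches give the same directory for an absolute path. The `dirname` form also handles relative paths, whose first component the loop above would skip because it starts at index 1.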


############################
# setup the run
##########################
@@ -43,6 +55,7 @@ singularity_bindings="-B $project_resources:/opt/project_resources
-B $output_folder:/opt/analysis
-B $input_sample_sheet_directory:/opt/input_sample_sheet_directory
-B $fastq_dir:/opt/data
+ -B /home/charlie/projects/MIPTools_wrangler_in_sif/snakemake:/opt/snakemake

Review comment (Contributor, Author): I'm hoping this path isn't hardcoded into the final shell script. I'll look for a future commit that removes it.

-H $newhome"

snakemake_args="--cores $cpu_count --keep-going --rerun-incomplete --latency-wait 60"
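
Note: one possible way to avoid the hardcoded `/opt/snakemake` bind flagged in the review comment, sketched under assumptions, is to make that bind optional and driven by a setting. The `snakemake_dir` value and the `build_snakemake_binding` helper below are hypothetical; neither exists in this PR, and the path shown is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): emit the extra bind option only
# when a development snakemake directory has been supplied.
build_snakemake_binding() {
    local snakemake_dir="${1:-}"
    if [[ -n "$snakemake_dir" ]]; then
        # "--" stops printf from treating the leading "-B" as an option.
        printf -- '-B %s:/opt/snakemake' "$snakemake_dir"
    fi
}

# With an override (placeholder path), the bind option is produced.
dev_binding=$(build_snakemake_binding "/path/to/checkout/snakemake")
# Without an override, nothing is added and the container's own copy is used.
no_binding=$(build_snakemake_binding "")

echo "with override: $dev_binding"
echo "without override: ${no_binding:-<none>}"
```

The result could then be appended to `singularity_bindings` in place of the fixed line, so users without a local checkout never see a machine-specific path.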
8 changes: 3 additions & 5 deletions user_scripts/wrangler_by_sample.yaml
@@ -13,7 +13,7 @@ downsample_seed: 312
#how many CPUs (or threads) to use in parallel - you can set this relatively low
#(e.g. 10) because most of the intensive steps of this program are parallelized
#across samples (i.e. 1000 parallel processes if you have 1,000 samples).
- cpu_count: 5
+ cpu_count: 16

#This applies to the most memory intensive steps of the pipeline. Lower values
#complete faster on a cluster but may crash out for bigger samples or highly
@@ -24,9 +24,7 @@ cpu_count: 5
memory_mb_per_step: 20000

#location of sample sheet
- input_sample_sheet_directory: /home/charlie/projects/miptools_data/miptools_test-data/test_data
-
- input_sample_sheet_name: sample_list.tsv
+ input_sample_sheet: /home/charlie/projects/miptools_data/miptools_test-data/test_data/sample_list.tsv

Review comment (Contributor, Author): This is a nice fix. It reduces the number of variables a user has to enter into the YAML file.


#location of project resources
project_resources: /home/charlie/projects/miptools_data/miptools_test-data/DR1_project_resources
@@ -35,7 +33,7 @@ project_resources: /home/charlie/projects/miptools_data/miptools_test-data/DR1_p
fastq_dir: /home/charlie/projects/miptools_data/miptools_test-data/test_data/fastq

#location of sif file to use
- miptools_sif: /home/charlie/Downloads/MIPTools/wrangler_in_sif.sif
+ miptools_sif: /home/charlie/projects/miptools_data/wrangler_in_sif.sif

#only rows from the sample sheet that contain exact matches to the probe sets
#listed here (after splitting the probe_set column with commas) will be analyzed