hyphens in sample names causing "snakemake invocation error" #187

Closed
hehouts opened this issue Apr 20, 2022 · 3 comments · Fixed by #188

hehouts commented Apr 20, 2022

I have 32 virome metagenomes (not downloaded from the SRA) that have hyphens in their sample names (e.g. SM-6OSOR, SM-6UG79, SM-6WJN6), in /home/hehouts/dynamic-duos-virome/grist/Archive/outputs.s5_vir/abundtrim/.

I tried running grist with genome-grist run config_s5_vir.yml summarize_mapping -j8, using this config_s5_vir.yml:

prevent_sra_download: true
samples:
- SM-6UG79
- SM-7I1G8
- SM-7K25Z
# ...etc

outdir: outputs.s5_vir

prefetch_memory: 20e9

sourmash_database_ksize: 21
sourmash_scaled: 200
sourmash_database_threshold_bp: 100

sourmash_databases:
- /group/ctbrowngrp/scratch/ctbrown/gut-phage-db.grist/GPD_sequences.zip

local_databases_info:
- /group/ctbrowngrp/scratch/ctbrown/gut-phage-db.grist/GPD_sequences.info.csv

taxonomies:
- /group/ctbrowngrp/genbank/all_genbank_lineages.20200727.csv

It's returning an error:

MissingInputException in line 410 of /home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile:             
Missing input files for rule summarize_mapping:     
outputs.s5_vir/reports/report-mapping-SM-6OSOR.html                                                                                           
outputs.s5_vir/reports/report-mapping-SM-7IL1E.html                                                                                           
outputs.s5_vir/reports/report-gather-SM-7K25Z.html     
... etc ...
Error in snakemake invocation: Command '['snakemake', '-s', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile', '-j', '1', '--use-conda', 'summarize_mapping', '--rerun-incomplete', '-j8', '--configfile', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/defaults.conf', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/system.conf', 'config_s5_vir.yml']' returned non-zero exit status 1.

I think it's weird that if I run genome-grist run config_s5_vir.yml smash_reads -n, I get a similar error:

MissingInputException in line 195 of /home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile:
Missing input files for rule smash_reads:
outputs.s5_vir/sigs/SM-7I1G8.abundtrim.sig.gz
outputs.s5_vir/sigs/SM-7K25Z.abundtrim.sig.gz
...etc
Error in snakemake invocation: Command '['snakemake', '-s', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile', '-j', '1', '--use-conda', 'smash_reads', '-n', '--configfile', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/defaults.conf', '/home/hehouts/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/system.conf', 'config_s5_vir.yml']' returned non-zero exit status 1.

But if I run genome-grist run config_s5_vir.yml trim_reads -n, it recognizes that the input files are there:

samples: ['SM-6UG79', 'SM-71WY3', 'SM-7CRX4', 'SM-6X9WV', 'SM-76CAU', 'SM-7CP3H', 'SM-7EWTT', 'SM-6OSOR', 'SM-6Y2V3', 'SM-77VQ4', 'SM-7EWUL', 'SM-6WJN6', 'SM-6X9X4', 'SM-6YAZO', 'SM-6ZKSM', 'SM-71WXM', 'SM-73JY4', 'SM-76EOJ', 'SM-791BX', 'SM-7BF2J', 'SM-7CRJ8', 'SM-7EWUH', 'SM-7GYJO', 'SM-7IL1E', 'SM-7KPUR', 'SM-7MOQJ', 'SM-6WOCE', 'SM-6ZEVI', 'SM-7AA2E', 'SM-7BP5L', 'SM-7I1G8', 'SM-7K25Z']
outdir: outputs.s5_vir
Building DAG of jobs...
Nothing to be done (all requested files are present and up to date).
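
Since the dry runs above disagree about whether inputs are present, it can help to check the expected paths on disk directly. What follows is a minimal sketch, assuming the layout seen in the smash_reads error (outdir/sigs/SAMPLE.abundtrim.sig.gz). The helper name missing_outputs and its pattern argument are made up for illustration and are not part of genome-grist.

```python
from pathlib import Path


def missing_outputs(outdir, samples, pattern="sigs/{sample}.abundtrim.sig.gz"):
    """Return the expected per-sample paths under outdir that are not on disk.

    The default pattern is copied from the paths in the smash_reads error;
    it is an assumption about the layout, not a genome-grist API.
    """
    root = Path(outdir)
    expected = [root / pattern.format(sample=sample) for sample in samples]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    # Sample names and outdir taken from the config in this issue.
    samples = ["SM-7I1G8", "SM-7K25Z"]
    for path in missing_outputs("outputs.s5_vir", samples):
        print("missing:", path)
```

If the files really are absent, the error means that no rule could be matched to produce them. That is a different problem from Snakemake failing to see files that do exist.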

I tried changing the sample names to use underscores instead of hyphens, both in /home/hehouts/dynamic-duos-virome/grist/Archive/outputs.virtest/abundtrim and in its config /home/hehouts/dynamic-duos-virome/grist/Archive/config_virtest.yml, and it appears to be working!

So it seems like the hyphens are causing the problem.
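
The underscore workaround can be scripted roughly as follows. This is a sketch under assumptions: it supposes every file for a sample is named SAMPLE.SUFFIX inside one directory, and the helper names (underscore_name, plan_renames, apply_renames) are hypothetical. The samples list in the config has to be edited to match separately.

```python
from pathlib import Path


def underscore_name(sample):
    """Replace every hyphen in a sample name with an underscore."""
    return sample.replace("-", "_")


def plan_renames(directory, samples):
    """Return (old_path, new_path) pairs for files named '<sample>.<suffix>'.

    The '<sample>.<suffix>' naming is an assumption about the directory layout.
    """
    pairs = []
    for sample in samples:
        new_sample = underscore_name(sample)
        if new_sample == sample:
            continue  # nothing to change for this sample
        for path in sorted(Path(directory).glob(sample + ".*")):
            suffix = path.name[len(sample):]
            pairs.append((path, path.with_name(new_sample + suffix)))
    return pairs


def apply_renames(pairs, dry_run=True):
    """Print each planned rename; perform it only when dry_run is False."""
    for old, new in pairs:
        if new.exists():
            raise FileExistsError(f"refusing to overwrite {new}")
        print(f"{old} -> {new}")
        if not dry_run:
            old.rename(new)


if __name__ == "__main__":
    # Dry run only: prints what would be renamed, changes nothing.
    apply_renames(plan_renames("outputs.s5_vir/abundtrim", ["SM-6OSOR", "SM-6UG79"]))
```

Running it with dry_run=True first shows the planned renames before any file is touched.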

Member

ctb commented Apr 20, 2022 via email

Member

ctb commented Apr 21, 2022

Results of my investigation match yours! Hyphens don't work, underscores do. Wat?
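
One general way that hyphens alone can break file matching: Snakemake lets a workflow restrict wildcards with regular expressions (wildcard_constraints), and a pattern such as \w+ accepts underscores but not hyphens. The sketch below only illustrates that regex behaviour. Both patterns are hypothetical, and this thread does not say whether this is what #188 changed.

```python
import re

# Hypothetical constraint: word characters only (letters, digits, underscore).
STRICT = re.compile(r"\w+")
# Hypothetical variant with the hyphen explicitly allowed.
PERMISSIVE = re.compile(r"[\w-]+")


def matches(pattern, name):
    """True if the whole sample name satisfies the compiled pattern."""
    return pattern.fullmatch(name) is not None


for name in ["SM-6OSOR", "SM_6OSOR"]:
    print(name, matches(STRICT, name), matches(PERMISSIVE, name))
# prints:
# SM-6OSOR False True
# SM_6OSOR True True
```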

Member

ctb commented Apr 21, 2022

Figured it out: #188.

ctb closed this as completed in #188 on Sep 23, 2022