Usage of --sharded #172

Open
Rridley7 opened this issue Jun 18, 2023 · 0 comments

Hi, thanks again for developing this tool. I have a quick question about the use of the --sharded flag. I believe my reference fits the use case, in which a single index would not fit into memory. My expectation was that if I pass multiple references to the -r option, the reads would be mapped to each reference, and the best match across them would then be selected. My current command is:

REF_FILE1=02_map_binned/side_test/bwa_idx/part1
REF_FILE2=02_map_binned/side_test/bwa_idx/part2
BAM_DIR=02_bam_files

TMPDIR=tmp_dir coverm contig -m mean -r $REF_FILE1 $REF_FILE2  \
--output-file test_shard.tsv  -p bwa-mem2 --sharded \
--min-read-percent-identity 0.95 --min-read-aligned-percent 0.95 -t 24 \
--bam-file-cache-directory $BAM_DIR  --no-zeros --single unbinned_nr_genes_00[345].ffn.gz

This produces the following output:

[2023-06-18T04:20:00Z INFO  bird_tool_utils::clap_utils] CoverM version 0.6.1
[2023-06-18T04:20:00Z INFO  coverm] Using min-read-percent-identity 95%
[2023-06-18T04:20:00Z INFO  coverm] Using min-read-aligned-percent 95%
[2023-06-18T04:20:00Z INFO  coverm] Writing output to file: test_shard.tsv
[2023-06-18T04:20:00Z INFO  coverm] Using min-covered-fraction 0%
[2023-06-18T04:20:01Z INFO  bird_tool_utils::external_command_checker] Found bwa-mem2 version 2.2.1
[2023-06-18T04:20:01Z INFO  bird_tool_utils::external_command_checker] Found samtools version 1.16.1
[2023-06-18T04:20:01Z INFO  coverm] Writing BAM files to already existing directory 02_bam_files
[2023-06-18T04:20:01Z INFO  coverm::mapping_index_maintenance] BWA index appears to be complete, so going ahead and using it.
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_003.ffn.gz.bam
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_004.ffn.gz.bam
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_005.ffn.gz.bam
[2023-06-18T04:20:01Z INFO  coverm::mapping_index_maintenance] BWA index appears to be complete, so going ahead and using it.
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_003.ffn.gz.bam
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_004.ffn.gz.bam
[2023-06-18T04:20:01Z INFO  coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_005.ffn.gz.bam
[2023-06-18T04:20:30Z INFO  coverm::contig] In sample 'part1/unbinned_nr_genes_003.ffn.gz', found 4666 reads mapped out of 567290 total (0.82%)
[2023-06-18T04:20:57Z INFO  coverm::contig] In sample 'part1/unbinned_nr_genes_004.ffn.gz', found 4967 reads mapped out of 567387 total (0.88%)
[2023-06-18T04:21:26Z INFO  coverm::contig] In sample 'part1/unbinned_nr_genes_005.ffn.gz', found 4801 reads mapped out of 567706 total (0.85%)
[2023-06-18T04:21:54Z ERROR coverm::coverage_takers] Found a difference amongst the reference sets used for mapping. For this (non-streaming) usage of CoverM, all BAM files must have the same set of reference sequences. Previous entry was contig-196890-spa-t-S25_9i6072_2, new is contig-27673-spa-t-S57_2b3234_5

What is the proper usage of CoverM in this case, with a two-part index?
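
The expectation described in the question (map each read against every shard, then keep the single best match) can be sketched roughly as follows. This only illustrates the behaviour being asked about and is not CoverM's actual implementation; the record layout (read name, reference name, alignment score), the sample values, and the function name `best_hit_per_read` are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical simplified alignment record: (read_name, reference_name, score).
# Real BAM records carry far more information; this is only for illustration.
Alignment = Tuple[str, str, int]


def best_hit_per_read(
    shards: Iterable[Iterable[Alignment]],
) -> Dict[str, Tuple[str, int]]:
    """Keep, for each read, the highest-scoring alignment seen in any shard.

    On a tie, the alignment from the earlier shard is kept.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for shard in shards:
        for read, ref, score in shard:
            current = best.get(read)
            if current is None or score > current[1]:
                best[read] = (ref, score)
    return best


# Made-up alignments against two index parts (stand-ins for part1 and part2).
part1 = [("read1", "contig-A", 140), ("read2", "contig-B", 90)]
part2 = [
    ("read1", "contig-C", 120),
    ("read2", "contig-D", 150),
    ("read3", "contig-D", 100),
]

best = best_hit_per_read([part1, part2])
for read in sorted(best):
    ref, score = best[read]
    print(read, ref, score)
```

Under this model each read is counted against exactly one reference sequence, regardless of which index part that sequence lives in, so the two parts would not need to share the same set of reference sequences.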
