quality and sequence mismatch #10

Closed
masen1991 opened this issue Jun 22, 2021 · 9 comments

Comments

@masen1991

Hello @eldariont,
do you have any opinion on this bug?
SVIM_210622_110108.log

Thanks

@jakob-he
Collaborator

Hi @masen0407,

Thanks a lot for approaching us about this!
The error occurred due to an incorrect setting of quality scores but should be fixed with the latest commit.

Best,
Jakob

@eldariont
Owner

Hi @masen0407,

one other thing I noticed in your log file is the name of your input BAM file: PacBio_CCS_15kb/HG002.pb.15kb.minimap2.hs37d5.sorted.bam

Just to make sure: Is this a genome-genome alignment or an alignment of reads? In the second case, you should use SVIM instead of svim-asm.

Cheers,
David

@yekaizhou

Hi, is this bug fixed? I encountered the same error.

@jakob-he
Collaborator

Hi @yekaizhou,

the bug should be fixed with the latest commit. However, it is not part of the current bioconda release.
So for it to take effect, you have to update SVIM-asm via git clone and pip (second install option in the Readme).

Best,
Jakob

@yekaizhou

Hi Jakob,

Thanks for your help.

However, it seems that bioconda has the latest version (1.0.2), the same as on GitHub, and the GitHub version, which I have also tried, still gives the same error.

I wish to get a phased SV call set from read alignments. It seems that SVIM-asm offers this functionality while SVIM does not. I am wondering whether the asm version can process read alignments in the same way as SVIM while also phasing the SVs. Is it unable to do this (as the error suggests), or can it generate a phased SV call set that is perhaps not very accurate?

Best,
Yekai

@jakob-he
Collaborator

Hi Yekai,

the release version is the same on bioconda and github but the release doesn't include the latest commit.
It sounds like you have tried it with the current github repository, so the error is likely caused by a different issue.
Would you mind uploading the full error message, or is it exactly the same as in the original issue?

Generally, calling phased SVs from read alignments using SVIM-asm is unfortunately not possible.
SVIM-asm is only able to call phased SVs if you provide haplotype assemblies.
This is due to the identification of SV candidates which assumes the input is an assembly represented by a "single read".
Unfortunately, I am also not aware of any SV-caller that has this functionality for an unphased set of reads.
In most cases, the reads are first phased using SNV calls and then each haplotype is assembled separately and compared to the reference genome.

Best,
Jakob

@yekaizhou

Hi Jakob,

Sorry for my unclear description. I actually fed SVIM-asm with phased reads generated by SNV calling and WhatsHap read haplotagging. The read depth of my data is low, so the haplotype assembly is not very satisfactory. Therefore, I am trying to find out whether SVs can be called and phased directly from the phased reads.

Thanks a lot for your help!
Yekai
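
For readers following the workflow discussed here (phase the reads first, then handle each haplotype separately), the grouping step might look roughly like the sketch below. It assumes each read carries an `HP` tag as written by WhatsHap haplotagging; the `(name, tags)` records and the `split_by_haplotype` helper are hypothetical stand-ins for illustration, not part of SVIM-asm or WhatsHap.

```python
from collections import defaultdict


def split_by_haplotype(reads):
    """Group read names by their HP tag; untagged reads go under None."""
    groups = defaultdict(list)
    for name, tags in reads:
        groups[tags.get("HP")].append(name)
    return dict(groups)


# Toy records standing in for haplotagged alignments (hypothetical data).
reads = [
    ("read1", {"HP": 1, "PS": 1000}),
    ("read2", {"HP": 2, "PS": 1000}),
    ("read3", {}),  # a read that could not be assigned to a haplotype
    ("read4", {"HP": 1, "PS": 1000}),
]
groups = split_by_haplotype(reads)
```

Each group could then be assembled on its own and the resulting haplotype assemblies aligned to the reference. Reads under `None` have no haplotype assignment and need a separate decision.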

@mtva0001

Hi!

We have the same issue using the latest version (1.0.2):

(svim-asm) b-an01 [/proj/nobackup/snic2022-6-27/Kesava/Mutation]$ svim-asm haploid . Week29PCG1_sorted.bam W0barcode5consensus.FASTA.fasta
2022-09-16 15:35:18,765 [INFO ] ****************** Start SVIM-asm, version 1.0.2 ******************
2022-09-16 15:35:18,768 [INFO ] CMD: python3 /pfs/stor10/users/home/k/kesava03/Public/svim-asm/bin/svim-asm haploid . Week29PCG1_sorted.bam W0barcode5consensus.FASTA.fasta
2022-09-16 15:35:18,768 [INFO ] WORKING DIR: /pfs/proj/nobackup/fs/projnb10/snic2022-6-27/Kesava/Mutation
2022-09-16 15:35:18,768 [INFO ] PARAMETER: sub, VALUE: haploid
2022-09-16 15:35:18,768 [INFO ] PARAMETER: working_dir, VALUE: /pfs/proj/nobackup/fs/projnb10/snic2022-6-27/Kesava/Mutation
2022-09-16 15:35:18,768 [INFO ] PARAMETER: bam_file, VALUE: Week29PCG1_sorted.bam
2022-09-16 15:35:18,768 [INFO ] PARAMETER: genome, VALUE: W0barcode5consensus.FASTA.fasta
2022-09-16 15:35:18,768 [INFO ] PARAMETER: verbose, VALUE: False
2022-09-16 15:35:18,768 [INFO ] PARAMETER: min_mapq, VALUE: 20
2022-09-16 15:35:18,768 [INFO ] PARAMETER: min_sv_size, VALUE: 40
2022-09-16 15:35:18,768 [INFO ] PARAMETER: max_sv_size, VALUE: 100000
2022-09-16 15:35:18,768 [INFO ] PARAMETER: query_gap_tolerance, VALUE: 50
2022-09-16 15:35:18,768 [INFO ] PARAMETER: query_overlap_tolerance, VALUE: 50
2022-09-16 15:35:18,769 [INFO ] PARAMETER: reference_gap_tolerance, VALUE: 50
2022-09-16 15:35:18,769 [INFO ] PARAMETER: reference_overlap_tolerance, VALUE: 50
2022-09-16 15:35:18,769 [INFO ] PARAMETER: sample, VALUE: Sample
2022-09-16 15:35:18,769 [INFO ] PARAMETER: types, VALUE: DEL,INS,INV,DUP:TANDEM,DUP:INT,BND
2022-09-16 15:35:18,769 [INFO ] PARAMETER: symbolic_alleles, VALUE: False
2022-09-16 15:35:18,769 [INFO ] PARAMETER: tandem_duplications_as_insertions, VALUE: False
2022-09-16 15:35:18,769 [INFO ] PARAMETER: interspersed_duplications_as_insertions, VALUE: False
2022-09-16 15:35:18,769 [INFO ] PARAMETER: query_names, VALUE: False
2022-09-16 15:35:18,769 [INFO ] ****************** STEP 1: COLLECT ******************
2022-09-16 15:35:18,769 [INFO ] MODE: haploid
2022-09-16 15:35:18,769 [INFO ] INPUT: /pfs/proj/nobackup/fs/projnb10/snic2022-6-27/Kesava/Mutation/Week29PCG1_sorted.bam
2022-09-16 15:35:18,838 [INFO ] Processing chromosome utg000001l...
2022-09-16 15:35:18,866 [ERROR ] quality and sequence mismatch: 16427 != 0
Traceback (most recent call last):
  File "/pfs/stor10/users/home/k/kesava03/Public/svim-asm/bin/svim-asm", line 183, in <module>
    sys.exit(main())
  File "/pfs/stor10/users/home/k/kesava03/Public/svim-asm/bin/svim-asm", line 74, in main
    sv_candidates = analyze_alignment_file_coordsorted(aln_file1, options)
  File "/pfs/stor10/users/home/k/kesava03/Public/svim-asm/lib/python3.8/site-packages/svim_asm/SVIM_COLLECT.py", line 72, in analyze_alignment_file_coordsorted
    supplementary_alignments = retrieve_other_alignments(current_alignment, bam)
  File "/pfs/stor10/users/home/k/kesava03/Public/svim-asm/lib/python3.8/site-packages/svim_asm/SVIM_COLLECT.py", line 50, in retrieve_other_alignments
    a.query_qualities = main_alignment.query_qualities
  File "pysam/libcalignedsegment.pyx", line 1514, in pysam.libcalignedsegment.AlignedSegment.query_qualities.set
ValueError: quality and sequence mismatch: 16427 != 0

The fasta file is a genome assembly.
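
To illustrate what the traceback appears to show, here is a minimal sketch that reproduces the same kind of error without pysam. It reads the message as 16427 quality values being assigned to a record whose sequence has length 0. The `Segment` class and `copy_read_data` helper are hypothetical stand-ins, and the ordering shown (sequence first, then qualities) is only one plausible way to avoid the mismatch, not necessarily the fix that was committed to SVIM-asm.

```python
class Segment:
    """Tiny stand-in for an alignment record (not pysam's AlignedSegment)."""

    def __init__(self, sequence=""):
        self.sequence = sequence
        self._qualities = None

    @property
    def qualities(self):
        return self._qualities

    @qualities.setter
    def qualities(self, values):
        # Reject quality arrays whose length differs from the sequence length.
        if values is not None and len(values) != len(self.sequence):
            raise ValueError(
                "quality and sequence mismatch: %d != %d"
                % (len(values), len(self.sequence))
            )
        self._qualities = values


def copy_read_data(main, other):
    """Copy the sequence first, then the qualities, so the lengths agree."""
    other.sequence = main.sequence
    other.qualities = main.qualities
    return other


main = Segment("ACGT")
main.qualities = [30, 30, 30, 30]

# Assigning qualities to a record with an empty sequence fails.
broken = Segment()
try:
    broken.qualities = main.qualities
    error = None
except ValueError as exc:
    error = str(exc)

# Copying the sequence before the qualities succeeds.
fixed = copy_read_data(main, Segment())
```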

@eldariont
Owner

Hi mtva0001,

as Jakob wrote above, the previous release (v1.0.2) did not include the bug fix. Just now, I created a new release (v1.0.3) and uploaded it to PyPI (bioconda following soon). Could you please use this version and report back whether it fixes your problem?

Best
David
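
A quick way to check whether an environment already has a release that contains the fix is sketched below. It assumes the distribution is registered under the name `svim-asm` and that versions are plain dotted numbers such as `1.0.3`; the helper names are hypothetical. A git-clone install of a commit made after v1.0.2 may still report 1.0.2 even though it contains the fix, so treat the result as a hint only.

```python
from importlib import metadata


def parse_version(text):
    """Turn a plain dotted version such as '1.0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_fix(installed, minimum="1.0.3"):
    """Return True if the installed version is at least the fixed release."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(dist_name="svim-asm"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("svim-asm does not appear to be installed in this environment")
else:
    print(version, "contains fix:", has_fix(version))
```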
