just align (not global alignment) PE reads (without FLASH) to the amplicon sequence and quantify indel rate? #120

YichaoOU · 2021-08-05T17:30:58Z

Is your feature request related to a problem? Please describe.

I think it's quite common that FLASH can't really merge the PE reads, For example:

1. amplicon sequence too long or too short
1. read is noisy
  - need to trim reads because the 3'-end reads looks like AAAAAAAAAA or TTTTTTTTTTTTT
  - overall read quality is too low

Describe the solution you'd like
Just use BWA to align PE reads to the amplicon sequence and extract indel frequency?

Thanks,
Yichao

YichaoOU · 2021-08-05T17:33:03Z

BTW, I think crispresso can take bam as input, instead of fastq, but looking at the code, I think what the program really does is extracting the reads from the bam file and then do a global alignment again, right?

kclem · 2021-08-05T21:14:50Z

Hi Yichao,

Thanks for using CRISPResso -- we've been thinking about this as well, and recognize that flash is a potential source of errors.

You are correct that CRISPResso realigns every read even if provided a bam input. Our biologically-informed alignment is our special sauce that makes the quantification accurate, and we like to think it's a lot better than bwa or another aligner.

I think it's probably best to do trimming before CRISPResso or to use trimmomatic or the filtering within crispresso to get rid of the AAAAA or TTTTTT sequences as well as sequences that are the wrong length.

As we work on the flash-less alignment, do you have a case of reads and a reference that are particularly challenging? Just to make sure that there is nothing we are missing?

YichaoOU · 2021-08-05T21:20:23Z

Thanks for the quick reply! I don't have a very challenging dataset, I can just solve my problems by some data cleaning.

Looking forward to this FLASH-less version!

Thanks,
Yichao

eul2021 · 2021-09-23T17:31:18Z

Hi Kendell,
I have a ~400bp amplicon and ~210bp pair-end illumina amplicon sequencing. When I run CRISPResso on a unedited control ~100% of reads are filtered out. I tried various options from your past issues, but cannot find a solution to fix this. It's probably related to having long/unedited amplicons and reads merely overlapping. Is there are set of parameters to tweak FLASH or to omit running it all together? Thank you very much.

kclem · 2021-09-24T03:32:48Z

Try the parameter --force_merge_pairs. This is highly experimental and to be used with caution. R1 and R2 will be joined using Flash, and then if they cannot be joined, they will be pasted together. So a lot of apparent indels at the site of merging will be created.

eul2021 · 2021-09-28T17:53:10Z

Thank you, Kendell, I tried running with this parameter, but it returned error with the following:
"CRITICAL @ Tue, 28 Sep 2021 17:50:55:
Unexpected error, please check your input.
ERROR: 71 "
I was using 2.2.5 from last week, but let me know if there's a never one I should try.

Also, regarding the "force_merge_pairs" - I would imaging using this option or relaxing "min_pair_end_reads_overlap" to <10 could introduce some artefacts. Would you know what type of issues to watch out for? Thank you!

kclem · 2021-09-29T01:28:32Z

Sorry -- I just pushed a fix for the force_merge_pairs error. You can either wait for a next release in a week or two, or install from source.

The artifacts would be that if reads were merged, there could be deletions or insertions at the merge site.

For example, for:

ref: AATTCCGG
read1: AAT
read2: CGG
force-merged read:  AATCGG
Alignment: AAT--CGG
           AATTCCGG

In this case, there is an apparent deletion that is induced by the merging.

kclem added the enhancement New feature or request label Aug 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

just align (not global alignment) PE reads (without FLASH) to the amplicon sequence and quantify indel rate? #120

just align (not global alignment) PE reads (without FLASH) to the amplicon sequence and quantify indel rate? #120

YichaoOU commented Aug 5, 2021

YichaoOU commented Aug 5, 2021

kclem commented Aug 5, 2021

YichaoOU commented Aug 5, 2021

eul2021 commented Sep 23, 2021

kclem commented Sep 24, 2021

eul2021 commented Sep 28, 2021

kclem commented Sep 29, 2021

just align (not global alignment) PE reads (without FLASH) to the amplicon sequence and quantify indel rate? #120

just align (not global alignment) PE reads (without FLASH) to the amplicon sequence and quantify indel rate? #120

Comments

YichaoOU commented Aug 5, 2021

YichaoOU commented Aug 5, 2021

kclem commented Aug 5, 2021

YichaoOU commented Aug 5, 2021

eul2021 commented Sep 23, 2021

kclem commented Sep 24, 2021

eul2021 commented Sep 28, 2021

kclem commented Sep 29, 2021