Steps on the road from RNAseq to variant protein sequence #3

iskandr · 2016-02-04T23:03:51Z

Across the handful of algorithms I've sketched out and discussed with @armish and @ryan-williams, there are broadly two steps:

1. Gather variant RNAseq reads: either use gapped alignment to the reference, or generate a set of phased candidate haplotype sequences around the variant (all combinations of nearby variants) and use k-mer lookups or an FM-index to find matching reads.
2. Assemble partial transcript sequence(s): in the "naive" version @armish is working on, we get a sequence by taking the most common nucleotide at each offset from the variant. We could also build an overlap graph of the filtered reads and assemble multiple candidate sequences. Each sequence should be accompanied by an abundance estimate.
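A rough sketch of the k-mer lookup and the "naive" consensus described above. This is not the code in this package: the function names, the `(read, offset)` representation, and the use of the supporting read count as a crude abundance estimate are all assumptions made for illustration. It also assumes a single candidate haplotype with the variant at `haplotype[start:end]`, no sequencing errors inside the matched k-mer, and no repeated k-mers around the variant.

```python
from collections import Counter


def gather_variant_reads(reads, haplotype, start, end, k):
    """Keep reads that share a variant-spanning k-mer with `haplotype`.

    The variant occupies haplotype[start:end]. Returns (read, offset) pairs,
    where `offset` is the index in the read that lines up with haplotype[start].
    """
    # Index only the k-mers that fully contain the variant.
    index = {}
    lo = max(0, end - k)
    hi = min(start, len(haplotype) - k)
    for i in range(lo, hi + 1):
        index.setdefault(haplotype[i:i + k], i)

    hits = []
    for read in reads:
        for j in range(len(read) - k + 1):
            i = index.get(read[j:j + k])
            if i is not None:
                hits.append((read, j + (start - i)))
                break  # one matching k-mer is enough to place the read
    return hits


def consensus_around_variant(hits):
    """Most common nucleotide at each offset relative to the variant.

    Returns (sequence, index of the variant in that sequence, read count).
    The read count is only a crude stand-in for an abundance estimate.
    """
    if not hits:
        return "", 0, 0
    columns = {}
    for read, offset in hits:
        for pos, base in enumerate(read):
            columns.setdefault(pos - offset, Counter())[base] += 1
    relative_positions = sorted(columns)
    sequence = "".join(
        columns[rel].most_common(1)[0][0] for rel in relative_positions
    )
    return sequence, -relative_positions[0], len(hits)
```

Because every kept read covers the variant, the relative offsets always form one contiguous run, so the consensus has no gaps. An overlap-graph assembler would replace `consensus_around_variant` and could return several candidate sequences instead of one.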

Third step (which may live in this package or in PGV):

To determine a partial protein sequence, the assembled sequence needs to be placed in a reading frame. The easiest way I can think of is to use the known reading frame of annotated transcripts that overlap the variant locus and match the assembled sequence before the variant. We can't expect a match after the variant because of exon truncation or intron retention (i.e. variant splicing).
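One way the frame lookup could work, as a minimal sketch rather than a settled design. It assumes the annotated transcript is available as a spliced cDNA string with a known start codon offset (`cds_start`), and that an exact match of the pre-variant prefix is good enough. The function name and the `min_prefix` threshold are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR" "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_in_transcript_frame(
    assembled, variant_index, transcript, cds_start, min_prefix=8
):
    """Translate `assembled` using the reading frame of `transcript`.

    Only the sequence before the variant is matched against the transcript.
    Returns the partial protein sequence, or None if the prefix is too short
    or does not occur in the transcript.
    """
    prefix = assembled[:variant_index]
    if len(prefix) < min_prefix:
        return None
    hit = transcript.find(prefix)
    if hit < 0:
        return None

    # Position of assembled[0] relative to the first base of the start codon.
    rel = hit - cds_start
    # Skip 5' UTR bases, or the remainder of a partially covered codon.
    skip = -rel if rel < 0 else (-rel) % 3
    coding = assembled[skip:]

    protein = []
    for i in range(0, len(coding) - 2, 3):
        aa = CODON_TABLE.get(coding[i:i + 3], "X")  # X for codons containing N
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

When several annotated transcripts match the prefix but disagree on the frame, the function would need to be called once per transcript and the distinct protein sequences kept side by side.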

iskandr · 2016-03-30T21:24:27Z

Abandoning full-blown assembly in favor of only doing local phasing within a single read length. I briefly discussed with @JPFinnigan the possibility of doing limited assembly of the sequence between mate pairs, but I'll leave that for future work.
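For reference, local phasing within a read can be as simple as grouping reads by the alleles they carry at the variant loci they cover. The sketch below assumes the reads have already been reduced to a mapping from read name to `{locus: allele}`; that input shape and the function name are hypothetical, not something this package defines.

```python
from collections import defaultdict


def phase_groups(read_alleles):
    """Group read names by the exact set of (locus, allele) pairs they carry."""
    groups = defaultdict(list)
    for name, alleles in read_alleles.items():
        key = tuple(sorted(alleles.items()))
        groups[key].append(name)
    return dict(groups)
```

Reads that cover only a subset of the loci end up in their own group here; merging them into compatible larger groups is a separate decision.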

iskandr closed this as completed Mar 30, 2016

iskandr mentioned this issue Dec 13, 2016

Make phasing of multiple variants on a read explicit (group reads into phase groups) #72

Open
