Map variants down to chromosomal location + allele #15

Open
jhkbg opened this issue Nov 30, 2016 · 0 comments

jhkbg commented Nov 30, 2016

Just an idea for next steps: starting from an amino acid change, SETH could compute the underlying CDS and DNA change(s) and return them as well. This would make it much easier to integrate SETH output across millions of papers when searching for specific variants, since the same variant appears in several notations (BRAF V600E is the same as BRAF c.1799T>A, which is the same as chr7:140453136A>T).

In particular, this would be great for annotating VCFs with papers, for which we will need the exact allele(s). Note that a single amino acid change can arise from many different DNA changes. We should narrow the possibilities down to the most likely ones, for example by preferring SNVs over MNVs, and MNVs over complicated indels.
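To illustrate the ranking idea, here is a minimal sketch in Python that enumerates the codons encoding the new amino acid and orders them by how many bases differ from the reference codon, so that single-base changes (SNVs) come first. The function name `candidate_codon_changes` is hypothetical, the sketch uses the standard genetic code only, and it assumes the reference codon is already known; in practice that requires the transcript sequence, which is what a tool like transvar looks up. Indels are not covered.

```python
from itertools import product
from typing import List, Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def candidate_codon_changes(ref_codon: str, alt_aa: str) -> List[Tuple[str, int]]:
    """Return (alt_codon, number_of_changed_bases), fewest changes first.

    Hypothetical helper: ranks substitutions only (SNVs before MNVs).
    """
    ref_codon = ref_codon.upper()
    if ref_codon not in CODON_TABLE:
        raise ValueError(f"not a DNA codon: {ref_codon!r}")
    candidates = []
    for codon, aa in CODON_TABLE.items():
        if aa != alt_aa:
            continue
        # Hamming distance between the reference and candidate codon.
        n_diffs = sum(1 for a, b in zip(ref_codon, codon) if a != b)
        if n_diffs > 0:
            candidates.append((codon, n_diffs))
    return sorted(candidates, key=lambda item: (item[1], item[0]))


# BRAF codon 600 is GTG (Val); glutamate is encoded by GAA and GAG.
print(candidate_codon_changes("GTG", "E"))
```

For V600E this puts GAG (one changed base, the c.1799T>A SNV) ahead of GAA (two changed bases), which matches the preference order described above.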

We probably don't have to reinvent this; we can instead write wrappers around tools such as transvar (http://transvar.readthedocs.io) or Counsyl HGVS (https://github.com/counsyl/hgvs). I've had good experience with transvar. Both are written in Python, though.

I'll update here once I'm done wrapping transvar around SETH output. My current pipeline is:

1. run SETH NER and bulk import the results into a MySQL DB;
2. run SETH NEI with dbSNP 147 on MySQL;
3. export all amino acid variants (CDS variants still TBD) and annotate them with transvar;
4. load the annotations back into MySQL 😆
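Step 3 could look roughly like the following sketch, which builds a transvar command line for one exported protein variant and splits a combined coordinates field into its genomic, CDS, and protein parts. Both helper names are hypothetical. The `panno` subcommand, the `-i` option, and the `--refseq` flag follow the transvar documentation as far as I can tell, and the slash-separated coordinates layout is an assumption about its output; all of these should be checked against the installed version.

```python
from typing import Dict, List


def build_panno_command(gene: str, protein_change: str,
                        annotation_flag: str = "--refseq") -> List[str]:
    """Build the argument list for a protein-level reverse annotation.

    Assumption: transvar accepts identifiers of the form GENE:p.CHANGE.
    """
    identifier = f"{gene}:p.{protein_change}"
    return ["transvar", "panno", "-i", identifier, annotation_flag]


def split_coordinates(field: str) -> Dict[str, str]:
    """Split an assumed 'genomic/cds/protein' coordinates field."""
    parts = field.split("/")
    if len(parts) != 3:
        raise ValueError(f"unexpected coordinates field: {field!r}")
    genomic, cds, protein = parts
    return {"genomic": genomic, "cds": cds, "protein": protein}


# The argument list can be passed to subprocess.run(..., capture_output=True).
print(build_panno_command("BRAF", "V600E"))
print(split_coordinates("chr7:g.140453136A>T/c.1799T>A/p.V600E"))
```

Keeping the command as an argument list rather than a shell string avoids quoting problems when the exported gene names or variant strings contain unusual characters.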

@jhkbg jhkbg self-assigned this Dec 6, 2016