Wrong prediction of Shigella Boydii serotype 20 #6
Comments
Thank you, Roxanne, for your email. I’ll look at the script. It was able to identify S. boydii 20 in our test.
Note that for a gene to be included in the typing scheme, a minimum of 20% gene coverage is required. I may come back to you to request the raw data for your strain so that we can find a definitive explanation.
Thanks again for helping us improve our script.
Yun
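To illustrate the 20% gene-coverage requirement mentioned above, here is a minimal sketch. It is not taken from the ShigaTyper source: the function names, the input shape, and the definition of coverage as covered bases divided by gene length are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# The 20% minimum stated in the comment above; everything else is hypothetical.
MIN_COVERAGE = 0.20


def gene_coverage(covered_bases: int, gene_length: int) -> float:
    """Fraction of a reference gene covered by reads (assumed definition)."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    return covered_bases / gene_length


def passing_genes(
    hits: Dict[str, Tuple[int, int]], min_coverage: float = MIN_COVERAGE
) -> List[str]:
    """Return the names of genes whose coverage meets the minimum, sorted.

    `hits` maps a gene name to (covered_bases, gene_length).
    """
    return sorted(
        gene
        for gene, (covered, length) in hits.items()
        if gene_coverage(covered, length) >= min_coverage
    )


if __name__ == "__main__":
    # Placeholder numbers, not real results from any sample.
    example = {"geneA": (400, 1626), "geneB": (100, 780)}
    print(passing_genes(example))
```

Under this sketch, a gene that is present in the sample but covered below the threshold would be dropped before any serotype decision is made, which is one possible reason a visible hit might not be reported.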
On Jan 25, 2022, at 11:46 AM, Roxanne Wolthuis wrote:
Hi,
I am using the ShigaTyper tool to analyze multiple Shigella subtypes. One of the subtypes I am interested in is Shigella boydii serotype 20. I noticed that if there is a heparinase hit, the tool is supposed to return Shigella boydii serotype 20 (line 591).
We looked at the samples manually and matched the gene sequence to the sample sequence. As expected, the gene is present in the samples, but the ShigaTyper script does not seem to recognize these hits and instead identifies the samples as Shigella boydii serotype 1.
Other users may get the same wrong prediction, so I was wondering whether there is an explanation for this and whether it can be fixed.
Looking forward to a response!
Kind regards,
Roxanne
Hi Yun, We used some public samples with accession numbers SRR3020611 & SRR5330512 (ENA). For these samples we don't find results on the Heparinase gene. Hope this could help explain the issue! Roxanne
Hi Roxanne,
That’s very useful information. SRR3020611 was the founding strain used to develop the script in Jupyter (IPython). If the script in Bioconda doesn’t recognize the heparinase gene, some files must have been corrupted when we converted them. I’ll look at it.
Yun
On Jan 27, 2022, at 9:06 AM, Roxanne Wolthuis wrote:
Hi Yun,
We used some public samples with accession numbers SRR3020611 & SRR5330512 (ENA). For these samples we don't find results on the Heparinase gene.
Hope this could help explain the issue!
Roxanne
Should a reference gene for If so there isn't one, and likely the cause of this issue
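A quick way to test for a missing reference gene is to list the record names in the reference FASTA and compare them with the gene names the typing logic expects. The sketch below is only an illustration: the function names and the example record names are hypothetical, and it does not assume anything about how ShigaTyper stores or names its reference sequences.

```python
from typing import Iterable, List, Set


def fasta_record_ids(fasta_text: str) -> Set[str]:
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.add(header.split()[0])
    return ids


def missing_references(fasta_text: str, required: Iterable[str]) -> List[str]:
    """Return the required gene names that have no record in the FASTA text."""
    return sorted(set(required) - fasta_record_ids(fasta_text))


if __name__ == "__main__":
    # Toy reference with made-up records; "heparinase" is deliberately absent.
    reference = ">geneA example description\nACGT\n>geneB\nTTGA\n"
    print(missing_references(reference, ["geneA", "heparinase"]))
```

If a gene that the decision logic tests for never appears in the reference set, no hit for it can ever be reported, regardless of what is in the sample.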
Did some testing. Without
added this sequence for heparinase (https://www.ncbi.nlm.nih.gov/nuccore/CP016036.1?from=2803&to=4428&report=fasta&strand=2) to the
Haha final comment. Comparing the genes in
Of these I think only
Hi @wolthuisr This should be fixed in v2 of ShigaTyper. Cheers
I see that the current version of ShigaTyper does not include Shiga toxins and enterotoxins, unlike the later version I included in the paper. This is primarily because the output I originally envisioned, using an IPython/Jupyter notebook, is different from what most people prefer in a server environment. So the current ShigaTyper only gives you a single output: a serotype. (And we debated whether we should include heparinase for S. boydii 20 in the paper or in another paper.) I am not as code-savvy as the CFSAN guys or most people on GitHub. Please let me know how helpful or informative it would be if the script output included another column for the toxins identified.
Hi,
I am using the ShigaTyper tool to analyze multiple Shigella subtypes. One of the subtypes I am interested in is Shigella boydii serotype 20. I noticed that if there is a heparinase hit, the tool is supposed to return Shigella boydii serotype 20 (line 591).
We looked at the samples manually and matched the gene sequence to the sample sequence. As expected, the gene is present in the samples, but the ShigaTyper script does not seem to recognize these hits and instead identifies the samples as Shigella boydii serotype 1.
Other users may get the same wrong prediction, so I was wondering whether there is an explanation for this and whether it can be fixed.
Looking forward to a response!
Kind regards,
Roxanne
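For readers following along, the behaviour described in this report can be sketched as a small decision rule. This is not the code at line 591 of the script. The function name and the gene label "heparinase" as a dictionary-style key are assumptions, and the rule that a heparinase hit turns a serotype 1 call into serotype 20 is inferred from this report rather than confirmed from the source.

```python
from typing import Set

SEROTYPE_1 = "Shigella boydii serotype 1"
SEROTYPE_20 = "Shigella boydii serotype 20"


def refine_boydii_call(base_call: str, gene_hits: Set[str]) -> str:
    """Upgrade a serotype 1 call to serotype 20 when heparinase is detected.

    `base_call` is the serotype chosen before considering heparinase, and
    `gene_hits` is the set of gene names that passed filtering. Both are
    hypothetical inputs for this sketch.
    """
    if base_call == SEROTYPE_1 and "heparinase" in gene_hits:
        return SEROTYPE_20
    return base_call


if __name__ == "__main__":
    print(refine_boydii_call(SEROTYPE_1, {"heparinase"}))
    print(refine_boydii_call(SEROTYPE_1, set()))
```

With a rule of this shape, a sample that truly carries heparinase is still reported as serotype 1 whenever the heparinase hit never reaches `gene_hits`, which matches the symptom described above.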