Mob-typer returns different results when plasmids were rotated. #85

Closed
ryotag opened this issue Apr 13, 2021 · 3 comments

Comments


ryotag commented Apr 13, 2021

Hi,
Thanks for developing this nice tool!

I got different results for the same plasmid using MOB-typer.
When I ran mob_typer --multi --infile KX912253_RC.fasta --out_file KX912253_RC.tsv, I got the results below (no plasmid replicons, relaxases, or MPF detected):

sample_id	num_contigs	size	gc	md5	rep_type(s)	rep_type_accession(s)	relaxase_type(s)	relaxase_type_accession(s)	mpf_type	mpf_type_accession(s)	orit_type(s)orit_accession(s)	predicted_mobility	mash_nearest_neighbor	mash_neighbor_distance	mash_neighbor_identification	primary_cluster_id	secondary_cluster_id	predicted_host_range_overall_rank	predicted_host_range_overall_name	observed_host_range_ncbi_rank	observed_host_range_ncbi_name	reported_host_range_lit_rank	reported_host_range_lit_name	associated_pmid(s)
FRI-2_plasmid_KX912253-RC_Enterobacter_asburiae_strain_H162620587_plasmid_pJF-587__complete_sequence.	1	108672	51.18429770318021	5bd1577e5eae2824bbb7eb4e9ed6c126	-	-	-	non-mobilizable	KX912253	0.0	Enterobacter asburiae	AA414	AI467	genus	Enterobacter	genus	Enterobacter	-	-	-

However, when I rotated the plasmid and ran mob_typer --multi --infile KX912253_RC_rotated.fasta --out_file KX912253_RC_rotated.tsv, I got the following results (now plasmid replicons, a relaxase, and mpf were detected):

sample_id	num_contigs	size	gc	md5	rep_type(s)	rep_type_accession(s)	relaxase_type(s)	relaxase_type_accession(s)	mpf_type	mpf_type_accession(s)	orit_type(s)orit_accession(s)	predicted_mobility	mash_nearest_neighbor	mash_neighbor_distance	mash_neighbor_identification	primary_cluster_id	secondary_cluster_id	predicted_host_range_overall_rank	predicted_host_range_overall_name	observed_host_range_ncbi_rank	observed_host_range_ncbi_name	reported_host_range_lit_rank	reported_host_range_lit_name	associated_pmid(s)
FRI-2_plasmid_KX912253-RC_concatenated	1	108672	51.18429770318021	4278a4e947e4d787148b957d35f4c27d	IncFII,IncR	CP019890_00139,000207__CP025517	MOBF	NC_014107_00160	MPF_F	NC_014107_00125,NC_014107_00126,NC_014107_00127,NC_009425_00108,NC_014107_00135,NC_014107_00139,NC_014107_00145,NC_014107_00146,NC_014107_00154,NC_014107_00155,NC_014107_00137,NC_014107_00159	-	-	conjugative	KX912253	0.0	Enterobacter asburiae	AA414	AI467	order	Enterobacterales	order	Enterobacterales	family	Enterobacteriaceae	20851899; 23711894

I expected the results might differ slightly after rotation, since rotation can recover a broken gene (i.e., a gene that spans the beginning and end of the contig becomes contiguous again).
However, the results are completely different in this case, and most genes used for typing appear to be intact before the rotation.
I've attached the FASTA files of the plasmid before and after rotation (the file extension is .txt because GitHub does not allow attachments with a .fasta extension).
KX912253_RC.txt
KX912253_RC_rotated.txt
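
To illustrate what "rotation" means here, the following is a minimal sketch on a toy circular sequence. The `rotate` helper, the toy plasmid, and the toy gene are all made up for illustration and are not taken from the attached files or from MOB-suite.

```python
def rotate(seq: str, offset: int) -> str:
    """Rotate a circular sequence so that position `offset` becomes the new start."""
    if not seq:
        return seq
    offset %= len(seq)
    return seq[offset:] + seq[:offset]


# Toy circular plasmid (hypothetical): the "gene" ATGCCC is split across the
# end and the start of the linear record.
plasmid = "CCCTTTTTTATG"
gene = "ATGCCC"

rotated = rotate(plasmid, 9)

print(gene in plasmid)  # False: the gene spans the junction of the linear record
print(rotated)          # ATGCCCTTTTTT
print(gene in rotated)  # True: rotation makes the gene contiguous
```

Rotation only changes where the linear record starts, so size and GC content stay the same while the md5 of the sequence changes, which matches the two reports above.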

If you have any ideas, please let me know.

Thank you,


ryotag commented May 4, 2021

It would be helpful if someone could tell me whether this issue is reproducible.
I'm using mob_typer 3.0.0 on my Mac (macOS High Sierra version 10.13.6).

Thank you,


ryotag commented Jun 22, 2021

I finally found the reason for this.
I changed the header of the fasta file from
>FRI-2_plasmid_KX912253-RC Enterobacter asburiae strain H162620587 plasmid pJF-587, complete sequence.
to
>FRI-2_plasmid_KX912253-RC
With this change, I got the correct results for the original, unrotated fasta file.
I think this is an important bug to fix, because MOB-typer fails to detect plasmid replicons/relaxases and returns wrong results for fasta files with certain headers.
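
For anyone who wants to apply the same header change to many files, here is a minimal sketch. It assumes that keeping only the first whitespace-delimited token of each header is what is wanted; `shorten_headers` is a hypothetical helper written for this example and is not part of MOB-suite. The sequence lines in the example are placeholders, not the real plasmid.

```python
def shorten_headers(fasta_text: str) -> str:
    """Keep only the first whitespace-delimited token of every FASTA header line."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            # An empty header stays as a bare ">" rather than raising an error.
            line = ">" + (fields[0] if fields else "")
        out.append(line)
    return "\n".join(out) + "\n"


# Header copied from this issue; the sequence lines are placeholders.
example = (
    ">FRI-2_plasmid_KX912253-RC Enterobacter asburiae strain H162620587 "
    "plasmid pJF-587, complete sequence.\n"
    "ACGT\n"
    "TTGA\n"
)

print(shorten_headers(example), end="")
```

Note that this can produce duplicate identifiers if two records share the same first token, so it is worth checking for that in multi-record files.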

@jrober84
Collaborator

There seem to be some issues with BLAST and the length of headers. I have implemented a fix in 3.1.0 where all sequences are renamed internally for all of the BLAST and search calls, and then reported back under their original sequence identifiers.
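
The general idea of renaming internally and mapping back can be sketched as follows. This is not the actual MOB-suite 3.1.0 code: the function names, the `seq_N` naming scheme, and the shape of the hit tuples are all assumptions made for illustration.

```python
def rename_records(records):
    """Give each (header, sequence) record a short temporary ID.

    Returns the renamed records and a map from temporary ID to original header.
    """
    id_map = {}
    renamed = []
    for i, (header, seq) in enumerate(records):
        tmp_id = f"seq_{i}"  # hypothetical naming scheme
        id_map[tmp_id] = header
        renamed.append((tmp_id, seq))
    return renamed, id_map


def restore_ids(hits, id_map):
    """Replace the temporary ID in the first field of each hit with the original header."""
    return [(id_map[tmp_id], *rest) for tmp_id, *rest in hits]


records = [
    ("FRI-2_plasmid_KX912253-RC Enterobacter asburiae strain H162620587 "
     "plasmid pJF-587, complete sequence.", "ACGT"),
]
renamed, id_map = rename_records(records)

# Placeholder search result keyed by the temporary ID (not real tool output).
hits = [("seq_0", "hit_A")]
print(restore_ids(hits, id_map))
```

Because the search tools only ever see the short temporary IDs, the length or content of the original header cannot affect the search, and the report can still show the identifier the user supplied.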
