[MIRROR] Virulence factors, other sequences of interest & Islands. For ST131 99 manuscript. SeqFindR formatted

BeatsonLab-MicrobialGenomics/VFDB

VFDB

This repository provides the sequences relating to virulence factors and regions of interest used in the publication "Global dissemination of a multidrug resistant Escherichia coli clone":

PETTY NK*, BEN ZAKOUR NL*, STANTON-COOK M, SKIPPINGTON E, TOTSIKA M, FORDE BM,
PHAN MD, MORIEL D, PETERS KM, DAVIES MR, ROGERS BA, DOUGAN G, RODRIGUEZ-BAÑO J,
PASCUAL A, PITOUT JDD, UPTON M, PATERSON DL, WALSH TR, SCHEMBRI MA^, BEATSON
SA^. Global dissemination of a multidrug resistant Escherichia coli clone.
Proc Natl Acad Sci USA (in press).

^CORRESPONDING AUTHORS *AUTHORS CONTRIBUTED EQUALLY

In addition, please see the repository providing draft assemblies for 99 Escherichia coli strains of sequence type 131 (ST131).

These sequences, the assemblies and the tool SeqFindR were used to generate Figures 3, S4 and S6 in the publication above. All sequences in this repository are in the SeqFindR database format.

Databases used in Global dissemination of a multidrug resistant Escherichia coli clone

The following explains how the databases map to the figures:

Figure       SeqFindR database
Figure 3     Islands_200bp_chunks.fa
Figure S4    plasmid_replicons.fa
Figure S6    Schembri_VFDB.fa

Islands_200bp_chunks.fa

The sequences are from Dataset S1 of Totsika et al., "Insights into a Multidrug Resistant Escherichia coli Pathogen of the Globally Disseminated ST131 Lineage: Genome Analysis and Virulence Mechanisms", DOI:10.1371/journal.pone.0026578. We chunked each element into 200 bp subsequences to keep the elements to scale and to accommodate our draft Illumina assemblies.

plasmid_replicons.fa

Sequences of 19 plasmid replicon types.

Schembri_VFDB.fa

This database is a concatenation of the following sub-databases: Autotransporters.fa (42), CU_fimbriae.fa (38), Iron_uptake.fa (15), Other_virulence_genes.fa (30), Toxins.fa (4) and UPEC_specific_genes.fa (125).

A CSV formatted file providing detailed information for each sequence is available here.
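The concatenation described above can be reproduced with a few lines of Python. This is a minimal sketch, not the procedure the authors used: the function name concatenate_fasta is hypothetical, and it assumes each sub-database is a plain FASTA file in which every record starts with a ">" header line. It also reports a per-file record count, which can be compared against the numbers in parentheses if those are sequence counts (an assumption).

```python
from pathlib import Path


def concatenate_fasta(paths, out_path):
    """Concatenate FASTA files into out_path.

    Returns a dict mapping each input file name to its number of records,
    counted as lines beginning with ">". (Hypothetical helper, for illustration.)
    """
    counts = {}
    with open(out_path, "w") as out:
        for path in paths:
            text = Path(path).read_text()
            counts[Path(path).name] = sum(
                1 for line in text.splitlines() if line.startswith(">")
            )
            out.write(text)
            # Make sure the next file's first header starts on its own line.
            if text and not text.endswith("\n"):
                out.write("\n")
    return counts
```

With the six sub-database files in the working directory, a call such as concatenate_fasta(["Autotransporters.fa", "CU_fimbriae.fa", "Iron_uptake.fa", "Other_virulence_genes.fa", "Toxins.fa", "UPEC_specific_genes.fa"], "combined.fa") would write the combined file and return the per-file counts. The output name combined.fa is only an example.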

Additional databases

Colicins_and_microcins.fa: Sequences for 21 representative Colicins and Microcins.

ST131_publishedPlasmids.fa and ST131_publishedPlasmids_500bp_chunked.fa: 7 published plasmids found associated with ST131 genomes. The chunked file contains the same plasmids split into 500 bp subsequences.

ST131_AllPlasmids.fa and ST131_AllPlasmids_500bp_chunked.fa: As above but includes pEC958.

fimB.fa: Wild type fimB and fimB insertion.

CTX-M.fa: 122 CTX-M-* genes extracted from the Lahey Hospital and Medical Centre database on 07/08/2013.

Antibiotics.fa and Antibiotics_EC958_CTX-M-15.fa: 30 short 70-mers for screening antibiotic profiles. In Antibiotics_EC958_CTX-M-15.fa we have switched out 70-ctx143, bla-CTX-M-1 for EC958_CTX-M-15.

Scripts

We provide chunk_mfa.py, which was used to chunk the Island elements into 200 bp subsequences.

chunk_mfa.py is used like this:

$ python chunk_mfa.py input.fa 200 > outname.fa
#
# where:
#   * input.fa is 1 or more fasta formatted sequences
#   * 200 can be any integer. This will be your sub sequence length
#   * outname.fa is the name you want your final output to be.
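To illustrate what chunking a multi-FASTA file involves, here is a minimal, self-contained sketch. It is not the source of chunk_mfa.py: the function names read_fasta and chunk_records, the "_chunkN" header suffix, and the choice to keep a final chunk shorter than the requested length are all assumptions made for this example. The real script may name and handle chunks differently.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, parts = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def chunk_records(records, size):
    """Split each sequence into consecutive subsequences of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for header, seq in records:
        for index, start in enumerate(range(0, len(seq), size), start=1):
            # Assumed naming scheme: original header plus a chunk counter.
            yield f"{header}_chunk{index}", seq[start:start + size]


# Tiny in-memory example; a chunk length of 4 keeps the output readable.
example = io.StringIO(">islandA\nACGTACGTAC\n>islandB\nGGCC\n")
chunks = list(chunk_records(read_fasta(example), 4))
for name, seq in chunks:
    print(f">{name}\n{seq}")
```

In this sketch the 10-base islandA record becomes three chunks (4, 4 and 2 bases) and the 4-base islandB record becomes one. Passing 200 as the size would correspond to the chunk length used for Islands_200bp_chunks.fa.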
