File Formats
Please refer to the Configuration Files page for information about the format of FAVITES configuration files.
For robustness to future development, we designed a file format similar to an edge list that must be used for the input Contact Network. The first portion of the file is a list of nodes, and the second portion is a list of edges.
- "Node" lines have three tab-delimited sections:
  - NODE (i.e., just the string NODE)
  - This node's label
  - Attributes of this node as comma-separated values, or a period (i.e., '.') if this node has no attributes
- "Edge" lines have five tab-delimited sections:
  - EDGE (i.e., just the string EDGE)
  - The label of the node from which this edge leaves
  - The label of the node to which this edge goes
  - Attributes of this edge as comma-separated values, or a period (i.e., '.') if this edge has no attributes
  - d (directed) or u (undirected) to denote whether or not this edge is directed (i.e., u -> v vs. u <-> v)
- Lines beginning with the pound symbol (i.e., '#') and empty lines are ignored
Below is an example of this file format. Note that <TAB> refers to a single tab character (i.e., '\t').
#NODE<TAB>label<TAB>attributes (csv or .)
#EDGE<TAB>u<TAB>v<TAB>attributes (csv or .)<TAB>(d)irected or (u)ndirected
NODE<TAB>Bill<TAB>USA,Mexico
NODE<TAB>Eric<TAB>USA
NODE<TAB>Curt<TAB>.
EDGE<TAB>Bill<TAB>Eric<TAB>.<TAB>d
EDGE<TAB>Curt<TAB>Eric<TAB>Friends<TAB>u
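The rules above can be sketched as a small parser. This is a minimal illustration rather than FAVITES's own reader. The function name `parse_contact_network` and the returned data structures are assumptions made for this example, and the example data uses real tab characters in place of <TAB>.

```python
def parse_contact_network(lines):
    """Parse contact network lines into (nodes, edges).

    nodes: dict mapping node label -> list of attribute strings
    edges: list of (u, v, attributes, directed) tuples
    Both structures are choices made for this sketch only.
    """
    nodes = {}
    edges = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Empty lines and lines beginning with '#' are ignored
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split("\t")
        if parts[0] == "NODE":
            if len(parts) != 3:
                raise ValueError("NODE line needs 3 sections: %r" % line)
            _, label, attrs = parts
            # A period means "no attributes"
            nodes[label] = [] if attrs == "." else attrs.split(",")
        elif parts[0] == "EDGE":
            if len(parts) != 5:
                raise ValueError("EDGE line needs 5 sections: %r" % line)
            _, u, v, attrs, kind = parts
            if kind not in ("d", "u"):
                raise ValueError("Edge type must be 'd' or 'u': %r" % line)
            attr_list = [] if attrs == "." else attrs.split(",")
            edges.append((u, v, attr_list, kind == "d"))
        else:
            raise ValueError("Line must start with NODE or EDGE: %r" % line)
    return nodes, edges


example = "\n".join([
    "#NODE\tlabel\tattributes (csv or .)",
    "#EDGE\tu\tv\tattributes (csv or .)\t(d)irected or (u)ndirected",
    "NODE\tBill\tUSA,Mexico",
    "NODE\tEric\tUSA",
    "NODE\tCurt\t.",
    "EDGE\tBill\tEric\t.\td",
    "EDGE\tCurt\tEric\tFriends\tu",
])
nodes, edges = parse_contact_network(example.splitlines())
```

Here `nodes` maps each label to its attribute list (empty for Curt), and each entry of `edges` carries a boolean that is true for directed edges.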
Transmission networks output by FAVITES are in the standard edge list format. Each line represents a single edge via three tab-delimited attributes:
- The label of the node from which this edge leaves
- The label of the node to which this edge goes
- The time at which this transmission event occurred
Self-edges (i.e., same node in columns 1 and 2) denote removal of infection, either via recovery or death. Edges with None in column 1 denote seed infections (i.e., infections from outside the population).
Below is an example of this file format. Note that <TAB> refers to a single tab character (i.e., '\t').
None<TAB>Eric<TAB>0
Eric<TAB>Bill<TAB>1
Eric<TAB>Curt<TAB>2
Eric<TAB>Curt<TAB>3
Curt<TAB>Bill<TAB>4
Curt<TAB>Bill<TAB>5
Curt<TAB>Curt<TAB>6
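As a rough sketch of how such a file might be read, the snippet below classifies each line as a seed infection, a transmission, or a removal, following the conventions described above. The function name `parse_transmission_network`, the event labels, and the choice to parse times as floats are assumptions made for this example, not part of FAVITES.

```python
def parse_transmission_network(lines):
    """Return a list of (kind, u, v, time) tuples, one per edge.

    kind is "seed", "removal", or "transmission" (labels chosen
    for this sketch only).
    """
    events = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # assumption: tolerate blank lines
        u, v, t = line.split("\t")
        time = float(t)  # assumption: times are numeric
        if u == "None":
            kind = "seed"          # infection from outside the population
        elif u == v:
            kind = "removal"       # self-edge: recovery or death
        else:
            kind = "transmission"  # u infected v at this time
        events.append((kind, u, v, time))
    return events


example = "\n".join([
    "None\tEric\t0",
    "Eric\tBill\t1",
    "Eric\tCurt\t2",
    "Eric\tCurt\t3",
    "Curt\tBill\t4",
    "Curt\tBill\t5",
    "Curt\tCurt\t6",
])
events = parse_transmission_network(example.splitlines())
```

In the example, the first event is Eric's seed infection at time 0 and the last is Curt's removal at time 6. The five lines in between are ordinary transmissions.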
When FAVITES outputs viral lineages in the sequence files and in the phylogenetic trees, the identifiers are in the format viral_lineage|contact_network_node|time, e.g. N19|67|4.118017.
- viral_lineage: Each viral lineage in the simulation process has its own unique identifier for ease of identification
- contact_network_node: The contact network individual from which viral_lineage was sampled
- time: The time at which viral_lineage was sampled from contact_network_node
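An identifier in this format can be split into its three fields as sketched below. The names `LineageID` and `parse_lineage_id` are hypothetical, and the sketch assumes that none of the three fields contains a '|' character.

```python
from typing import NamedTuple


class LineageID(NamedTuple):
    viral_lineage: str
    contact_network_node: str
    time: float


def parse_lineage_id(identifier):
    """Split 'viral_lineage|contact_network_node|time' into its fields."""
    parts = identifier.split("|")
    if len(parts) != 3:
        raise ValueError("Expected 3 '|'-separated fields: %r" % identifier)
    lineage, node, time = parts
    # The node label is kept as a string, since labels need not be numeric
    return LineageID(lineage, node, float(time))


parsed = parse_lineage_id("N19|67|4.118017")
```

For the identifier above, `parsed.viral_lineage` is "N19", `parsed.contact_network_node` is "67", and `parsed.time` is the sample time as a float.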
Some modules may require that you pass in the desired seed nodes by file (the seed_file parameter). This file should contain only the seed node names, one per line. Below is an example of this file format.
Eric
Bill
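Reading such a file takes only a few lines. The sketch below uses a hypothetical helper `parse_seed_nodes`, and skipping blank lines and stripping surrounding whitespace are assumptions of this example rather than documented behavior.

```python
def parse_seed_nodes(lines):
    """Return the seed node names, one per non-empty line."""
    # Assumption: surrounding whitespace and blank lines are not meaningful
    return [line.strip() for line in lines if line.strip()]


seeds = parse_seed_nodes("Eric\nBill\n".splitlines())
```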
Sample times that can be used with FAVITES are given in a simple tab-delimited format. Each line represents a single sample time via two tab-delimited attributes:
- The label of the node to be sampled
- The sample time
Multiple sample times can be specified per person by having multiple lines for that person. Below is an example of this file format. Note that <TAB> refers to a single tab character (i.e., '\t').
Eric<TAB>1
Eric<TAB>2
Bill<TAB>3
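A sample-time file like the one above could be collected into a per-node list of times, as in the following sketch. The function name `parse_sample_times` is hypothetical, and parsing times as floats is an assumption of this example.

```python
from collections import defaultdict


def parse_sample_times(lines):
    """Return a dict mapping node label -> list of sample times."""
    times = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # assumption: tolerate blank lines
        node, t = line.split("\t")
        # Repeated lines for the same node accumulate in file order
        times[node].append(float(t))
    return dict(times)


sample_times = parse_sample_times("Eric\t1\nEric\t2\nBill\t3\n".splitlines())
```

Because repeated lines accumulate, Eric ends up with two sample times and Bill with one.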
Niema Moshiri & Siavash Mirarab 2016