-
Notifications
You must be signed in to change notification settings - Fork 6
/
CHANGELOG
110 lines (110 loc) · 5.73 KB
/
CHANGELOG
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
Version 1.0.10-11:
- Added: now printing a summary log with the importat parameters of the simulation.
- Added: random seed option in the general block.
- Added: the general random seed allows to automatically set up the indelible's control random seed when using inputmode in [1,2,3].
- Added: use cases for variant recovery and tree inference.
- Added: for input mode 2 and 3, where a sequence (anchor or ancestral) is introduced, there's a base frequency calculation to input into indelible, in case it is not explicit in the indelible-ngsphy control file.
- Update: on re-rooting. Now the tree is re-rooted on a dichotomy, with the outgroup branch length 0.
- Update: on NGS mode == readcounts, the output VCF file, renaming the ID for each record of the file.
- Update: now NGSphy produces folder with more descriptive names.
- Update: now possible to decide within [ngs-reads-art] options, whether coverage is per amplicon (fcov) or the total number of reads (rcount).
- Fixed: coverage calculation for diploid individuals updated.
- Fixed: coverage distribution parameterization.
- Fixed: a bug related to coverage computations. The indexes are now within the ranges.
- Fixed: a bug which modified the sequences generated from INDELible with the anchor sequence. This resulted in adding the wrong line
of the anchor file to the sequence dictionary, altering the resulting individual and generating a void FASTQ file.
- Fixed: (read counts) There was a problem in the output VCF file for read counts. Header was incorrect and CHROM id was related to the replicate/locus info and not to the name (description) of the reference sequence. File runs properly in tools like `vcftools` and CHROM field is now the description of the reference allele sequence.
- Fixed: (read counts) There was not an initialization in some NumPy `chararray`s and this was generating weird symbols, which didn't properly generate the data that was written
down in the VCF files. This was a particular situation for `ploidy == 2`. This has now been fixed.
Version 1.0.5:
- trying to fix the gene-tree distribution input mode. Basepath was not being assigned.
Version 1.0.4:
bug fixes:
- while using gene-tree distributions, output and input folders where mixed and readings were being made incorrectly. Now fixed.
Version 1.0.3:
bug fixes:
- runART option was not retrieved from the settings, so automatically it was not running ART. now fixed.
- runs correctly now for input mode 4.
- coverage matrix floating points now 3 (before 4)
- when handling ancestral sequence file, the access to the sequence was wrong.
- some variable names were misspelled.
Version 1.0.2
bug fixes:
- checking existence of INDELible control file: path generated was badly formatted.
- checking branch lengths of the trees now works fine.
- the streams to check the dependencies of the third-party software now work fine.
- for the sequence generation, I was using a wrong variable name for settings, now works fine
- dendropy depency was missing in some files
- removed the extra folder generated in alignments
- error writing VCF files, all files were written in the no_error folder. fixed.
Version 1.0.1
pypi registration
Version 1.0.0
structured as packaged and github push
Version 0.2.2
log file only available when using debug log level. running time files
outputted only as asked.
Version 0.2.1
added new input mode.
Version 0.2.0
getting ready version. Renamed output folders.
Version 0.1.9
In-line documentation
Version 0.1.8
package organization
Version 0.1.7
Added coverage distributions
Version 0.1.6
Threading added for read-counts
Version 0.1.5
Threading added for ART
Version 0.1.4
Changed name repository to ngsphy
Version 0.1.3
Fully functioning. Allows the simulation of diploid individuals and posterior
NGS reads from a SimPhy Project folder. Also, log output (standard output) has some color on
for the different log call types.
Version 0.1.2
Changed name repository to simphy-ngs-wrapper
Version 0.1.1
Changed functionality of the script. It now takes a single prefix and a project
folder, iterates over all the species trees, takes the fasta files with the
selected prefix and mates those files outputting them into a folder
inside the project folder called INDIVIDUALS
Version 0.1.0
Generated "individuals" go now into a "individuals" folder, organized by
Species tree ID and Gene Tree ID, all files in a single folder is at all
manageable.
Reformatted the names of the generated files.
Mating file "db", now has an extra field. This field identifies the individual.
Single output folder for all the sequences.
All individuals have a specific name which allows not to get confused.
Version 0.0.9
Getting information about the species tree that have even/odd number
of individuals per species
Version 0.0.8
It is possible to write different number of output according to
preferences.
Version 0.0.7
Run is no longer finished if a replicate has no sequences
it will just skip it.
Version 0.0.6
Enhanced logger
Version 0.0.5
Added logging handler
Version 0.0.4
Multiple prefixes for simphy datasets
Automatically defining the format of the numbers
Version 0.0.3
Mating running, missing tests on larger group.
Formatter for bash added
outgroups are used with a single sequence
Version 0.0.2
Still issues with the outgroups.
Mating done
Added error coloring
I'm going to have problems when replicates are above 9999
Version 0.0.1
How to simulate 2 outgroups - is it better with outgroups or without outgroups.
SO - What's the meaning of this parameter in terms of outgroups
I'll omit the outgroups for this version