Releases: dariober/ASCIIGenome
v1.18.0
New in 1.18.0
- New awk functions getAlnEnd() and getAlnLen() to get the end and
length of a SAM alignment, respectively. Useful to filter for alignments
above/below a cutoff, especially handy with Nanopore reads. E.g.
awk 'getAlnLen() > 2000 && getAlnEnd() < 12345'
- Use the operating system's awk instead of the built-in Java Jawk. The OS's
awk appears to be 5-10x faster than Jawk. The flip side is that we assume
users have awk on their PATH.
- Add configuration parameter low_mapq to set what you consider as low
mapping quality. Default is 5, which was the setting hardcoded until now.
- Use false in config shade_structural_variant to omit shading of
structural variants.
- New command nextChrom moves to the next chromosome without the need of
typing its name. Useful to quickly flip through several chromosomes.
- Can read CRAM files (finally!)
- New command addHeader inserts one or more lines of text before a track.
Useful to add a header or legend-like text to groups of tracks.
- Accept bed/bedgraph with space as column separator (see old issue #12)
- print decodes URL-encoded characters to readable characters (e.g. it prints
a comma instead of %2C)
- Improved command show genome: Add option -n to limit the number of
contigs; sort by size; add percentage and cumulative percentage of genome
covered by each contig; add indicator of current contig. E.g.:
show genome
Genome size: 3095693983; Number of contigs: 25
chr1 249250621 |||||||||||||||||||||||||||||| 8.1%; 8.1%
chr2 243199373 ||||||||||||||||||||||||||||| 7.9%; 15.9%
chr3 198022430 |||||||||||||||||||||||| 6.4%; 22.3%
chr4 191154276 ||||||||||||||||||||||| 6.2%; 28.5%
chr5 180915260 |||||||||||||||||||||| 5.8%; 34.3%
chr6 171115067 ||||||||||||||||||||| 5.5%; 39.9%
chr7 159138663 ||||||||||||||||||| 5.1%; 45.0% <==
chrX 155270560 ||||||||||||||||||| 5.0%; 50.0%
chr8 146364022 |||||||||||||||||| 4.7%; 54.7%
chr9 141213431 ||||||||||||||||| 4.6%; 59.3%
chr10 135534747 |||||||||||||||| 4.4%; 63.7%
chr11 135006516 |||||||||||||||| 4.4%; 68.0%
chr12 133851895 |||||||||||||||| 4.3%; 72.4%
chr13 115169878 |||||||||||||| 3.7%; 76.1%
chr14 107349540 ||||||||||||| 3.5%; 79.5%
chr15 102531392 |||||||||||| 3.3%; 82.9%
chr16 90354753 ||||||||||| 2.9%; 85.8%
chr17 81195210 |||||||||| 2.6%; 88.4%
chr18 78077248 ||||||||| 2.5%; 90.9%
chr20 63025520 |||||||| 2.0%; 93.0%
chrY 59373566 ||||||| 1.9%; 94.9%
chr19 59128983 ||||||| 1.9%; 96.8%
chr22 51304566 |||||| 1.7%; 98.4%
chr21 48129895 |||||| 1.6%; 100.0%
chrM 16571 0.0%; 100.0%
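The percentage and cumulative percentage columns in the table above can be reproduced with a few lines of Python. This is only an illustrative sketch, not ASCIIGenome's implementation: the function name genome_summary, its n argument (loosely modelled on the -n option) and the toy contig sizes are all made up for the example, and whether the real -n computes percentages over the whole genome is an assumption.

```python
def genome_summary(contigs, n=None):
    """Return (name, size, percent, cumulative percent) rows, largest contig first.

    contigs: dict mapping contig name -> length.
    n: optional cap on the number of rows returned (hypothetical, mimics -n).
    Percentages are always relative to the whole genome, even when n is set.
    """
    total = sum(contigs.values())
    rows = []
    cumulative = 0
    for name, size in sorted(contigs.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += size
        rows.append((name, size, 100 * size / total, 100 * cumulative / total))
    return rows if n is None else rows[:n]


# Toy genome, not real contig sizes
toy = {"chrB": 300, "chrA": 500, "chrC": 200}
for name, size, pct, cum in genome_summary(toy):
    print(f"{name} {size} {pct:.1f}%; {cum:.1f}%")
```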
v1.17.0
New in 1.17.0
- print shows columns separated by a green | (more readable)
- BEDGRAPH format is an extension of BED. This means bedgraph lines can be printed with
print and filtered with grep & awk
- Add command bedToBedgraph to switch from BED to BEDGRAPH and vice versa
- Speed improvements to filtering with awk
- Command line argument --showMemTime replaces --showMem and --showTime
v1.16.0
New in 1.16.0
- Important: Reading TDF, bigWig, and bigBed from remote URL is no longer possible; local
files are ok. This is because the new API of htsjdk is incompatible with IGV
v2.6. Upgrading IGV is causing problems between theirs and our custom htsjdk.
- print -hl command can highlight by column position, e.g. print -hl '$3, $10'
- Can use CSI index for BAM files.
- featureColorForRegex, renamed to featureColor, now accepts as expression
a regex (as before) or an awk script. Awk is useful to color features
according to some numeric values. E.g., in a narrowPeak file you can
highlight features with qvalue > x: featureColor -r '$9 > 3' blue -r '$9 > 6' red
- Quoting: in addition to single quotes, command arguments can be delimited
by double quotes ", triple single quotes ''' or triple double
quotes """ (similar to Python). For example, grep records containing single
quotes: grep -i "'" or grep -i """'"""
- Add navigation commands [ and ] to move window by screen column.
- File path in track title is shown as relative to current working directory
and simplified.
- gffNameAttr can also rename bed features. It has been renamed to the more
comprehensive nameForFeatures. The name to display for bed features can be
assigned by passing to nameForFeatures the column index to use. This is
particularly useful to show metrics of interest in e.g. narrowPeak
files.
- Document the special flag 4096 in samtools command which selects for TOP
STRAND reads. Useful for stranded RNA-Seq and BS-Seq libraries.
- orderTracks can put selected tracks last. First select all tracks with e.g.
. and then list those you want last: orderTracks . #1 #2
- goto understands target region separated by spaces (issue #93). Useful to copy and
paste regions from tables or text files. E.g., goto chr7 10 200.
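To make the space-separated form of goto concrete, here is a small Python sketch that rewrites a region copied from a table into colon-and-dash notation. It is not ASCIIGenome code: the helper name normalise_region is hypothetical, and treating chrom:from-to as the canonical form is an assumption made for the example.

```python
def normalise_region(text):
    """Turn 'chr7 10 200' into 'chr7:10-200'.

    A region that is already a single token (e.g. 'chr7:10-200') is
    returned unchanged. Hypothetical helper, for illustration only.
    """
    tokens = text.split()
    if len(tokens) == 1:
        return tokens[0]
    if len(tokens) == 2:
        chrom, start = tokens
        int(start)  # raises ValueError if start is not an integer
        return f"{chrom}:{start}"
    if len(tokens) == 3:
        chrom, start, end = tokens
        if int(start) > int(end):
            raise ValueError(f"start after end in region: {text!r}")
        return f"{chrom}:{start}-{end}"
    raise ValueError(f"cannot interpret region: {text!r}")


print(normalise_region("chr7 10 200"))   # chr7:10-200
print(normalise_region("chr7\t10"))      # chr7:10
```

Splitting on any whitespace means tab-separated rows pasted from a spreadsheet are handled the same way as space-separated ones.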
v1.15.0
New in 1.15.0
- Change behaviour of bookmark command. The argument to bookmark the region can have the chromosome prefix <chr>: omitted.
In such case, the current chromosome is used. The command bookmark 100
will bookmark position 100 on the current chromosome while bookmark chr1
will fail instead of bookmarking the entire chromosome.
- Command help can be invoked also with ?my_cmd or help my_cmd
- Various issues fixed
v1.14.0
New in 1.14.0
Java version 1.8 is now required
- Update htsjdk to version 2.14. As before, htsjdk has been modified to be more lenient on input validation. See here and
build.gradle for the exact version loaded. IGV package also updated to 2.4.10.
- Fixed bug causing base quality shading to shift right with soft clipping.
- Fixed bug in checking latest version on repository.
- Fix reading configuration file from command line.
- Fix an off-by-one error in find command with indels.
- At least some issues running on Windows have been fixed (see issue #83).
v1.13.0
New in 1.13.0
- Commands INT and PERCENT accept the suffix c to put the position INT or PERCENT right at the center of the screen.
Followed by the command zi or zo, this is useful to quickly zoom in on a peak or variant of interest.
- The highlighting of the mid-character in read tracks can be turned off with
setConfig highlight_mid_char false.
- Fixed bug where Stopwatch in TrackProcessor was started when already running. This happened after an uncaught exception.
- Fixed bug in print -hl causing an out-of-bounds index error when accessing VCF samples with incomplete fields.
- New command reload updates the current view after a file has been modified. This is useful when you are
experimenting with files and you want to quickly see them updated in ASCIIGenome.
- Reads can be shown as > and < characters also at single base resolution via configuration key nucs_as_letters.
- Second-in-pair reads are shown underlined also when base pair resolution is greater than 1.
- Easier setting of configuration. The configuration key is fuzzy matched so it doesn't have to be spelt in full.
- print has option -esf to explain SAM flags.
- Fix minor bug: %r in save command is expanded to chr_from_to, consistent with save in print. Before, %r expanded to chr_from-to.
- Enable comments in command line with //. E.g. goto chr1 // A comment
Refactor
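One way to picture the fuzzy matching of configuration keys mentioned above is subsequence matching: a key matches if it contains the typed characters in order. The Python sketch below works under that assumption; the release notes do not say which matching rule ASCIIGenome uses, and the function match_key and the short KEYS list (key names taken from these notes) are only for illustration.

```python
def match_key(query, keys):
    """Return the single key matched by query, or raise ValueError.

    A key matches when all characters of query occur in it in the same
    order (subsequence match). This rule is an assumption for the sketch.
    """
    if query in keys:
        return query  # an exact match always wins

    def is_subsequence(q, key):
        remaining = iter(key)
        # 'ch in remaining' consumes the iterator up to the first hit
        return all(ch in remaining for ch in q)

    hits = [k for k in keys if is_subsequence(query.lower(), k.lower())]
    if len(hits) != 1:
        raise ValueError(f"{query!r} matches {len(hits)} keys: {hits}")
    return hits[0]


KEYS = ["low_mapq", "highlight_mid_char", "nucs_as_letters", "shade_structural_variant"]
print(match_key("mid_char", KEYS))  # highlight_mid_char
print(match_key("nucs", KEYS))      # nucs_as_letters
```

Refusing to guess when several keys match keeps a short query from silently changing the wrong setting.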
v1.12.0
Bug fixes
- Fixed bug where initialisation failed with VCF or SAM files with no records.
- Fixed bug causing (some) tracks to be processed even when their height was set to zero.
- Temporary files are written to the current dir by default and only as a fallback to the system's tmp directory. This is to reduce the risk of filling up the
/tmp/ partition, which usually is quite small.
- find explicitly informs the user when the searched pattern returns no matches.
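The temporary-file policy described above (current directory first, system tmp directory as fallback) can be sketched in Python as follows. ASCIIGenome itself is written in Java, so this is an analogy rather than its code, and the helper name open_temp_file is hypothetical.

```python
import os
import tempfile


def open_temp_file(suffix=".tmp"):
    """Create a temporary file in the current directory if possible.

    Falls back to the system temporary directory when the current
    directory cannot be written to. The file is deleted on close.
    """
    try:
        return tempfile.NamedTemporaryFile(dir=os.getcwd(), suffix=suffix)
    except OSError:
        # e.g. read-only working directory: use the default tmp location
        return tempfile.NamedTemporaryFile(suffix=suffix)


tmp = open_temp_file(suffix=".bam")
print(os.path.dirname(tmp.name))
tmp.close()
```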
New features
- Command filterVariantReads correctly interprets cigar operators = and X.
- Command filterVariantReads interprets intervals and offsets. E.g.
filterVariantReads -r 1000+/-10 or filterVariantReads -r 1000+10. NB In contrast to v1.11.0, select an interval using a colon: -r from:to. The minus -
sign will subtract the offset from the first position.
- Add -all option to filterVariantReads to retain all reads intersecting interval, not just the variant ones.
- awk includes a built-in function, get(...), to retrieve GFF, GTF, SAM or VCF attribute tags from the respective files.
- print rounds numbers to n decimal places via the -round option. In this way the printed lines are more readable.
- -clip mode in print gives more readable output. Long SEQ and QUAL fields in bam reads and long REF and ALT sequences in vcf are also clipped since typically you don't want to read long sequences and quality strings. Also, long strings like Oxford Nanopore or PacBio CIGAR strings are shortened. -full mode still returns the whole shebang which combined with print -sys 'cut ...' (or similar) gives readable output.
- -hl option in print can highlight matches to a regular expression, similar to / in vim or CTRL-F/CMD-F in many GUI programs. In addition, regexes matching a FORMAT tag in VCF records highlight the tag AND the corresponding values. This is useful to quickly scan a sample property across samples. For example, a regex matching AD highlights the AD format tag and its values in each sample.
- Command addTracks renamed to more conventional open. addTracks is still recognized as an alias.
- Invalid bedgraph records are silently skipped. This is to allow tables with NA or similar to be loaded.
- setGenome executed without arguments (tries to) load the last opened fasta file.
- find and grep now match in case insensitive mode by default. Use flag -c to enable case sensitivity. In addition, flag -F matches literal strings, not regex.
- New command explainSamFlag to quickly make sense of SAM flags. Similar to picard/explain-flags. Example:
[h] for help: explainSamFlag 99 173 3840
99 173 3840
X X . read paired
X . . read mapped in proper pair
. X . read unmapped
. X . mate unmapped
. . . read reverse strand
X X . mate reverse strand
X . . first in pair
. X . second in pair
. . X not primary alignment
. . X read fails platform/vendor quality checks
. . X read is PCR or optical duplicate
. . X supplementary alignment
- Present a suggestion when a misspelt command is issued. E.g.:
[h] for help: prnt
Unrecognized command: prnt
Maybe you mean print?
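The explainSamFlag table above follows directly from the bit layout of the SAM FLAG field. As a rough illustration (a Python sketch, not the ASCIIGenome implementation; the function name explain_flag is made up), each flag is tested against the twelve standard bits, using the same labels as the table:

```python
# Bit values of the SAM FLAG field, labelled as in the table above
SAM_FLAGS = [
    (1, "read paired"),
    (2, "read mapped in proper pair"),
    (4, "read unmapped"),
    (8, "mate unmapped"),
    (16, "read reverse strand"),
    (32, "mate reverse strand"),
    (64, "first in pair"),
    (128, "second in pair"),
    (256, "not primary alignment"),
    (512, "read fails platform/vendor quality checks"),
    (1024, "read is PCR or optical duplicate"),
    (2048, "supplementary alignment"),
]


def explain_flag(flag):
    """Return the labels of all bits set in a SAM flag."""
    return [label for bit, label in SAM_FLAGS if flag & bit]


for flag in (99, 173, 3840):
    print(flag, explain_flag(flag))
```

For instance, 99 = 1 + 2 + 32 + 64, which gives the four X marks in the first column of the table.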
v1.11.0
New in 1.11.0
Speed improvements
- Printing tracks of bam alignments. Depending on the file system, the improvement can be quite large, now taking milliseconds instead of seconds. Explanation: The library size of a bam file was recalculated each time the screen was refreshed. This can be very fast on some systems but on others it can take up to a few seconds.
- Following from previous point: library size is not calculated by default. This can make ASCIIGenome faster in loading bam files.
- Some speed improvement in processing BAM tracks. The improvement is more noticeable when loads of reads are processed. For example, a window spanning 85 kb and containing ~2 million reads takes ~35 sec in this version compared to ~1:30 min in v1.10.0.
- Pileup data is cached so that it doesn't need to be recalculated. This makes commands like f/b/ff/bb and zi much faster.
- Setting a reference fasta sequence via -fa or setGenome is faster due to lazy loading of reference sequence. I.e., a sequence is retrieved from file only if requested. The speed advantage may or may not be noticeable, depending on hardware/filesystem.
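The two ideas behind the last points, lazy loading and caching, can be illustrated with a toy Python class. This is a sketch under simplifying assumptions and not how ASCIIGenome reads references: the class name LazyReference and its file_reads counter are invented here, and the file is scanned linearly rather than through a FASTA index.

```python
import os
import tempfile


class LazyReference:
    """Read a sequence from a FASTA file only when it is first requested."""

    def __init__(self, fasta_path):
        self.fasta_path = fasta_path  # nothing is read at construction time
        self._cache = {}
        self.file_reads = 0           # counts how often the file is scanned

    def sequence(self, name):
        if name not in self._cache:   # cache miss: go to the file once
            self._cache[name] = self._read(name)
        return self._cache[name]

    def _read(self, name):
        self.file_reads += 1
        chunks = []
        found = False
        with open(self.fasta_path) as fh:
            for line in fh:
                if line.startswith(">"):
                    if found:
                        break         # reached the record after ours
                    header = line[1:].split()
                    found = bool(header) and header[0] == name
                elif found:
                    chunks.append(line.strip())
        if not found:
            raise KeyError(name)
        return "".join(chunks)


# Tiny demonstration with a throwaway FASTA file
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as fh:
    fh.write(">chr1 toy\nACGT\nAC\n>chr2\nGGCC\n")
ref = LazyReference(fh.name)
print(ref.file_reads)        # 0: creating the object reads nothing
print(ref.sequence("chr2"))  # GGCC
ref.sequence("chr2")         # served from the cache
print(ref.file_reads)        # 1
os.remove(fh.name)
```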
Additions & bug fixes
- New command filterVariantReads selects reads with a mismatch in a given reference position. Useful to inspect reads supporting alternate alleles. NB It cannot handle cigar operators = and X (fixed in v1.12).
- Insertions in reads are visible. The base preceding an insertion has fore/background colour inverted.
- Indel variants start from POS+1 if the first base of the variant equals the reference.
- grep applies also to bam tracks.
- Add -c option to next command. Useful for browsing small features such as SNV and indels.
- Fix bug where shaded base qualities were occasionally shifted.
- Re-established compatibility with Java 1.7. Release 1.10.0 was accidentally compiled for Java 1.8.
- Validation of VCF files is more relaxed. The original validation imposed by htsjdk is very stringent, causing files to be rejected for minor bugs.
- genotype matrix prints samples in the same order as in the VCF instead of using alphanumeric order.
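The POS+1 rule for indels reflects the VCF convention of writing an indel with a leading base shared by REF and ALT. A minimal Python sketch of the rule as stated above follows; the function name indel_display_start is hypothetical and the exact conditions ASCIIGenome applies may differ.

```python
def indel_display_start(pos, ref, alt):
    """Return the first position to draw for a VCF record (1-based POS).

    For indels whose REF and ALT share the same first base, that base is
    only padding, so the variant proper starts at POS + 1.
    """
    is_indel = len(ref) != len(alt)
    if is_indel and ref[:1].upper() == alt[:1].upper():
        return pos + 1
    return pos


print(indel_display_start(100, "A", "ATG"))  # insertion after base 100 -> 101
print(indel_display_start(100, "ACT", "A"))  # deletion of bases 101-102 -> 101
print(indel_display_start(100, "A", "G"))    # SNV stays at 100
```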
v1.10.0
New in 1.10.0
There are several additions in this release:
- VCF: better representation of structural variants. Previous versions had very limited support for SV.
- readsAsPairs command can show paired-end reads joined up.
- featureColorForRegex can set colour for features NOT matching a regex. Useful to dim features without
completely hiding them.
- awk recognises column headers for bam, vcf, gtf/gff, bed tracks, like $POS $START $FEATURE etc, similar to bioawk.
- A pair of single quotes ('') at the command prompt is understood as an empty argument.
- SAM and BAM files without index are now acceptable input. They are sorted and indexed to a temporary file and then loaded (this of course can take
a long time for large files).
- addTracks accepts a list of file indexes pointing to the list of recently opened files.
v1.9.0
New in 1.9.0
- New command featureColorForRegex sets colour of individual features. For example, to have UTRs shown in yellow and CDSs in green in a GTF track you could use: featureColorForRegex -r UTR yellow -r CDS green
- Add command PERCENT [PERCENT] to zoom into a region on the current window. E.g. .25 .5
moves to the second quarter of the current window. This command deprecates the suffix c in command INT [INT].
- On exit via q command the screen is cleared. This avoids leaving the screen in mixed colours.
- Fixed some minor bugs.
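The arithmetic behind PERCENT [PERCENT] can be sketched as follows. This is an illustration in Python only: the function name zoom_to_fraction is invented, and the rounding shown (truncation, with inclusive 1-based coordinates) is an assumption, since the notes do not specify how ASCIIGenome rounds.

```python
def zoom_to_fraction(start, end, frac_from, frac_to):
    """Map fractions of the window [start, end] to new inclusive coordinates."""
    if not 0 <= frac_from <= frac_to <= 1:
        raise ValueError("fractions must satisfy 0 <= from <= to <= 1")
    width = end - start + 1
    new_start = start + int(width * frac_from)
    new_end = start + int(width * frac_to) - 1
    # Guard against an empty window when both fractions coincide
    return new_start, max(new_start, new_end)


# '.25 .5' on a window spanning 1-1000 selects the second quarter
print(zoom_to_fraction(1, 1000, 0.25, 0.5))  # (251, 500)
```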