General metrics
The General Metrics facet reports general statistics about the records contained within the file. The report is delivered under the `general` key within the `results.json` file. You can examine the output of the general facet by using `jq`:
cat results.json | jq .general
This facet has the following top-level keys:
| Key | Description |
|---|---|
| `records` | Metrics regarding record counts, including the total number of records, unmapped records, duplicate records, the designation of records (primary, secondary, supplementary), how many paired records exist, how many read one and read two records exist, how many records are properly paired, how many singleton records exist, and how many records have a mate mapped to a different sequence (both unfiltered and high-quality). |
| `cigar` | Metrics regarding the pileups of Cigar operations for both read ones and read twos. |
| `summary` | Contains summary metrics for this facet, including the duplicate record percentage, the unmapped record percentage, and the percentage of records whose mate is mapped to another sequence (both unfiltered and high-quality). |
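For programmatic access, the same keys can be read with a few lines of Python. This is a minimal sketch that assumes `results.json` is a JSON object with a top-level `general` key, as described above; the helper name `load_general_facet` is hypothetical and not part of the tool.

```python
import json
from pathlib import Path


def load_general_facet(path):
    """Return the `general` object from a results file (hypothetical helper)."""
    with Path(path).open() as handle:
        results = json.load(handle)
    # Assumption: the facet lives under the top-level "general" key.
    return results["general"]
```

Calling `load_general_facet("results.json")` should return a dictionary whose keys are expected to be `records`, `cigar`, and `summary`.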
This section of the general metrics comprises multiple general counting metrics regarding records. Many of these counts are produced by iterating over the records and counting those with particular flags set. This is similar to the functionality you would get with a `samtools flagstat` command. The current set of record metrics collected includes:

- Total (`total`). The total number of records within the file.
- Unmapped (`unmapped`). The total number of records marked as unmapped (`0x4`) within the file.
- Duplicate (`duplicate`). The number of records marked as duplicate (`0x400`) within the file.
- Designation (`designation`). The number of primary, secondary, and supplementary records in the file, respectively.
  - If a read is marked as secondary (`0x100`), then the read is counted as secondary.
  - Else, if a read is marked as supplementary (`0x800`), then the read is counted as supplementary.
  - Else, the read is counted as primary.

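The designation rule above can be sketched as a small function. This is an illustration only, not the tool's implementation; the function name and constant names are hypothetical, while the flag values are the ones quoted in the list.

```python
SECONDARY = 0x100      # flag bit quoted above for secondary records
SUPPLEMENTARY = 0x800  # flag bit quoted above for supplementary records


def designation(flag):
    """Classify a record by its flag, checking secondary before supplementary."""
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"
```

Because the secondary check comes first, a record with both bits set (`0x900`) is counted as secondary under this ordering.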
Past this point, only records designated as primary are counted towards the following metrics.

- Primary mapped (`primary_mapped`). The number of records that are counted as primary and are marked as mapped (`!0x4`).
- Primary duplicate (`primary_duplicate`). The number of records that are counted as primary and are marked as duplicate (`0x400`).

Past this point, only records that are designated as primary and marked as segmented (`0x01`) are counted towards the following metrics.

- Paired (`paired`). The number of records that are designated as primary and marked as segmented (`0x01`).
- Read 1 (`read_1`). The number of records that are designated as primary, marked as segmented (`0x01`), and marked as being the first record within a segment (`0x40`).
- Read 2 (`read_2`). The number of records that are designated as primary, marked as segmented (`0x01`), and marked as being the last record within a segment (`0x80`).

Past this point, only records that are designated as primary, marked as segmented (`0x01`), and marked as mapped (`!0x04`) are counted towards the following metrics.

- Proper pair (`proper_pair`). The number of records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), and properly aligned (`0x2`).
- Singleton (`singleton`). The number of records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), and marked as mate is unmapped (`0x08`).

Past this point, only records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), and whose mate is marked as mapped (`!0x08`) are counted towards the following metrics.

- Mate mapped (`mate_mapped`). The number of records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), and marked as mate is mapped (`!0x08`).

Past this point, only records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), whose mate is marked as mapped (`!0x08`), and whose mate is mapped to a different sequence are counted towards the following metrics.

- Mate mapped with reference sequence mismatch (`mate_reference_sequence_id_mismatch`). The number of records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), marked as mate is mapped (`!0x08`), and whose mate is mapped to a different reference sequence id than the record being examined.
- Mate mapped with reference sequence mismatch (high-quality) (`mate_reference_sequence_id_mismatch_hq`). The number of records that are designated as primary, marked as segmented (`0x01`), marked as mapped (`!0x04`), marked as mate is mapped (`!0x08`), whose mate is mapped to a different reference sequence id than the record being examined, and whose own mapping quality is greater than 5.

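The successive "past this point" filters above can be read as a single pass with early exits. The following is a sketch under stated assumptions, not the tool's actual code: records are modeled as hypothetical `(flag, ref_id, mate_ref_id, mapq)` tuples, the function and constant names are invented for illustration, and the `designation` breakdown is omitted for brevity.

```python
from collections import Counter

PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
READ_1, READ_2 = 0x40, 0x80
SECONDARY, DUPLICATE, SUPPLEMENTARY = 0x100, 0x400, 0x800


def count_records(records):
    """Tally record metrics from (flag, ref_id, mate_ref_id, mapq) tuples."""
    counts = Counter()
    for flag, ref_id, mate_ref_id, mapq in records:
        counts["total"] += 1
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        if flag & DUPLICATE:
            counts["duplicate"] += 1
        # Past this point, only primary records are counted.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if not flag & UNMAPPED:
            counts["primary_mapped"] += 1
        if flag & DUPLICATE:
            counts["primary_duplicate"] += 1
        # Past this point, only segmented (paired) records are counted.
        if not flag & PAIRED:
            continue
        counts["paired"] += 1
        if flag & READ_1:
            counts["read_1"] += 1
        if flag & READ_2:
            counts["read_2"] += 1
        # Past this point, only mapped records are counted.
        if flag & UNMAPPED:
            continue
        if flag & PROPER_PAIR:
            counts["proper_pair"] += 1
        if flag & MATE_UNMAPPED:
            counts["singleton"] += 1
            continue
        # Past this point, the mate is mapped as well.
        counts["mate_mapped"] += 1
        if ref_id != mate_ref_id:
            counts["mate_reference_sequence_id_mismatch"] += 1
            if mapq > 5:  # "greater than 5", as stated in the text above
                counts["mate_reference_sequence_id_mismatch_hq"] += 1
    return counts
```

Each `continue` corresponds to one of the narrowing conditions, so a record contributes only to the metrics listed before the first filter it fails.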
Cigar metrics are generally pileups of Cigar operations for every record in the file.

- Read one cigar ops (`read_one_cigar_ops`). Pileup of the Cigar operations for all read ones.
- Read two cigar ops (`read_two_cigar_ops`). Pileup of the Cigar operations for all read twos.

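As a rough illustration of what such a pileup might look like, the sketch below tallies Cigar operations separately for read ones and read twos. It rests on several assumptions that the text does not confirm: records are modeled as hypothetical `(flag, cigar)` pairs, each operation is counted once per occurrence rather than weighted by its length, and no primary-only filter is applied. The function name is invented for the example.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
READ_1, READ_2 = 0x40, 0x80


def cigar_pileups(records):
    """Tally CIGAR operation codes for read ones and read twos."""
    pileups = {"read_one_cigar_ops": Counter(), "read_two_cigar_ops": Counter()}
    for flag, cigar in records:
        if flag & READ_1:
            key = "read_one_cigar_ops"
        elif flag & READ_2:
            key = "read_two_cigar_ops"
        else:
            continue  # neither read one nor read two: not tallied in this sketch
        for _length, op in CIGAR_OP.findall(cigar):
            pileups[key][op] += 1  # assumption: count occurrences, not bases
    return pileups
```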
Summary metrics are generally percentages that are of interest to users. The current set of summary metrics collected includes:

- Duplication percentage (`duplication_pct`). The percentage of records that are marked as duplicate (`0x400`) in the file.
- Unmapped percentage (`unmapped_pct`). The percentage of records that are marked as unmapped (`0x04`) in the file. The mapped percentage follows directly as 100 minus this value.
- Mate reference sequence mismatch percentage (`mate_reference_sequence_id_mismatch_pct`). The number of records counted as "Mate mapped with reference sequence mismatch" divided by the total number of records in the file, expressed as a percentage.
- Mate reference sequence mismatch percentage (high-quality) (`mate_reference_sequence_id_mismatch_hq_pct`). The number of records counted as "Mate mapped with reference sequence mismatch (high-quality)" divided by the total number of records in the file, expressed as a percentage.
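The summary values can be derived from the record counts alone. The sketch below assumes every percentage uses the total record count as its denominator, which the text states explicitly only for the two mismatch metrics, and it returns 0.0 for an empty file as an arbitrary choice. The function name `summarize` is hypothetical.

```python
def summarize(records):
    """Derive summary percentages from a dict of record counts."""
    total = records["total"]

    def pct(count):
        # Assumption: an empty file yields 0.0 rather than an error.
        return 100.0 * count / total if total else 0.0

    return {
        "duplication_pct": pct(records["duplicate"]),
        "unmapped_pct": pct(records["unmapped"]),
        "mate_reference_sequence_id_mismatch_pct": pct(
            records["mate_reference_sequence_id_mismatch"]
        ),
        "mate_reference_sequence_id_mismatch_hq_pct": pct(
            records["mate_reference_sequence_id_mismatch_hq"]
        ),
    }
```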