Modified base calls are qualified with a probability that is contained in the ML tag (see the
SAM tags specification). We calculate the confidence that the model
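has in the base modification prediction as the largest of the probabilities assigned to the candidate classes (canonical or modified) for that base.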
Filtering in modkit is performed by first determining a threshold confidence value, which is estimated
from base modification probabilities sampled from a subset of the reads. In pileup, a region to sample
the reads from can be specified with the --sample-region option. The sample-probs sub-command is
specifically tailored to investigate model confidence values at different percentiles.
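The sampling step can be illustrated with a minimal Python sketch. It is not modkit's implementation: it assumes that the confidence of a call is its largest class probability, and the function names, the 0.1 percentile, and the sample size are hypothetical choices made for illustration only.

```python
import random


def call_confidence(probs):
    """Confidence of one call, assumed here to be its largest class probability."""
    return max(probs)


def estimate_threshold(confidences, percentile=0.1, sample_size=1000, seed=0):
    """Estimate the confidence value at `percentile` from a random sample of calls.

    `percentile` is a fraction between 0 and 1. All parameter names and
    defaults are illustrative assumptions, not modkit options.
    """
    if not confidences:
        raise ValueError("no confidence values to sample from")
    rng = random.Random(seed)
    if len(confidences) > sample_size:
        # Sample without replacement so the estimate stays cheap on large inputs.
        sample = rng.sample(list(confidences), sample_size)
    else:
        sample = list(confidences)
    sample.sort()
    # Index of the requested percentile in the sorted sample (rounded down).
    index = int(percentile * (len(sample) - 1))
    return sample[index]


# Example: per-call probabilities for [canonical C, 5mC, 5hmC].
calls = [[0.10, 0.85, 0.05], [0.55, 0.30, 0.15], [0.02, 0.03, 0.95]]
confidences = [call_confidence(p) for p in calls]
threshold = estimate_threshold(confidences)
```

Because the threshold comes from a sample rather than from every call, it is an estimate; a larger sample trades run time for a more stable value.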
Once a threshold value has been determined, base modification calls with a confidence value less than this
number will not be counted. Determination of the threshold value can be performed on the fly (by sampling,
described above), or the threshold value can be specified on the command line with the --filter-threshold
flag.
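As a rough illustration of the counting rule, the sketch below drops calls whose confidence is below the threshold and tallies the rest. The function name and the (label, confidence) input format are hypothetical and do not reflect modkit's internal data structures.

```python
def count_passing_calls(calls, threshold):
    """Tally calls per label, skipping any call with confidence below `threshold`.

    `calls` is an iterable of (label, confidence) pairs; this layout is an
    assumption made for the example.
    """
    counts = {}
    n_filtered = 0
    for label, confidence in calls:
        if confidence < threshold:
            # Below the threshold: the call is not counted.
            n_filtered += 1
            continue
        counts[label] = counts.get(label, 0) + 1
    return counts, n_filtered


# Example with a threshold supplied directly instead of estimated by sampling.
example_calls = [("m", 0.95), ("C", 0.60), ("h", 0.80), ("m", 0.79)]
counts, n_filtered = count_passing_calls(example_calls, threshold=0.8)
```

A call whose confidence equals the threshold is kept, since only values strictly less than the threshold are excluded.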