blobtools filter
Richard Challis edited this page Dec 22, 2023 · 1 revision
Datasets can be filtered on the values of any variable or category field, or by a list of identifiers. Filters can be applied to a complete dataset, so that a reduced dataset can be used without repeating analyses, or to assembly FASTA and read FASTQ files to allow for reassembly and reanalysis. Filter parameters are shared between blobtools and the interactive Viewer, so interactive sessions can be reproduced on the command line.
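As a rough sketch of the two uses described above, the commands below apply one filter first to a BlobDir and then to an assembly FASTA. The paths are hypothetical, and the parameter string `length--Min=5000` is an assumption based on the Viewer's URL parameter style; check the query string of your own Viewer session for the exact names. The `run` helper is only a dry-run wrapper added for this example, not part of blobtools.

```shell
# Dry-run helper (hypothetical, for this example only):
# print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical paths; replace with your own dataset and files.
BLOBDIR="path/to/BlobDir"
ASSEMBLY="path/to/assembly.fasta"

# Assumed filter: a minimum length of 5000, written in the Viewer-style
# param=value form (the name "length--Min" is an assumption).
FILTER="length--Min=5000"

# 1. Write a new, filtered BlobDir without repeating analyses.
run blobtools filter --param "$FILTER" --output "${BLOBDIR}_filtered" "$BLOBDIR"

# 2. Apply the same filter to the assembly FASTA, e.g. before reassembly.
run blobtools filter --param "$FILTER" --fasta "$ASSEMBLY" "$BLOBDIR"
```

With the default settings the commands are only printed; setting `DRY_RUN=0` runs them, which requires blobtools to be installed and the paths to exist.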
Filter a BlobDir.
Usage:
blobtools filter [--param STRING...] [--query-string STRING] [--json JSON]
[--list TXT] [--invert] [--output DIRECTORY]
[--fasta FASTA] [--fastq FASTQ...] [--suffix STRING]
[--cov BAM] [--summary FILENAME] [--summary-rank RANK]
[--table FILENAME] [--table-fields STRING]
[--taxdump DIRECTORY] [--taxrule STRING] [--text TXT] [--text-header]
[--text-delimiter STRING] [--text-id-column INT] DIRECTORY
Arguments:
DIRECTORY Existing BlobDir dataset directory.
Options:
--param STRING String of type param=value.
--query-string STRING List of param=value pairs from url query string.
--json JSON JSON format list file as generated by BlobToolKit Viewer.
--list TXT Space or newline separated list of identifiers.
--invert Invert filter (exclude matching records).
--output DIRECTORY Path to directory to generate a new, filtered BlobDir.
--fasta FASTA FASTA format assembly file to be filtered.
--fastq FASTQ FASTQ format read file to be filtered (requires --cov).
--cov BAM BAM/SAM/CRAM read alignment file.
--text TXT Generic text file to be filtered.
--text-delimiter STRING Text file delimiter. [Default: whitespace]
--text-id-column INT Index of column containing identifiers (1-based). [Default: 1]
--text-header Flag to indicate first row of text file contains field names. [Default: False]
--suffix STRING String to be added to filtered filename. [Default: filtered]
--summary FILENAME Generate a JSON-format summary of the filtered dataset.
--summary-rank RANK Taxonomic level for summary. [Default: phylum]
--table FILENAME Tabular output of filtered dataset.
--table-fields STRING Comma separated list of field IDs to include in the table output. Use 'plot' to include all plot axes. [Default: plot]
--taxdump DIRECTORY Location of NCBI new_taxdump directory.
--taxrule STRING Taxrule used when processing hits.
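The `--list` option expects a space or newline separated list of identifiers. The sketch below shows one way such a list might be prepared from FASTA headers with standard tools, and then prints example `blobtools filter` commands that use it with `--fasta` and `--invert`. The file names, contig IDs, and BlobDir path are all hypothetical, and the blobtools commands are printed rather than run.

```shell
# Illustrative only: file names and contig IDs are hypothetical.
workdir=$(mktemp -d)
cat > "$workdir/assembly.fasta" <<'EOF'
>contig_1 len=8
ACGTACGT
>contig_2 len=4
ACGT
>contig_3 len=6
ACGTAC
EOF

# Collect identifiers: the first word of each header line, without '>'.
awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$workdir/assembly.fasta" \
    > "$workdir/all_ids.txt"

# Choose the identifiers to keep (here: everything except contig_2).
grep -v -x 'contig_2' "$workdir/all_ids.txt" > "$workdir/keep_ids.txt"

# Print example commands; run them yourself against a real BlobDir.
echo "blobtools filter --list $workdir/keep_ids.txt --fasta $workdir/assembly.fasta path/to/BlobDir"
# With --invert, records matching the list are excluded instead of kept.
echo "blobtools filter --list $workdir/keep_ids.txt --invert --fasta $workdir/assembly.fasta path/to/BlobDir"
```

Keeping the list as one identifier per line makes it easy to inspect and to reuse with `--invert` when the complementary set of records is needed.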