SeqFu is a general-purpose program for manipulating and parsing FASTA/FASTQ files, including gzipped input. It includes functions to interleave and de-interleave FASTQ files, to rename sequences, and to count and print statistics on sequence lengths.
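To make the interleave and de-interleave operations concrete, here is a minimal Python sketch of the idea. It is not SeqFu's implementation: the function names `interleave` and `deinterleave` and the `(name, sequence, quality)` tuple layout are assumptions chosen for illustration, and the records are assumed to be already parsed and in matching order.

```python
from typing import List, Tuple

# Hypothetical record layout for this sketch: (name, sequence, quality).
Record = Tuple[str, str, str]


def interleave(forward: List[Record], reverse: List[Record]) -> List[Record]:
    """Merge two paired read lists into one list: R1, R2, R1, R2, ..."""
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse files must hold the same number of reads")
    mixed: List[Record] = []
    for fwd, rev in zip(forward, reverse):
        mixed.extend([fwd, rev])
    return mixed


def deinterleave(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Split an interleaved list back into (forward, reverse) lists."""
    if len(records) % 2 != 0:
        raise ValueError("an interleaved file must hold an even number of reads")
    # Even positions are forward reads, odd positions are reverse reads.
    return records[0::2], records[1::2]
```

De-interleaving the output of `interleave` returns the two original lists, which is why the two operations are usually offered together.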
SeqFu can be installed via Miniconda:
conda install -y -c bioconda seqfu
Telatin A, Fariselli P, Birolo G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. doi.org/10.3390/bioengineering8050059
The full documentation is available at: telatin.github.io/seqfu2
SeqFu - Sequence Fastx Utilities
version: 1.0.0
• count [cnt] : count FASTA/FASTQ reads, pair-end aware
• deinterleave [dei] : deinterleave FASTQ
• derep [der] : feature-rich dereplication of FASTA/FASTQ files
• interleave [ilv] : interleave FASTQ pair ends
• lanes [mrl] : merge Illumina lanes
• sort [srt] : sort sequences by size (uniques)
• stats [st] : statistics on sequence lengths
• cat : concatenate FASTA/FASTQ files
• grep : select sequences with patterns
• head : print first sequences
• rc : reverse complement strings or files
• tail : view last sequences
• view : view sequences with colored quality and oligo matches
Add --help after each command to print usage
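As a rough illustration of what two of the listed operations compute (reverse complement, and statistics on sequence lengths), the following Python sketch may help. It is an assumption-laden illustration rather than SeqFu's code: `reverse_complement` and `length_stats` are hypothetical names, only the bases A, C, G, T are complemented, and the reported statistics are a simple subset chosen for this example.

```python
from typing import Dict, Iterable

# Complement table for unambiguous DNA bases; other characters (e.g. N) pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_stats(sequences: Iterable[str]) -> Dict[str, float]:
    """Return basic length statistics for a collection of sequences."""
    lengths = [len(s) for s in sequences]
    if not lengths:
        raise ValueError("no sequences given")
    total = sum(lengths)
    return {
        "count": len(lengths),
        "total": total,
        "min": min(lengths),
        "max": max(lengths),
        "mean": total / len(lengths),
    }
```

For the options and output columns that the actual `rc` and `stats` commands provide, consult their `--help` text or the full documentation.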