FASTQ validation
The FASTQ file validator (fastq_info) expects FASTQ files to have four lines per sequence:
Sequence Identifier Line
Raw Sequence Line
Plus Line
Quality String Line
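The four-line layout above can be illustrated with a minimal Python sketch that groups a file into records. This is not how fastq_info is implemented; the names `FastqRecord` and `read_records` are hypothetical and exist only for this example, and the sketch assumes an uncompressed text stream with no blank lines between records.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FASTQ entry, holding its four lines in file order (hypothetical type)."""
    identifier: str  # Sequence Identifier Line
    sequence: str    # Raw Sequence Line
    plus: str        # Plus Line
    quality: str     # Quality String Line


def read_records(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines.

    Raises ValueError if the stream ends in the middle of a record.
    """
    while True:
        lines = [handle.readline() for _ in range(4)]
        if lines[0] == "":
            return  # clean end of file: no further record starts here
        if any(line == "" for line in lines):
            raise ValueError("truncated record: fewer than four lines")
        # Strip only the line terminators, keep every other character.
        yield FastqRecord(*(line.rstrip("\r\n") for line in lines))


example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nNN\n+r2\n##\n")
records = list(read_records(example))
```

Reading in fixed groups of four means a multi-line sequence or a missing line shifts every following record, which is why validators that assume this layout reject such files rather than trying to resynchronize.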
The following checks are performed on each sequence:
- Sequence Identifier Line
  - Line starts with an '@'
  - Should have a unique identifier
  - Line is at least 2 characters long ('@' and at least 1 character for the sequence identifier)
- Raw Sequence Line
  - Minimum length: 1
  - Maximum length: 1M
  - Allowable characters: A C T G U N a c t g u n 0 1 2 3
  - U and T are mutually exclusive
- Plus Line
  - Must exist
  - An optional sequence identifier may be specified after the '+' (it must be exactly the same as the identifier in the Sequence Identifier Line)
- Quality String Line
  - Sequence and quality should have the same length
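The per-sequence checks listed above can be sketched in Python as follows. This is an illustration, not the fastq_info source. The function name `check_record`, the error messages, and the constants are hypothetical. It also makes several assumptions the text does not settle: "1M" is read as 1,000,000 characters, the identifier is everything after the '@', and the U/T exclusivity rule is applied within a single record.

```python
from typing import List, Set

ALLOWED_CHARS = set("ACTGUNactgun0123")
MAX_SEQUENCE_LENGTH = 1_000_000  # assumption: "1M" means 1,000,000 characters


def check_record(identifier: str, sequence: str, plus: str, quality: str,
                 seen_ids: Set[str]) -> List[str]:
    """Return a list of problems found in one four-line record (empty if none).

    `seen_ids` collects identifiers across calls so duplicates can be detected.
    """
    errors: List[str] = []

    # Sequence Identifier Line: '@' prefix, at least one more character, unique.
    if not identifier.startswith("@"):
        errors.append("identifier line does not start with '@'")
    elif len(identifier) < 2:
        errors.append("identifier line is too short")
    else:
        read_id = identifier[1:]  # assumption: the whole rest of the line is the identifier
        if read_id in seen_ids:
            errors.append("duplicate identifier")
        seen_ids.add(read_id)

    # Raw Sequence Line: length bounds, alphabet, and U/T exclusivity.
    if not 1 <= len(sequence) <= MAX_SEQUENCE_LENGTH:
        errors.append("sequence length out of range")
    if set(sequence) - ALLOWED_CHARS:
        errors.append("sequence contains invalid characters")
    if {"U", "T"} <= set(sequence.upper()):
        errors.append("sequence mixes U and T")

    # Plus Line: must exist; any identifier after '+' must match the first line.
    if not plus.startswith("+"):
        errors.append("plus line missing")
    elif len(plus) > 1 and plus[1:] != identifier[1:]:
        errors.append("plus line identifier does not match")

    # Quality String Line: same length as the sequence.
    if len(quality) != len(sequence):
        errors.append("quality and sequence lengths differ")

    return errors


seen: Set[str] = set()
first = check_record("@r1", "ACGT", "+", "IIII", seen)
second = check_record("@r1", "ACGT", "+", "IIII", seen)  # same identifier again
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in a record at once; a stricter validator could stop at the first error.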
Paired reads can be provided as two separate files or in a single file (interleaved). Further information about FASTQ format is available at