Excessive RAM usage #89

Open
royfrancis opened this issue Feb 15, 2023 · 6 comments

royfrancis commented Feb 15, 2023

I have a 25 GB BAM file with about 400 million paired-end reads produced by the zUMIs pipeline. These are single-cell SMART-Seq3 RNA-Seq reads with UMIs. I am running QoRTs QC on this file and it runs out of memory. I first provided 128 GB of RAM, then raised it to 256 GB, and I still get the same error. Is it reasonable that a BAM file of this size might need more than 256 GB of RAM?

This is my script.

java -Xmx200G -jar /sw/bioinfo/QoRTs/1.3.6/rackham/lib/QoRTs.jar QC \
--genomeFA genome.fa \
--flatgff genes-flat.gff \
--RNA \
--noGzipOutput \
--verbose \
--maxReadLength 125 \
sample.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam \
"final_annot.gtf" \
"sample-qorts"

In the output folder I get these two files: QC.QORTS_RUNNING and QC.yX9gr2Yu8Jsk.log

I randomly downsampled this BAM to a 15GB BAM to test and I still get the same error. I am starting to suspect it's not just the number of reads.

Complete run output
Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Thu Feb 09 19:57:13 CET 2023)
INPUT_COMMAND(QC)
  INPUT_ARG(infile)=sample.bam
  INPUT_ARG(gtffile)=/crex/proj/project/nobackup/nbis/data/processed/zumis/03dpf/03dpf.final_annot.gtf
  INPUT_ARG(outdir)=sample-qorts
  INPUT_ARG(genomeFA)=Some(List(/crex/proj/project/nobackup/nbis/data/reference/grcz10-custom/genome.fa))
  INPUT_ARG(flatgfffile)=Some(/crex/proj/project/nobackup/nbis/data/processed/zumis/qorts/genes-flat.gff)
  INPUT_ARG(isRNASeq)=true
  INPUT_ARG(noGzipOutput)=true
  INPUT_ARG(verbose)=true
  INPUT_ARG(maxReadLength)=Some(125)
Created Log File: sample-qorts/QC.ZfEVCwtLEYqQ.log
Starting QC
[Time: 2023-02-09 19:57:13] [Mem usage: [75MB / 2058MB]] [Elapsed Time: 00:00:00.0000]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
Parameter --genomeFA found. Adding reference mismatch testing.
NOTE: Function "overlapMatch" requires function "mismatchEngine". Adding "mismatchEngine" to the active function list...
Running functions: CigarOpDistribution, GCDistribution, GeneCalcs, InsertSize, 
        JunctionCalcs, NVC, QualityScoreDistribution, StrandCheck, 
        chromCounts, cigarLocusCounts, mismatchEngine, overlapMatch, 
        readLengthDistro, referenceMatch, writeBiotypeCounts, 
        writeClippedNVC, writeDESeq, writeDEXSeq, writeGeneBody, 
        writeGeneCounts, writeGenewiseGeneBody, 
        writeJunctionSeqCounts, writeKnownSplices, 
        writeNovelSplices, writeSpliceExon
Checking first 10000 reads. Checking SAM file for formatting errors...
   Stats on the first 10000 reads:
        Num Reads Primary Map:    10000
        Num Reads Paired-ended:   10000
        Num Reads mapped pair:    9989
        Num Pair names found:     5389
        Num Pairs matched:        4600
        Read Seq length:          63 to 118
        Unclipped Read length:    63 to 118
        Final maxReadLength:      125
        maxPhredScore:            37
        minPhredScore:            2
NOTE: Read length is not consistent.
   In the first 10000 reads, read length varies from 63 to 118 (param maxReadLength=125)
Note that using data that is hard-clipped prior to alignment is NOT recommended, because this makes it difficult (or impossible) to determine the sequencer read-cycle of each nucleotide base. This may obfuscate cycle-specific artifacts, trends, or errors, the detection of which is one of the primary purposes of QoRTs!In addition, hard clipping (whether before or after alignment) removes quality score data, and thus quality score metrics may be misleadingly optimistic. A MUCH preferable method of removing undesired sequence is to replace such sequence with N's, which preserves the quality score and the sequencer cycle information.
   Note: Data appears to be paired-ended.
   Sorting Note: Reads are not sorted by name (This is OK).
   Sorting Note: Reads are sorted by position (This is OK).
Done checking first 10000 reads. No major problems detected!
Starting getSRPairIterResorted...
SAMRecord Reader Generated. Read length: 125.
[Time: 2023-02-09 19:57:18] [Mem usage: [720MB / 2595MB]] [Elapsed Time: 00:00:04.0783]
> Init GeneCalcs Utility
> Init InsertSize Utility
> Init NVC utility
> Init CigarOpDistribution Utility
> Init QualityScoreDistribution Utility
> Init GC counts Utility
> Init JunctionCalcs utility
length of knownSpliceMap after instantiation: 256778
length of knownCountMap after instantiation: 256778
> Init StrandCheck Utility
> Init chromCount Utility
> Init qcCigarLocusCounts Utility
> Init OverlapMatch Utility
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2023-02-09 19:58:42] [Mem usage: [13GB / 15GB]] [Elapsed Time: 00:01:28.0789]
helper_calculateGeneAssignmentMap_strict. Found: 31956 genes in the supplied annotation.
helper_calculateGeneAssignmentMap_strict. Found: 4912 genes with ambiguous segments.
helper_calculateGeneAssignmentMap_strict. Found: 27044 genes after first-pass filtering
making makeGeneIntervalMap for geneBody calculations. Found: 27044 acceptable genes for gene-body analysis.
NOTE: Unsorted Read-PAIR-Buffer Size > 100000 [Mem usage:[8GB / 34GB]]
  Currently searching for read: A01901:60:H37HJDRX2:2:2125:2311:15984 for 83585 iterations.  Searching for read: A01901:60:H37HJDRX2:2:2125:2311:15984 10:1211823-1211904 99
  Current unmatched-pair-buffer status: 33780
    (This is generally not a problem, but if this increases further then OutOfMemoryExceptions
    may occur.
    If memory errors do occur, either increase memory allocation or sort the bam-file by name
    and rerun with the '--nameSorted' option.
    This might also indicate that your dataset contains an unusually large number of
    chimeric read-pairs. Or it could occur simply due to the presence of genomic
    loci with extremly high coverage or complex splicing. It may also indicate a SAM/BAM file that 
    does not adhere to the standard SAM specification.)
..........[1000000 Read-Pairs processed] [Time: 2023-02-09 20:02:42] 
   [GenomeSeqContainer Status: buf:(10:13612000-13887000) n=275, MaxSoFar=895]
..........[2000000 Read-Pairs processed] [Time: 2023-02-09 20:05:44] 
   [GenomeSeqContainer Status: buf:(10:28348000-28581000) n=233, MaxSoFar=895]
NOTE: Unsorted Read-PAIR-Buffer Size > 200000 [Mem usage:[46GB / 57GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:1126:7039:25175 for 196883 iterations.  Searching for read: A01901:60:H37HJDRX2:1:1126:7039:25175 10:29443811-29443858 99
  Current unmatched-pair-buffer status: 9621
..........[3000000 Read-Pairs processed] [Time: 2023-02-09 20:08:54] 
   [GenomeSeqContainer Status: buf:(10:39134000-39393000) n=259, MaxSoFar=926]
........Switching to Chromosome: 11 [2023-02-09 20:11:41] ... 
   Skipping chrom "10" in genome fasta...
    found chrom 11 [2023-02-09 20:11:41]
..[4000000 Read-Pairs processed] [Time: 2023-02-09 20:12:05] 
   [GenomeSeqContainer Status: buf:(11:231000-964000) n=733, MaxSoFar=926]
..........[5000000 Read-Pairs processed] [Time: 2023-02-09 20:15:18] 
   [GenomeSeqContainer Status: buf:(11:10564000-11164000) n=600, MaxSoFar=926]
..........[6000000 Read-Pairs processed] [Time: 2023-02-09 20:18:14] 
   [GenomeSeqContainer Status: buf:(11:24745000-25055000) n=310, MaxSoFar=926]
..........[7000000 Read-Pairs processed] [Time: 2023-02-09 20:21:16] 
   [GenomeSeqContainer Status: buf:(11:38804000-39321000) n=517, MaxSoFar=926]
...
NOTE: Unmatched Read Buffer Size > 100000 [Mem usage:[92GB / 94GB]]
    (This is generally not a problem, but if this increases further then OutOfMemoryExceptions
    may occur.
    If memory errors do occur, either increase memory allocation or sort the bam-file by name
    and rerun with the '--nameSorted' option.
    This might also indicate that your dataset contains an unusually large number of
    chimeric read-pairs. Or it could occur simply due to the presence of genomic
    loci with extremly high coverage. It may also indicate a SAM/BAM file that 
    does not adhere to the standard SAM specification.)
..
NOTE: Unmatched Read Buffer Size > 200000 [Mem usage:[42GB / 94GB]]
NOTE: Unsorted Read-PAIR-Buffer Size > 400000 [Mem usage:[44GB / 94GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:1117:7473:19633 for 345767 iterations.  Searching for read: A01901:60:H37HJDRX2:1:1117:7473:19633 11:44043229-44043346 163
  Current unmatched-pair-buffer status: 47797
.....[8000000 Read-Pairs processed] [Time: 2023-02-09 20:24:33] 
   [GenomeSeqContainer Status: buf:(11:44855000-45093000) n=238, MaxSoFar=926]
Switching to Chromosome: 12 [2023-02-09 20:24:47] ... 
   Skipping chrom "11" in genome fasta...
    found chrom 12 [2023-02-09 20:24:47]
..........[9000000 Read-Pairs processed] [Time: 2023-02-09 20:27:31] 
   [GenomeSeqContainer Status: buf:(12:11409000-11757000) n=348, MaxSoFar=1015]
..........[10000000 Read-Pairs processed] [Time: 2023-02-09 20:30:34] 
   [GenomeSeqContainer Status: buf:(12:26538000-26792000) n=254, MaxSoFar=1015]
..........[11000000 Read-Pairs processed] [Time: 2023-02-09 20:33:29] 
   [GenomeSeqContainer Status: buf:(12:39494000-39724000) n=230, MaxSoFar=1015]
.....Switching to Chromosome: 13 [2023-02-09 20:35:04] ... 
   Skipping chrom "12" in genome fasta...
    found chrom 13 [2023-02-09 20:35:04]
.....[12000000 Read-Pairs processed] [Time: 2023-02-09 20:36:21] 
   [GenomeSeqContainer Status: buf:(13:4653000-5074000) n=421, MaxSoFar=1015]
..........[13000000 Read-Pairs processed] [Time: 2023-02-09 20:39:16] 
   [GenomeSeqContainer Status: buf:(13:22597000-23054000) n=457, MaxSoFar=1015]
..........[14000000 Read-Pairs processed] [Time: 2023-02-09 20:42:13] 
   [GenomeSeqContainer Status: buf:(13:36358000-36611000) n=253, MaxSoFar=1015]
........Switching to Chromosome: 14 [2023-02-09 20:44:38] ... 
   Skipping chrom "13" in genome fasta...
    found chrom 14 [2023-02-09 20:44:38]
..[15000000 Read-Pairs processed] [Time: 2023-02-09 20:45:09] 
..........[16000000 Read-Pairs processed] [Time: 2023-02-09 20:48:20] 
   [GenomeSeqContainer Status: buf:(14:6824000-7299000) n=475, MaxSoFar=1041]
..........[17000000 Read-Pairs processed] [Time: 2023-02-09 20:51:15] 
   [GenomeSeqContainer Status: buf:(14:25170000-25389000) n=219, MaxSoFar=1041]
..........[18000000 Read-Pairs processed] [Time: 2023-02-09 20:54:28] 
   [GenomeSeqContainer Status: buf:(14:32531000-33088000) n=557, MaxSoFar=1041]
..........[19000000 Read-Pairs processed] [Time: 2023-02-09 20:57:29] 
   [GenomeSeqContainer Status: buf:(14:37242000-37498000) n=256, MaxSoFar=1041]
....
NOTE: Unmatched Read Buffer Size > 400000 [Mem usage:[81GB / 102GB]]
NOTE: Unsorted Read-PAIR-Buffer Size > 800000 [Mem usage:[85GB / 102GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 for 691700 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 14:46035424-46647919 163
  Current unmatched-pair-buffer status: 578759
NOTE: Unsorted Read-PAIR-Buffer Size > 1600000 [Mem usage:[91GB / 102GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 for 80666 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 14:46090552-46651262 163
  Current unmatched-pair-buffer status: 430200
NOTE: Unsorted Read-PAIR-Buffer Size > 3200000 [Mem usage:[22GB / 117GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 for 1680666 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 14:46090552-46651262 163
  Current unmatched-pair-buffer status: 599192
......[20000000 Read-Pairs processed] [Time: 2023-02-09 21:01:33] 
   [GenomeSeqContainer Status: buf:(14:46637000-47216000) n=579, MaxSoFar=1143]
..........[21000000 Read-Pairs processed] [Time: 2023-02-09 21:04:33] 
   [GenomeSeqContainer Status: buf:(14:46638000-47216000) n=578, MaxSoFar=1143]
..........[22000000 Read-Pairs processed] [Time: 2023-02-09 21:07:37] 
   [GenomeSeqContainer Status: buf:(14:46640000-47216000) n=576, MaxSoFar=1143]
..........[23000000 Read-Pairs processed] [Time: 2023-02-09 21:10:32] 
   [GenomeSeqContainer Status: buf:(14:46641000-47731000) n=1090, MaxSoFar=1143]
.........Switching to Chromosome: 15 [2023-02-09 21:13:41] ... 
   Skipping chrom "14" in genome fasta...
    found chrom 15 [2023-02-09 21:13:41]
.[24000000 Read-Pairs processed] [Time: 2023-02-09 21:13:42] 
   [GenomeSeqContainer Status: buf:(15:6000-558000) n=552, MaxSoFar=1143]
..........[25000000 Read-Pairs processed] [Time: 2023-02-09 21:16:37] 
   [GenomeSeqContainer Status: buf:(15:20466000-20553000) n=87, MaxSoFar=1143]
..........[26000000 Read-Pairs processed] [Time: 2023-02-09 21:19:36] 
   [GenomeSeqContainer Status: buf:(15:32485000-32761000) n=276, MaxSoFar=1143]
..........[27000000 Read-Pairs processed] [Time: 2023-02-09 21:22:39] 
   [GenomeSeqContainer Status: buf:(15:46359000-47022000) n=663, MaxSoFar=1280]
.Switching to Chromosome: 16 [2023-02-09 21:23:10] ... 
   Skipping chrom "15" in genome fasta...
    found chrom 16 [2023-02-09 21:23:10]
.........[28000000 Read-Pairs processed] [Time: 2023-02-09 21:25:52] 
   [GenomeSeqContainer Status: buf:(16:6859000-7471000) n=612, MaxSoFar=1280]
..........[29000000 Read-Pairs processed] [Time: 2023-02-09 21:28:53] 
   [GenomeSeqContainer Status: buf:(16:16248000-16569000) n=321, MaxSoFar=1280]
..........[30000000 Read-Pairs processed] [Time: 2023-02-09 21:32:43] 
   [GenomeSeqContainer Status: buf:(16:24269000-24630000) n=361, MaxSoFar=1280]
..........[31000000 Read-Pairs processed] [Time: 2023-02-09 21:36:00] 
   [GenomeSeqContainer Status: buf:(16:31957000-32208000) n=251, MaxSoFar=1280]
..........[32000000 Read-Pairs processed] [Time: 2023-02-09 21:39:05] 
   [GenomeSeqContainer Status: buf:(16:43314000-43783000) n=469, MaxSoFar=1280]
..........[33000000 Read-Pairs processed] [Time: 2023-02-09 21:42:26] 
   [GenomeSeqContainer Status: buf:(16:52018000-52525000) n=507, MaxSoFar=1280]
..Switching to Chromosome: 17 [2023-02-09 21:43:21] ... 
   Skipping chrom "16" in genome fasta...
    found chrom 17 [2023-02-09 21:43:21]
........[34000000 Read-Pairs processed] [Time: 2023-02-09 21:45:50] 
   [GenomeSeqContainer Status: buf:(17:850000-1328000) n=478, MaxSoFar=1280]
..........[35000000 Read-Pairs processed] [Time: 2023-02-09 21:49:14] 
   [GenomeSeqContainer Status: buf:(17:18859000-19099000) n=240, MaxSoFar=1280]
..........[36000000 Read-Pairs processed] [Time: 2023-02-09 21:53:36] 
   [GenomeSeqContainer Status: buf:(17:32562000-32832000) n=270, MaxSoFar=1280]
.........Switching to Chromosome: 18 [2023-02-09 21:57:41] ... 
   Skipping chrom "17" in genome fasta...
    found chrom 18 [2023-02-09 21:57:41]
.[37000000 Read-Pairs processed] [Time: 2023-02-09 21:57:57] 
   [GenomeSeqContainer Status: buf:(18:1008000-1307000) n=299, MaxSoFar=1280]
..........[38000000 Read-Pairs processed] [Time: 2023-02-09 22:02:43] 
   [GenomeSeqContainer Status: buf:(18:10841000-11373000) n=532, MaxSoFar=1280]
..........[39000000 Read-Pairs processed] [Time: 2023-02-09 22:05:55] 
   [GenomeSeqContainer Status: buf:(18:24998000-25398000) n=400, MaxSoFar=1280]
..........[40000000 Read-Pairs processed] [Time: 2023-02-09 22:09:04] 
   [GenomeSeqContainer Status: buf:(18:44829000-45135000) n=306, MaxSoFar=1280]
.....Switching to Chromosome: 19 [2023-02-09 22:10:44] ... 
   Skipping chrom "18" in genome fasta...
    found chrom 19 [2023-02-09 22:10:44]
.....[41000000 Read-Pairs processed] [Time: 2023-02-09 22:13:07] 
   [GenomeSeqContainer Status: buf:(19:1913000-2661000) n=748, MaxSoFar=1280]
..........[42000000 Read-Pairs processed] [Time: 2023-02-09 22:17:17] 
   [GenomeSeqContainer Status: buf:(19:11309000-11566000) n=257, MaxSoFar=1280]
..........[43000000 Read-Pairs processed] [Time: 2023-02-09 22:21:43] 
   [GenomeSeqContainer Status: buf:(19:21194000-21517000) n=323, MaxSoFar=1280]
..........[44000000 Read-Pairs processed] [Time: 2023-02-09 22:25:46] 
   [GenomeSeqContainer Status: buf:(19:22228000-22643000) n=415, MaxSoFar=1280]
..........[45000000 Read-Pairs processed] [Time: 2023-02-09 22:29:08] 
   [GenomeSeqContainer Status: buf:(19:28687000-28946000) n=259, MaxSoFar=1280]
..........[46000000 Read-Pairs processed] [Time: 2023-02-09 22:35:04] 
   [GenomeSeqContainer Status: buf:(19:35416000-35614000) n=198, MaxSoFar=1280]
..........[47000000 Read-Pairs processed] [Time: 2023-02-09 22:40:30] 
   [GenomeSeqContainer Status: buf:(19:43514000-43858000) n=344, MaxSoFar=1280]
..........[48000000 Read-Pairs processed] [Time: 2023-02-09 22:44:22] 
   [GenomeSeqContainer Status: buf:(19:48443000-48779000) n=336, MaxSoFar=1280]
....Switching to Chromosome: 1 [2023-02-09 22:46:33] ... 
   Skipping chrom "19" in genome fasta...
    found chrom 1 [2023-02-09 22:46:33]
......[49000000 Read-Pairs processed] [Time: 2023-02-09 22:48:47] 
   [GenomeSeqContainer Status: buf:(1:1578000-2061000) n=483, MaxSoFar=1280]
..........[50000000 Read-Pairs processed] [Time: 2023-02-09 22:52:21] 
   [GenomeSeqContainer Status: buf:(1:11580000-11845000) n=265, MaxSoFar=1280]
..........[51000000 Read-Pairs processed] [Time: 2023-02-09 22:55:56] 
   [GenomeSeqContainer Status: buf:(1:27913000-28146000) n=233, MaxSoFar=1280]
..........[52000000 Read-Pairs processed] [Time: 2023-02-09 23:00:43] 
   [GenomeSeqContainer Status: buf:(1:41077000-41321000) n=244, MaxSoFar=1280]
..........[53000000 Read-Pairs processed] [Time: 2023-02-09 23:08:05] 
   [GenomeSeqContainer Status: buf:(1:51077000-51496000) n=419, MaxSoFar=1280]
.....Switching to Chromosome: 20 [2023-02-09 23:10:02] ... 
   Skipping chrom "1" in genome fasta...
    found chrom 20 [2023-02-09 23:10:02]
.....[54000000 Read-Pairs processed] [Time: 2023-02-09 23:11:31] 
   [GenomeSeqContainer Status: buf:(20:7071000-7402000) n=331, MaxSoFar=1280]
..........[55000000 Read-Pairs processed] [Time: 2023-02-09 23:14:51] 
   [GenomeSeqContainer Status: buf:(20:20991000-21880000) n=889, MaxSoFar=1280]
..........[56000000 Read-Pairs processed] [Time: 2023-02-09 23:18:56] 
   [GenomeSeqContainer Status: buf:(20:33938000-34125000) n=187, MaxSoFar=1280]
..........[57000000 Read-Pairs processed] [Time: 2023-02-09 23:22:42] 
   [GenomeSeqContainer Status: buf:(20:46673000-46963000) n=290, MaxSoFar=1280]
..........[58000000 Read-Pairs processed] [Time: 2023-02-09 23:26:09] 
.Switching to Chromosome: 21 [2023-02-09 23:26:11] ... 
   Skipping chrom "20" in genome fasta...
    found chrom 21 [2023-02-09 23:26:11]
.........[59000000 Read-Pairs processed] [Time: 2023-02-09 23:29:13] 
   [GenomeSeqContainer Status: buf:(21:6140000-6676000) n=536, MaxSoFar=1465]
..........[60000000 Read-Pairs processed] [Time: 2023-02-09 23:41:17] 
   [GenomeSeqContainer Status: buf:(21:22056000-22484000) n=428, MaxSoFar=1465]
..........[61000000 Read-Pairs processed] [Time: 2023-02-10 00:13:32] 
   [GenomeSeqContainer Status: buf:(21:32723000-32980000) n=257, MaxSoFar=1465]
..........[62000000 Read-Pairs processed] [Time: 2023-02-10 00:20:03] 
   [GenomeSeqContainer Status: buf:(21:44554000-45107000) n=553, MaxSoFar=1465]
..Switching to Chromosome: 22 [2023-02-10 00:21:07] ... 
   Skipping chrom "21" in genome fasta...
    found chrom 22 [2023-02-10 00:21:07]
........[63000000 Read-Pairs processed] [Time: 2023-02-10 00:25:31] 
   [GenomeSeqContainer Status: buf:(22:3726000-4121000) n=395, MaxSoFar=1465]
..........[64000000 Read-Pairs processed] [Time: 2023-02-10 00:28:44] 
   [GenomeSeqContainer Status: buf:(22:18508000-18911000) n=403, MaxSoFar=1465]
..........[65000000 Read-Pairs processed] [Time: 2023-02-10 00:35:06] 
   [GenomeSeqContainer Status: buf:(22:31536000-32126000) n=590, MaxSoFar=1465]
....Switching to Chromosome: 23 [2023-02-10 00:38:06] ... 
   Skipping chrom "22" in genome fasta...
    found chrom 23 [2023-02-10 00:38:06]
......[66000000 Read-Pairs processed] [Time: 2023-02-10 00:40:24] 
   [GenomeSeqContainer Status: buf:(23:9275000-9557000) n=282, MaxSoFar=1465]
..........[67000000 Read-Pairs processed] [Time: 2023-02-10 00:45:00] 
   [GenomeSeqContainer Status: buf:(23:19733000-19970000) n=237, MaxSoFar=1465]
..........[68000000 Read-Pairs processed] [Time: 2023-02-10 00:51:09] 
   [GenomeSeqContainer Status: buf:(23:25368000-25610000) n=242, MaxSoFar=1465]
..........[69000000 Read-Pairs processed] [Time: 2023-02-10 00:56:20] 
   [GenomeSeqContainer Status: buf:(23:31496000-31716000) n=220, MaxSoFar=1465]
..........[70000000 Read-Pairs processed] [Time: 2023-02-10 01:00:19] 
   [GenomeSeqContainer Status: buf:(23:36199000-36408000) n=209, MaxSoFar=1465]
.......Switching to Chromosome: 24 [2023-02-10 01:03:40] ... 
   Skipping chrom "23" in genome fasta...
    found chrom 24 [2023-02-10 01:03:40]
...[71000000 Read-Pairs processed] [Time: 2023-02-10 01:05:28] 
   [GenomeSeqContainer Status: buf:(24:6960000-7562000) n=602, MaxSoFar=1465]
..........[72000000 Read-Pairs processed] [Time: 2023-02-10 01:12:54] 
   [GenomeSeqContainer Status: buf:(24:21467000-21727000) n=260, MaxSoFar=1465]
..........[73000000 Read-Pairs processed] [Time: 2023-02-10 01:16:23] 
   [GenomeSeqContainer Status: buf:(24:37443000-37727000) n=284, MaxSoFar=1472]
....Switching to Chromosome: 25 [2023-02-10 01:17:50] ... 
   Skipping chrom "24" in genome fasta...
    found chrom 25 [2023-02-10 01:17:50]
......[74000000 Read-Pairs processed] [Time: 2023-02-10 01:19:49] 
   [GenomeSeqContainer Status: buf:(25:4439000-4668000) n=229, MaxSoFar=1472]
..........[75000000 Read-Pairs processed] [Time: 2023-02-10 01:29:12] 
   [GenomeSeqContainer Status: buf:(25:19265000-19581000) n=316, MaxSoFar=1472]
..........[76000000 Read-Pairs processed] [Time: 2023-02-10 01:36:20] 
   [GenomeSeqContainer Status: buf:(25:36866000-36898000) n=32, MaxSoFar=1472]
Switching to Chromosome: 2 [2023-02-10 01:36:31] ... 
   Skipping chrom "25" in genome fasta...
    found chrom 2 [2023-02-10 01:36:31]
..........[77000000 Read-Pairs processed] [Time: 2023-02-10 01:39:44] 
   [GenomeSeqContainer Status: buf:(2:11120000-11555000) n=435, MaxSoFar=1472]
..........[78000000 Read-Pairs processed] [Time: 2023-02-10 01:43:07] 
   [GenomeSeqContainer Status: buf:(2:26584000-26965000) n=381, MaxSoFar=1472]
..........[79000000 Read-Pairs processed] [Time: 2023-02-10 01:49:42] 
   [GenomeSeqContainer Status: buf:(2:35398000-35869000) n=471, MaxSoFar=1472]
..........[80000000 Read-Pairs processed] [Time: 2023-02-10 01:54:32] 
   [GenomeSeqContainer Status: buf:(2:45793000-46188000) n=395, MaxSoFar=1472]
..........[81000000 Read-Pairs processed] [Time: 2023-02-10 01:59:51] 
   [GenomeSeqContainer Status: buf:(2:58651000-59172000) n=521, MaxSoFar=1472]
.Switching to Chromosome: 3 [2023-02-10 02:00:11] ... 
   Skipping chrom "2" in genome fasta...
    found chrom 3 [2023-02-10 02:00:11]
.........[82000000 Read-Pairs processed] [Time: 2023-02-10 02:06:42] 
   [GenomeSeqContainer Status: buf:(3:15067000-15516000) n=449, MaxSoFar=1865]
..........[83000000 Read-Pairs processed] [Time: 2023-02-10 02:13:11] 
   [GenomeSeqContainer Status: buf:(3:18241000-18493000) n=252, MaxSoFar=1865]
..........[84000000 Read-Pairs processed] [Time: 2023-02-10 02:18:24] 
   [GenomeSeqContainer Status: buf:(3:23547000-23843000) n=296, MaxSoFar=1865]
..........[85000000 Read-Pairs processed] [Time: 2023-02-10 02:24:00] 
   [GenomeSeqContainer Status: buf:(3:29727000-29936000) n=209, MaxSoFar=1865]
..........[86000000 Read-Pairs processed] [Time: 2023-02-10 02:27:26] 
   [GenomeSeqContainer Status: buf:(3:32315000-32519000) n=204, MaxSoFar=1865]
..........[87000000 Read-Pairs processed] [Time: 2023-02-10 02:30:50] 
   [GenomeSeqContainer Status: buf:(3:39424000-39692000) n=268, MaxSoFar=1865]
..........[88000000 Read-Pairs processed] [Time: 2023-02-10 02:35:53] 
   [GenomeSeqContainer Status: buf:(3:41774000-41907000) n=133, MaxSoFar=1865]
..........[89000000 Read-Pairs processed] [Time: 2023-02-10 02:39:56] 
   [GenomeSeqContainer Status: buf:(3:54636000-55198000) n=562, MaxSoFar=1865]
......Switching to Chromosome: 4 [2023-02-10 02:41:45] ... 
   Skipping chrom "3" in genome fasta...
    found chrom 4 [2023-02-10 02:41:45]
....[90000000 Read-Pairs processed] [Time: 2023-02-10 02:43:11] 
   [GenomeSeqContainer Status: buf:(4:978000-1375000) n=397, MaxSoFar=1865]
..........[91000000 Read-Pairs processed] [Time: 2023-02-10 02:47:18] 
   [GenomeSeqContainer Status: buf:(4:14886000-15084000) n=198, MaxSoFar=1865]
..........[92000000 Read-Pairs processed] [Time: 2023-02-10 02:54:33] 
   [GenomeSeqContainer Status: buf:(4:25830000-26034000) n=204, MaxSoFar=1865]
.......Switching to Chromosome: 5 [2023-02-10 02:58:32] ... 
   Skipping chrom "4" in genome fasta...
    found chrom 5 [2023-02-10 02:58:32]
.
NOTE: Unmatched Read Buffer Size > 800000 [Mem usage:[63GB / 201GB]]
NOTE: Unmatched Read Buffer Size > 1600000 [Mem usage:[70GB / 201GB]]
..[93000000 Read-Pairs processed] [Time: 2023-02-10 03:01:19] 
..........[94000000 Read-Pairs processed] [Time: 2023-02-10 03:04:08] 
   [GenomeSeqContainer Status: buf:(5:817000-1364000) n=547, MaxSoFar=3118]
..........[95000000 Read-Pairs processed] [Time: 2023-02-10 03:25:41] 
   [GenomeSeqContainer Status: buf:(5:817000-1396000) n=579, MaxSoFar=3118]
..........[96000000 Read-Pairs processed] [Time: 2023-02-10 03:29:06] 
   [GenomeSeqContainer Status: buf:(5:817000-1574000) n=757, MaxSoFar=3118]
..........[97000000 Read-Pairs processed] [Time: 2023-02-10 03:33:27] 
   [GenomeSeqContainer Status: buf:(5:4031000-4460000) n=429, MaxSoFar=3118]
..........[98000000 Read-Pairs processed] [Time: 2023-02-10 03:37:52] 
   [GenomeSeqContainer Status: buf:(5:21996000-22198000) n=202, MaxSoFar=3118]
..........[99000000 Read-Pairs processed] [Time: 2023-02-10 03:41:53] 
   [GenomeSeqContainer Status: buf:(5:22763000-23359000) n=596, MaxSoFar=3118]
..........[100000000 Read-Pairs processed] [Time: 2023-02-10 03:45:16] 
   [GenomeSeqContainer Status: buf:(5:28989000-29478000) n=489, MaxSoFar=3118]
..........[101000000 Read-Pairs processed] [Time: 2023-02-10 03:51:01] 
   [GenomeSeqContainer Status: buf:(5:34381000-34562000) n=181, MaxSoFar=3118]
..........[102000000 Read-Pairs processed] [Time: 2023-02-10 03:54:57] 
   [GenomeSeqContainer Status: buf:(5:43072000-43655000) n=583, MaxSoFar=3118]
..........[103000000 Read-Pairs processed] [Time: 2023-02-10 04:00:01] 
   [GenomeSeqContainer Status: buf:(5:58171000-58420000) n=249, MaxSoFar=3118]
.......Switching to Chromosome: 6 [2023-02-10 04:04:12] ... 
   Skipping chrom "5" in genome fasta...
    found chrom 6 [2023-02-10 04:04:12]
...[104000000 Read-Pairs processed] [Time: 2023-02-10 04:05:08] 
   [GenomeSeqContainer Status: buf:(6:4735000-5174000) n=439, MaxSoFar=3118]
..........[105000000 Read-Pairs processed] [Time: 2023-02-10 04:08:28] 
   [GenomeSeqContainer Status: buf:(6:9770000-10252000) n=482, MaxSoFar=3118]
..........[106000000 Read-Pairs processed] [Time: 2023-02-10 04:12:48] 
   [GenomeSeqContainer Status: buf:(6:21883000-22136000) n=253, MaxSoFar=3118]
..........[107000000 Read-Pairs processed] [Time: 2023-02-10 04:20:59] 
   [GenomeSeqContainer Status: buf:(6:37348000-37584000) n=236, MaxSoFar=3118]
..........[108000000 Read-Pairs processed] [Time: 2023-02-10 04:28:39] 
   [GenomeSeqContainer Status: buf:(6:49515000-50650000) n=1135, MaxSoFar=3118]
........Switching to Chromosome: 7 [2023-02-10 04:36:25] ... 
   Skipping chrom "6" in genome fasta...
    found chrom 7 [2023-02-10 04:36:25]
..[109000000 Read-Pairs processed] [Time: 2023-02-10 04:36:55] 
   [GenomeSeqContainer Status: buf:(7:3976000-5047000) n=1071, MaxSoFar=3118]
..........[110000000 Read-Pairs processed] [Time: 2023-02-10 04:40:18] 
   [GenomeSeqContainer Status: buf:(7:21605000-21862000) n=257, MaxSoFar=3118]
..........[111000000 Read-Pairs processed] [Time: 2023-02-10 04:43:45] 
   [GenomeSeqContainer Status: buf:(7:29671000-30221000) n=550, MaxSoFar=3118]
..........[112000000 Read-Pairs processed] [Time: 2023-02-10 04:52:26] 
   [GenomeSeqContainer Status: buf:(7:38415000-38802000) n=387, MaxSoFar=3118]
..........[113000000 Read-Pairs processed] [Time: 2023-02-10 04:57:05] 
   [GenomeSeqContainer Status: buf:(7:41515000-41721000) n=206, MaxSoFar=3118]
..........[114000000 Read-Pairs processed] [Time: 2023-02-10 05:07:00] 
   [GenomeSeqContainer Status: buf:(7:54016000-54330000) n=314, MaxSoFar=3118]
..........[115000000 Read-Pairs processed] [Time: 2023-02-10 05:14:26] 
   [GenomeSeqContainer Status: buf:(7:64551000-65108000) n=557, MaxSoFar=3118]
......Switching to Chromosome: 8 [2023-02-10 05:20:02] ... 
   Skipping chrom "7" in genome fasta...
    found chrom 8 [2023-02-10 05:20:02]
....[116000000 Read-Pairs processed] [Time: 2023-02-10 05:22:24] 
   [GenomeSeqContainer Status: buf:(8:2433000-2891000) n=458, MaxSoFar=3118]
..........[117000000 Read-Pairs processed] [Time: 2023-02-10 05:39:28] 
   [GenomeSeqContainer Status: buf:(8:21053000-21221000) n=168, MaxSoFar=3118]
..........[118000000 Read-Pairs processed] [Time: 2023-02-10 05:46:48] 
   [GenomeSeqContainer Status: buf:(8:31349000-31963000) n=614, MaxSoFar=3118]
..........[119000000 Read-Pairs processed] [Time: 2023-02-10 05:52:30] 
   [GenomeSeqContainer Status: buf:(8:48975000-49194000) n=219, MaxSoFar=3118]
......Switching to Chromosome: 9 [2023-02-10 05:54:29] ... 
   Skipping chrom "8" in genome fasta...
    found chrom 9 [2023-02-10 05:54:29]
....[120000000 Read-Pairs processed] [Time: 2023-02-10 05:55:49] 
   [GenomeSeqContainer Status: buf:(9:6384000-6712000) n=328, MaxSoFar=3118]
..........[121000000 Read-Pairs processed] [Time: 2023-02-10 05:59:18] 
   [GenomeSeqContainer Status: buf:(9:18211000-18333000) n=122, MaxSoFar=3118]
..........[122000000 Read-Pairs processed] [Time: 2023-02-10 06:07:33] 
   [GenomeSeqContainer Status: buf:(9:33504000-33753000) n=249, MaxSoFar=3118]
..........[123000000 Read-Pairs processed] [Time: 2023-02-10 06:16:16] 
   [GenomeSeqContainer Status: buf:(9:48878000-49143000) n=265, MaxSoFar=3118]
.......Switching to Chromosome: MT [2023-02-10 06:21:26] ... 
   Skipping chrom "9" in genome fasta...
    found chrom MT [2023-02-10 06:21:26]
NOTE: Unmatched Read Buffer Size > 3200000 [Mem usage:[79GB / 195GB]]
NOTE: Unmatched Read Buffer Size > 6400000 [Mem usage:[93GB / 195GB]]
NOTE: Unmatched Read Buffer Size > 12800000 [Mem usage:[117GB / 195GB]]
NOTE: Unmatched Read Buffer Size > 25600000 [Mem usage:[169GB / 195GB]]
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00002b6810b00000, 524288, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 524288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /crex/proj/project/nobackup/nbis/data/processed/zumis/qorts/hs_err_pid1748.log
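To make the buffer growth in the log above easier to follow, the warning lines can be pulled out and tabulated. The following is a minimal sketch, assuming the log keeps exactly the `NOTE: ... Buffer Size > N [Mem usage:[X / Y]]` wording shown in this run; the function names `buffer_notes` and `summarize` are hypothetical helpers and not part of QoRTs.

```python
import re

# Matches the two kinds of buffer warnings that appear in the run output, e.g.
#   NOTE: Unmatched Read Buffer Size > 100000 [Mem usage:[92GB / 94GB]]
#   NOTE: Unsorted Read-PAIR-Buffer Size > 100000 [Mem usage:[8GB / 34GB]]
NOTE_RE = re.compile(
    r"NOTE: (?P<buffer>Unmatched Read Buffer|Unsorted Read-PAIR-Buffer) "
    r"Size > (?P<size>\d+) "
    r"\[Mem usage:\[(?P<used>\d+)(?P<used_unit>[MG]B) / "
    r"(?P<total>\d+)(?P<total_unit>[MG]B)\]\]"
)


def to_gb(value, unit):
    """Convert a logged memory figure to GB (the log uses MB early on, GB later)."""
    return int(value) / 1024 if unit == "MB" else float(value)


def buffer_notes(log_lines):
    """Yield (buffer_name, threshold, used_gb, total_gb) for each buffer warning."""
    for line in log_lines:
        m = NOTE_RE.search(line)
        if m:
            yield (
                m["buffer"],
                int(m["size"]),
                to_gb(m["used"], m["used_unit"]),
                to_gb(m["total"], m["total_unit"]),
            )


def summarize(path):
    """Print one row per warning found in the log file at `path` (hypothetical helper)."""
    with open(path) as handle:
        for name, size, used, total in buffer_notes(handle):
            print(f"{name}\t>{size}\t{used:.0f}GB used of {total:.0f}GB")
```

Calling `summarize` on the QC log would list each threshold crossing in order. In this run the last four rows all fall after the switch to chromosome MT, where the unmatched read buffer passes 3,200,000, 6,400,000, 12,800,000 and 25,600,000 entries without any read-pair progress line in between.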
BAM preview
A01901:60:H37HJDRX2:1:1273:24858:4899	163	10	729	3	17M95932N101M	=	152190	151507	CACACACACACACAGAGACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACAGACACACA	FFFFFFFFFFFFFF:FFFFFFFFFF:FFFFF:F,F,F,FFFFFFFFF:FFFFF,FFFFFFFFFFFF,F::FFFFFFFFFF,FFFFFF:FF,F:F,:,,FF,FFFF,F,:F,:FFF:F:	NH:i:2	HI:i:1	AS:i:156	nM:i:2	BX:Z:TGTATCCGAACCATGTTGCA	BC:Z:TGTATCCGAACCATGTTGCA	QB:Z:FFFFFFFFFF:FFFFFFFFF	QU:Z:FFFFFFFF	ES:Z:Unassigned_NoFeatures	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000086075	UX:Z:GCAGAACC	UB:Z:GCAGAACC
A01901:60:H37HJDRX2:2:2241:28673:9251	163	10	733	255	13M94039N105M	=	190529	189832	CACACACACAGAGACACACGCACGCACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACGCAGGCACGCACACACAAAATCAGACA	FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,FFFFFFFFFFFFFFFFFF,F,,FFFFFF,,F,,FF,,,,,,,:F,,:F:F,:,:F,,,:	NH:i:1	HI:i:1	AS:i:134	nM:i:8	BX:Z:TTCGTTGTACTTCACCTGTG	BC:Z:TTCGTTGTACTTCACCTGTG	QB:Z:FFFFFFFFFFFFFFFFFFFF	QU:Z:	ES:Z:Assigned3	EN:i:1	GE:Z:ENSDARG00000103980	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000086075	UX:Z:	UB:Z:
A01901:60:H37HJDRX2:2:1101:30047:9721	163	10	735	3	11M95900N107M	=	152192	151501	CACACACAGAGACACACACACACACACACACACACACACACACACACACACACACACACAGACACACACACACACACACACACACACACACACACACACACACACACACACACACACA	FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFF,F,FFFFFFFFFFF,FFFF:FFFF,FFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFF	NH:i:2	HI:i:1	AS:i:156	nM:i:1	BX:Z:TGTATCCGAACCATGTTGCA	BC:Z:TGTATCCGAACCATGTTGCA	QB:Z:FFFFFFFFFFFFFFFFFFFF	QU:Z:FFFFFFFF	ES:Z:Unassigned_NoFeatures	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000086075	UX:Z:GCAGAACC	UB:Z:GCAGAACC
A01901:60:H37HJDRX2:2:2259:16459:10864	163	10	735	3	11M95896N107M	=	152188	151501	CACACACAGAGACACACACACACACACACACACACACACACACACACACACACACACACACACAGACACACACACACACACACACACACACACACACACACACACACACACACACACA	FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F,F::FFFFFFFF,F:FFFFFFF:F,FFFFFFFFFFFFFFFF:,FFFFFFF:FFF	NH:i:2	HI:i:1	AS:i:160	nM:i:1	BX:Z:TGTATCCGAACCATGTTGCA	BC:Z:TGTATCCGAACCATGTTGCA	QB:Z:FFFFFFFFFFFFFFFFFFFF	QU:Z:FFFFFFFF	ES:Z:Unassigned_NoFeatures	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000086075	UX:Z:GCAGAACC	UB:Z:GCAGAACC
A01901:60:H37HJDRX2:2:1115:18511:3583	163	10	863	255	76M42S	=	82408	81776	CTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGGGTCATCTGGCGGTGTGTGTTCTGAGTTGTCTGCAGCGCAGCAGG	FFFFF,FFF:F,FFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFF,,:,F,,:,,F,:FF:FF:,F:FF,F,::FFF,,,,,,F,,FF,,	NH:i:1	HI:i:1	AS:i:155	nM:i:1	BX:Z:GGTCGTGATTTTGGTCAGTT	BC:Z:GGTCGTGATTTTGGTCAGTT	QB:Z:FF:FF:FFFF,FFFFFFFFF	QU:Z:	ES:Z:Assigned3	EN:i:1	GE:Z:ENSDARG00000086075	IS:Z:Unassigned_NoFeatures	UX:Z:	UB:Z:
A01901:60:H37HJDRX2:2:2271:22001:14857	99	10	863	255	17S68M	=	82624	82022	CTGTCACAGTGGTGTCACTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT	FFFFFFF:FFFF:FFFF:FFFF,F,FFF,FFF,:F:,FFFFFFFFF,F:FFFFFFFFFFF:F:F:F,FFFFF,FFF,F:FFFF:,	NH:i:1	HI:i:1	AS:i:184	nM:i:0	BX:Z:GAGCGCCTATTACGTAATCG	BC:Z:GAGCGCCTATTACGTAATCG	QB:Z:FFFFFFFFFFF,FFFFFFFF	QU:Z:	ES:Z:Assigned3	EN:i:1	GE:Z:ENSDARG00000086075	IS:Z:Unassigned_NoFeatures	UX:Z:	UB:Z:
A01901:60:H37HJDRX2:1:2106:9082:32346	99	10	887	255	1S84M	=	82308	81687	GGTGTCACTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGA	FFFFFFFFFFFFFFFF:F,FFFFFF:FFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFFFFFFFFF,,,,	NH:i:1	HI:i:1	AS:i:192	nM:i:4	BX:Z:GGTCGTGATTTTGGTCAGTT	BC:Z:GGTCGTGATTTTGGTCAGTT	QB:Z:FFFFF,F:FF:FFFFF,FFF	QU:Z:	ES:Z:Assigned3	EN:i:1	GE:Z:ENSDARG00000086075	IS:Z:Unassigned_NoFeatures	UX:Z:	UB:Z:
A01901:60:H37HJDRX2:1:2208:11731:15515	99	10	887	255	85M	=	82845	82226	GTGTCACTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGAG	FFFFFF,FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFF:FFFFF:,,F,F,	NH:i:1	HI:i:1	AS:i:186	nM:i:7	BX:Z:GGTCGTGATTTTGGTCAGTT	BC:Z:GGTCGTGATTTTGGTCAGTT	QB:Z:FFFFF:F::FFFFFFF,FFF	QU:Z:	ES:Z:Assigned3	EN:i:1	GE:Z:ENSDARG00000086075	IS:Z:Unassigned_NoFeatures	UX:Z:	UB:Z:
A01901:60:H37HJDRX2:2:2234:23140:32142	163	10	893	3	79M333463N23M16S	=	391754	390923	GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGAGTGTCTCACACACACACACAGAAAAAAATCTCTCAAAAAA	FFFFFFF:FFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,FFFF,FFFF:FFFFF:F:,F:F,,:,,:,::,,,,:F,F,F,,,:,F,FFF,,,,,,,F,:F,	NH:i:2	HI:i:1	AS:i:151	nM:i:0	BX:Z:CTATAACCGTTTGGTTCCAA	BC:Z:CTATAACCGTTTGGTTCCAA	QB:Z:FFFFFFFFFFFFFFFFFFFF	QU:Z:FFFFFFFF	ES:Z:Unassigned_NoFeatures	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000087585	UX:Z:AGGGAGGC	UB:Z:AGGGAGGC
A01901:60:H37HJDRX2:1:1250:10050:26882	99	10	913	255	63M	=	217626	216786	GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGAGTGAT	F,FFFFF:FFFFFFFFF:FFFFFFFFF,FFFFF,FFFFFFFFFFFFF:F:FF,F::F,F,:,,	NH:i:1	HI:i:1	AS:i:124	nM:i:4	BX:Z:CCACTTCCATCCGTTAACAA	BC:Z:CCACTTCCATCCGTTAACAA	QB:Z:FFFF:FFFFFFFFFFFFFFF	QU:Z:FFFFFFFF	ES:Z:Unassigned_NoFeatures	IS:Z:Assigned3	IN:i:1	GI:Z:ENSDARG00000059048	UX:Z:GTTGGCTG	UB:Z:GTTGGCTG
@hartleys
Owner

This happens when you have either (a) a large number of read-pairs that are extremely long distances apart, or (b) EXTREMELY high read density. Basically, as it parses the BAM file it keeps a running buffer of the read pairs that have not yet been matched. This works fine up until you have a huge number of reads in between any one read and its mate.
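
To make the buffering behaviour concrete, here is a minimal sketch of a mate-matching buffer. It is only an illustration of the idea described above, not QoRTs's actual implementation; the function name `peak_unmatched_buffer` and the toy read names are made up for the example.

```python
def peak_unmatched_buffer(read_names):
    """Return the largest number of reads held while waiting for their mates.

    `read_names` is the sequence of read names in file order. In a clean
    paired-end file each name appears twice (once per mate).
    """
    waiting = set()  # names seen once whose mate has not arrived yet
    peak = 0
    for name in read_names:
        if name in waiting:
            waiting.remove(name)  # mate found: the pair can be processed
        else:
            waiting.add(name)  # first mate: hold it until the second shows up
            peak = max(peak, len(waiting))
    return peak


# Mates adjacent in the file: the buffer never holds more than one read.
adjacent = [f"r{i}" for i in range(1000) for _ in range(2)]
# All first mates, then all second mates: every read has to wait.
distant = [f"r{i}" for i in range(1000)] * 2
# Orphans (mate missing from the file) are never released.
orphans = [f"r{i}" for i in range(1000)]

print(peak_unmatched_buffer(adjacent))  # 1
print(peak_unmatched_buffer(distant))   # 1000
print(peak_unmatched_buffer(orphans))   # 1000
```

The peak depends on how many reads sit between a read and its mate, not on the total file size, which is why a region with very dense coverage or many far-apart mates can exhaust memory on its own.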

It looks like it does OK right up to the very end, where it suddenly can't find any matches and just keeps loading more reads. What does the end of your BAM file look like? Do you have unmapped reads there, or a huge number of reads that map to loose contigs or something? Do you have enormous numbers of reads on the MT chromosome? Are you studying a tissue with a ton of mitochondrial expression, maybe?
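
One way to answer these questions is to tally alignments per reference sequence. The sketch below does that on SAM text (for example, the output of `samtools view sample.bam`); the function name `reads_per_reference` and the three toy records are invented for illustration.

```python
from collections import Counter


def reads_per_reference(sam_lines):
    """Count alignment records per reference name (SAM column 3, RNAME)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line, not an alignment record
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:  # skip blank or truncated lines
            continue
        counts[fields[2]] += 1  # unmapped reads are reported with RNAME "*"
    return counts


# Toy input standing in for real SAM text.
example = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "a\t99\t10\t100\t255\t4M\t=\t300\t204\tACGT\tFFFF\n",
    "b\t99\tMT\t264\t255\t4M\t=\t400\t140\tACGT\tFFFF\n",
    "b\t147\tMT\t400\t255\t4M\t=\t264\t-140\tACGT\tFFFF\n",
]
for ref, n in reads_per_reference(example).most_common():
    print(f"{ref}\t{n}")
```

A very large count for `MT`, for `*`, or for unplaced contigs relative to the main chromosomes would point at the part of the file where the buffer is likely to blow up.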

I'm not terribly surprised that downsampling causes the same issue. If you randomly downsample without making sure to keep paired reads matched up, then basically all your reads become pairless and QoRTs will try to read the entire uncompressed file into memory trying to find the missing reads.
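
The difference between per-read and per-pair downsampling can be sketched as follows. This is a toy model with made-up helper names; it does not reproduce how any particular tool samples. The pair-aware variant decides by hashing the read name, so both mates always get the same decision (`samtools view -s` is documented to act on whole templates in a similar spirit).

```python
import random
import zlib
from collections import Counter


def downsample_per_read(records, fraction, seed=0):
    """Keep each record independently: mates are kept or dropped separately."""
    rng = random.Random(seed)
    return [r for r in records if rng.random() < fraction]


def downsample_per_pair(records, fraction):
    """Keep or drop by a hash of the read name, so mates stay together."""
    threshold = int(fraction * 2**32)
    return [r for r in records if zlib.crc32(r[0].encode()) < threshold]


def count_orphans(records):
    """Number of read names that appear exactly once (mate is missing)."""
    counts = Counter(name for name, _ in records)
    return sum(1 for c in counts.values() if c == 1)


# 10,000 toy pairs, each record is (read name, mate number).
records = [(f"r{i}", mate) for i in range(10000) for mate in (1, 2)]

print("orphans, per-read sampling:", count_orphans(downsample_per_read(records, 0.3)))
print("orphans, per-pair sampling:", count_orphans(downsample_per_pair(records, 0.3)))
```

With per-read sampling a large share of the surviving reads lose their mate, and each of those would sit in the unmatched buffer until the end of the file.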

What happens if you feed it one complete chromosome? So like:

samtools view -b -h sample.bam 1 > sample.chr1.bam

Or maybe even handing it everything except MT?
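
If you do drop MT, it may be worth dropping pairs rather than single records: a read on another chromosome whose mate maps to MT would otherwise be left without a partner. A rough sketch on SAM text, assuming the reference is literally named `MT` as in the log; the function name `drop_reference` is hypothetical, and the `@SQ` header entry for MT is left in place.

```python
def drop_reference(sam_lines, ref="MT"):
    """Yield SAM lines, skipping records where the read or its mate is on `ref`."""
    for line in sam_lines:
        if line.startswith("@"):  # pass header lines through unchanged
            yield line
            continue
        fields = line.split("\t")
        rname, rnext = fields[2], fields[6]  # RNAME and RNEXT columns
        if rnext == "=":  # "=" means the mate is on the same reference
            rnext = rname
        if rname == ref or rnext == ref:
            continue
        yield line


# Toy input standing in for the output of `samtools view -h`.
example = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "a\t99\t10\t100\t255\t4M\t=\t300\t204\tACGT\tFFFF\n",   # kept
    "b\t99\t9\t100\t255\t4M\tMT\t300\t0\tACGT\tFFFF\n",     # mate on MT: dropped
    "c\t99\tMT\t100\t255\t4M\t=\t300\t204\tACGT\tFFFF\n",   # on MT: dropped
]
kept = list(drop_reference(example))
print(len(kept))  # 2
```

The filtered text would still need to be converted back to BAM before handing it to QoRTs.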

@royfrancis
Author

royfrancis commented Feb 17, 2023

Thank you for your reply and insights. I don't really know much about this BAM (where the reads are mapping, etc.), which is why I am running QC on it :D

The randomly downsampled BAM with 125M read pairs finally did manage to complete when given 512GB of RAM o_O

[Image: snowy-snic2022-22-328-royfranc-7319726-1]

Resource usage during the run.

Complete run output
Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Thu Feb 16 10:58:06 CET 2023)
INPUT_COMMAND(QC)
  INPUT_ARG(infile)=sample-sub.bam
  INPUT_ARG(gtffile)=/crex/proj/project/nobackup/nbis/data/processed/zumis/03dpf/03dpf.final_annot.gtf
  INPUT_ARG(outdir)=sample-sub-qorts
  INPUT_ARG(genomeFA)=Some(List(/crex/proj/project/nobackup/nbis/data/reference/grcz10-custom/genome.fa))
  INPUT_ARG(flatgfffile)=Some(/crex/proj/project/nobackup/nbis/data/processed/zumis/qorts/genes-flat.gff)
  INPUT_ARG(isRNASeq)=true
  INPUT_ARG(noGzipOutput)=true
  INPUT_ARG(verbose)=true
  INPUT_ARG(maxReadLength)=Some(125)
Created Log File: sample-sub-qorts/QC.FTnRrt5rbVMr.log
Warning: run-in-progress file "sample-sub-qorts/QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2023-02-16 10:58:06] [Mem usage: [75MB / 2058MB]] [Elapsed Time: 00:00:00.0000]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
Parameter --genomeFA found. Adding reference mismatch testing.
NOTE: Function "overlapMatch" requires function "mismatchEngine". Adding "mismatchEngine" to the active function list...
Running functions: CigarOpDistribution, GCDistribution, GeneCalcs, InsertSize, 
        JunctionCalcs, NVC, QualityScoreDistribution, StrandCheck, 
        chromCounts, cigarLocusCounts, mismatchEngine, overlapMatch, 
        readLengthDistro, referenceMatch, writeBiotypeCounts, 
        writeClippedNVC, writeDESeq, writeDEXSeq, writeGeneBody, 
        writeGeneCounts, writeGenewiseGeneBody, 
        writeJunctionSeqCounts, writeKnownSplices, 
        writeNovelSplices, writeSpliceExon
Checking first 10000 reads. Checking SAM file for formatting errors...
   Stats on the first 10000 reads:
        Num Reads Primary Map:    10000
        Num Reads Paired-ended:   10000
        Num Reads mapped pair:    9995
        Num Pair names found:     5272
        Num Pairs matched:        4723
        Read Seq length:          63 to 118
        Unclipped Read length:    63 to 118
        Final maxReadLength:      125
        maxPhredScore:            37
        minPhredScore:            2
NOTE: Read length is not consistent.
   In the first 10000 reads, read length varies from 63 to 118 (param maxReadLength=125)
Note that using data that is hard-clipped prior to alignment is NOT recommended, because this makes it difficult (or impossible) to determine the sequencer read-cycle of each nucleotide base. This may obfuscate cycle-specific artifacts, trends, or errors, the detection of which is one of the primary purposes of QoRTs!In addition, hard clipping (whether before or after alignment) removes quality score data, and thus quality score metrics may be misleadingly optimistic. A MUCH preferable method of removing undesired sequence is to replace such sequence with N's, which preserves the quality score and the sequencer cycle information.
   Note: Data appears to be paired-ended.
   Sorting Note: Reads are not sorted by name (This is OK).
   Sorting Note: Reads are sorted by position (This is OK).
Done checking first 10000 reads. WARNINGS FOUND!
Starting getSRPairIterResorted...
SAMRecord Reader Generated. Read length: 125.
[Time: 2023-02-16 10:58:11] [Mem usage: [731MB / 2595MB]] [Elapsed Time: 00:00:04.0795]
> Init GeneCalcs Utility
> Init InsertSize Utility
> Init NVC utility
> Init CigarOpDistribution Utility
> Init QualityScoreDistribution Utility
> Init GC counts Utility
> Init JunctionCalcs utility
length of knownSpliceMap after instantiation: 256778
length of knownCountMap after instantiation: 256778
> Init StrandCheck Utility
> Init chromCount Utility
> Init qcCigarLocusCounts Utility
> Init OverlapMatch Utility
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2023-02-16 10:59:49] [Mem usage: [5GB / 12GB]] [Elapsed Time: 00:01:43.0188]
helper_calculateGeneAssignmentMap_strict. Found: 31956 genes in the supplied annotation.
helper_calculateGeneAssignmentMap_strict. Found: 4912 genes with ambiguous segments.
helper_calculateGeneAssignmentMap_strict. Found: 27044 genes after first-pass filtering
making makeGeneIntervalMap for geneBody calculations. Found: 27044 acceptable genes for gene-body analysis.
..........[1000000 Read-Pairs processed] [Time: 2023-02-16 11:04:22] 
   [GenomeSeqContainer Status: buf:(10:22878000-23066000) n=188, MaxSoFar=895]
..
NOTE: Unsorted Read-PAIR-Buffer Size > 100000 [Mem usage:[23GB / 29GB]]
  Currently searching for read: A01901:60:H37HJDRX2:2:2261:24912:3176 for 98140 iterations.  Searching for read: A01901:60:H37HJDRX2:2:2261:24912:3176 10:29443811-29443858 99
  Current unmatched-pair-buffer status: 3898
    (This is generally not a problem, but if this increases further then OutOfMemoryExceptions
    may occur.
    If memory errors do occur, either increase memory allocation or sort the bam-file by name
    and rerun with the '--nameSorted' option.
    This might also indicate that your dataset contains an unusually large number of
    chimeric read-pairs. Or it could occur simply due to the presence of genomic
    loci with extremly high coverage or complex splicing. It may also indicate a SAM/BAM file that 
    does not adhere to the standard SAM specification.)
........[2000000 Read-Pairs processed] [Time: 2023-02-16 11:08:01] 
   [GenomeSeqContainer Status: buf:(10:42536000-43137000) n=601, MaxSoFar=895]
...Switching to Chromosome: 11 [2023-02-16 11:09:37] ... 
   Skipping chrom "10" in genome fasta...
    found chrom 11 [2023-02-16 11:09:37]
.......[3000000 Read-Pairs processed] [Time: 2023-02-16 11:12:16] 
   [GenomeSeqContainer Status: buf:(11:10585000-11164000) n=579, MaxSoFar=895]
..........[4000000 Read-Pairs processed] [Time: 2023-02-16 11:15:47] 
   [GenomeSeqContainer Status: buf:(11:35965000-36266000) n=301, MaxSoFar=895]
.....
NOTE: Unmatched Read Buffer Size > 100000 [Mem usage:[3691MB / 28GB]]
    (This is generally not a problem, but if this increases further then OutOfMemoryExceptions
    may occur.
    If memory errors do occur, either increase memory allocation or sort the bam-file by name
    and rerun with the '--nameSorted' option.
    This might also indicate that your dataset contains an unusually large number of
    chimeric read-pairs. Or it could occur simply due to the presence of genomic
    loci with extremly high coverage. It may also indicate a SAM/BAM file that 
    does not adhere to the standard SAM specification.)
NOTE: Unsorted Read-PAIR-Buffer Size > 200000 [Mem usage:[4692MB / 28GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:1117:7473:19633 for 167820 iterations.  Searching for read: A01901:60:H37HJDRX2:1:1117:7473:19633 11:44043229-44043346 163
  Current unmatched-pair-buffer status: 68944
...Switching to Chromosome: 12 [2023-02-16 11:18:56] ... 
   Skipping chrom "11" in genome fasta...
    found chrom 12 [2023-02-16 11:18:56]
..[5000000 Read-Pairs processed] [Time: 2023-02-16 11:19:24] 
   [GenomeSeqContainer Status: buf:(12:3273000-3637000) n=364, MaxSoFar=895]
..........[6000000 Read-Pairs processed] [Time: 2023-02-16 11:23:09] 
   [GenomeSeqContainer Status: buf:(12:26537000-26792000) n=255, MaxSoFar=1015]
.........Switching to Chromosome: 13 [2023-02-16 11:26:24] ... 
   Skipping chrom "12" in genome fasta...
    found chrom 13 [2023-02-16 11:26:24]
.[7000000 Read-Pairs processed] [Time: 2023-02-16 11:26:38] 
   [GenomeSeqContainer Status: buf:(13:353000-570000) n=217, MaxSoFar=1015]
..........[8000000 Read-Pairs processed] [Time: 2023-02-16 11:30:10] 
   [GenomeSeqContainer Status: buf:(13:28167000-28222000) n=55, MaxSoFar=1015]
........Switching to Chromosome: 14 [2023-02-16 11:33:14] ... 
   Skipping chrom "13" in genome fasta...
    found chrom 14 [2023-02-16 11:33:14]
..[9000000 Read-Pairs processed] [Time: 2023-02-16 11:33:37] 
   [GenomeSeqContainer Status: buf:(14:2038000-2676000) n=638, MaxSoFar=1015]
..........[10000000 Read-Pairs processed] [Time: 2023-02-16 11:37:16] 
   [GenomeSeqContainer Status: buf:(14:20852000-20942000) n=90, MaxSoFar=1015]
..........[11000000 Read-Pairs processed] [Time: 2023-02-16 11:40:50] 
   [GenomeSeqContainer Status: buf:(14:32694000-33088000) n=394, MaxSoFar=1015]
......
NOTE: Unmatched Read Buffer Size > 200000 [Mem usage:[7GB / 40GB]]
NOTE: Unsorted Read-PAIR-Buffer Size > 400000 [Mem usage:[8GB / 40GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 for 374755 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 14:46035424-46647919 163
  Current unmatched-pair-buffer status: 363660
NOTE: Unsorted Read-PAIR-Buffer Size > 800000 [Mem usage:[12GB / 40GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 for 774755 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2204:10384:7952 14:46035424-46647919 163
  Current unmatched-pair-buffer status: 338363
NOTE: Unsorted Read-PAIR-Buffer Size > 1600000 [Mem usage:[17GB / 40GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 for 687930 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2109:23194:2895 14:46090552-46651262 163
  Current unmatched-pair-buffer status: 50659
NOTE: Unmatched Read Buffer Size > 400000 [Mem usage:[20GB / 40GB]]
....[12000000 Read-Pairs processed] [Time: 2023-02-16 11:45:16] 
   [GenomeSeqContainer Status: buf:(14:46637000-47216000) n=579, MaxSoFar=1143]
..........[13000000 Read-Pairs processed] [Time: 2023-02-16 11:48:54] 
   [GenomeSeqContainer Status: buf:(14:46638000-47216000) n=578, MaxSoFar=1143]
..........[14000000 Read-Pairs processed] [Time: 2023-02-16 11:52:26] 
   [GenomeSeqContainer Status: buf:(14:46641000-47216000) n=575, MaxSoFar=1143]
...Switching to Chromosome: 15 [2023-02-16 11:54:04] ... 
   Skipping chrom "14" in genome fasta...
    found chrom 15 [2023-02-16 11:54:04]
.......[15000000 Read-Pairs processed] [Time: 2023-02-16 11:56:22] 
   [GenomeSeqContainer Status: buf:(15:20530000-20553000) n=23, MaxSoFar=1143]
..........[16000000 Read-Pairs processed] [Time: 2023-02-16 12:00:12] 
   [GenomeSeqContainer Status: buf:(15:41730000-42114000) n=384, MaxSoFar=1143]
...Switching to Chromosome: 16 [2023-02-16 12:01:17] ... 
   Skipping chrom "15" in genome fasta...
    found chrom 16 [2023-02-16 12:01:17]
.......[17000000 Read-Pairs processed] [Time: 2023-02-16 12:04:00] 
   [GenomeSeqContainer Status: buf:(16:6871000-7286000) n=415, MaxSoFar=1280]
..........[18000000 Read-Pairs processed] [Time: 2023-02-16 12:07:56] 
   [GenomeSeqContainer Status: buf:(16:24269000-24536000) n=267, MaxSoFar=1280]
..........[19000000 Read-Pairs processed] [Time: 2023-02-16 12:11:46] 
   [GenomeSeqContainer Status: buf:(16:38980000-39499000) n=519, MaxSoFar=1280]
.........Switching to Chromosome: 17 [2023-02-16 12:15:34] ... 
   Skipping chrom "16" in genome fasta...
    found chrom 17 [2023-02-16 12:15:34]
.[20000000 Read-Pairs processed] [Time: 2023-02-16 12:15:44] 
   [GenomeSeqContainer Status: buf:(17:107000-937000) n=830, MaxSoFar=1280]
..........[21000000 Read-Pairs processed] [Time: 2023-02-16 12:19:39] 
..........[22000000 Read-Pairs processed] [Time: 2023-02-16 12:23:23] 
   [GenomeSeqContainer Status: buf:(17:49873000-50262000) n=389, MaxSoFar=1280]
.Switching to Chromosome: 18 [2023-02-16 12:23:56] ... 
   Skipping chrom "17" in genome fasta...
    found chrom 18 [2023-02-16 12:23:56]
.........[23000000 Read-Pairs processed] [Time: 2023-02-16 12:27:16] 
   [GenomeSeqContainer Status: buf:(18:16146000-16736000) n=590, MaxSoFar=1280]
..........[24000000 Read-Pairs processed] [Time: 2023-02-16 12:31:00] 
   [GenomeSeqContainer Status: buf:(18:44876000-45135000) n=259, MaxSoFar=1280]
...Switching to Chromosome: 19 [2023-02-16 12:32:11] ... 
   Skipping chrom "18" in genome fasta...
    found chrom 19 [2023-02-16 12:32:11]
.......[25000000 Read-Pairs processed] [Time: 2023-02-16 12:34:53] 
   [GenomeSeqContainer Status: buf:(19:8818000-9129000) n=311, MaxSoFar=1280]
..........[26000000 Read-Pairs processed] [Time: 2023-02-16 12:38:59] 
   [GenomeSeqContainer Status: buf:(19:22220000-22533000) n=313, MaxSoFar=1280]
..........[27000000 Read-Pairs processed] [Time: 2023-02-16 12:43:04] 
   [GenomeSeqContainer Status: buf:(19:28689000-28946000) n=257, MaxSoFar=1280]
..........[28000000 Read-Pairs processed] [Time: 2023-02-16 12:47:01] 
   [GenomeSeqContainer Status: buf:(19:43513000-43858000) n=345, MaxSoFar=1280]
..........[29000000 Read-Pairs processed] [Time: 2023-02-16 12:50:57] 
   [GenomeSeqContainer Status: buf:(19:48449000-48779000) n=330, MaxSoFar=1280]
Switching to Chromosome: 1 [2023-02-16 12:51:17] ... 
   Skipping chrom "19" in genome fasta...
    found chrom 1 [2023-02-16 12:51:17]
..........[30000000 Read-Pairs processed] [Time: 2023-02-16 12:54:42] 
   [GenomeSeqContainer Status: buf:(1:11639000-11916000) n=277, MaxSoFar=1280]
..........[31000000 Read-Pairs processed] [Time: 2023-02-16 12:58:30] 
   [GenomeSeqContainer Status: buf:(1:37922000-38207000) n=285, MaxSoFar=1280]
..........[32000000 Read-Pairs processed] [Time: 2023-02-16 13:02:19] 
   [GenomeSeqContainer Status: buf:(1:54481000-54717000) n=236, MaxSoFar=1280]
.Switching to Chromosome: 20 [2023-02-16 13:02:52] ... 
   Skipping chrom "1" in genome fasta...
    found chrom 20 [2023-02-16 13:02:52]
.........[33000000 Read-Pairs processed] [Time: 2023-02-16 13:06:07] 
   [GenomeSeqContainer Status: buf:(20:21020000-21880000) n=860, MaxSoFar=1280]
..........[34000000 Read-Pairs processed] [Time: 2023-02-16 13:09:55] 
   [GenomeSeqContainer Status: buf:(20:43877000-44381000) n=504, MaxSoFar=1280]
........Switching to Chromosome: 21 [2023-02-16 13:13:04] ... 
   Skipping chrom "20" in genome fasta...
    found chrom 21 [2023-02-16 13:13:04]
..[35000000 Read-Pairs processed] [Time: 2023-02-16 13:13:27] 
   [GenomeSeqContainer Status: buf:(21:1313000-2726000) n=1413, MaxSoFar=1465]
..........[36000000 Read-Pairs processed] [Time: 2023-02-16 13:17:18] 
   [GenomeSeqContainer Status: buf:(21:22275000-22543000) n=268, MaxSoFar=1465]
..........[37000000 Read-Pairs processed] [Time: 2023-02-16 13:21:07] 
   [GenomeSeqContainer Status: buf:(21:39576000-39784000) n=208, MaxSoFar=1465]
...Switching to Chromosome: 22 [2023-02-16 13:22:26] ... 
   Skipping chrom "21" in genome fasta...
    found chrom 22 [2023-02-16 13:22:26]
.......[38000000 Read-Pairs processed] [Time: 2023-02-16 13:24:55] 
   [GenomeSeqContainer Status: buf:(22:10363000-10569000) n=206, MaxSoFar=1465]
..........[39000000 Read-Pairs processed] [Time: 2023-02-16 13:28:48] 
   [GenomeSeqContainer Status: buf:(22:31702000-32081000) n=379, MaxSoFar=1465]
..Switching to Chromosome: 23 [2023-02-16 13:29:42] ... 
   Skipping chrom "22" in genome fasta...
    found chrom 23 [2023-02-16 13:29:42]
........[40000000 Read-Pairs processed] [Time: 2023-02-16 13:32:38] 
   [GenomeSeqContainer Status: buf:(23:17493000-17920000) n=427, MaxSoFar=1465]
..........[41000000 Read-Pairs processed] [Time: 2023-02-16 13:36:30] 
   [GenomeSeqContainer Status: buf:(23:26604000-26866000) n=262, MaxSoFar=1465]
..........[42000000 Read-Pairs processed] [Time: 2023-02-16 13:40:03] 
   [GenomeSeqContainer Status: buf:(23:36199000-36408000) n=209, MaxSoFar=1465]
....Switching to Chromosome: 24 [2023-02-16 13:41:46] ... 
   Skipping chrom "23" in genome fasta...
    found chrom 24 [2023-02-16 13:41:46]
......[43000000 Read-Pairs processed] [Time: 2023-02-16 13:43:50] 
   [GenomeSeqContainer Status: buf:(24:18406000-18482000) n=76, MaxSoFar=1465]
..........[44000000 Read-Pairs processed] [Time: 2023-02-16 13:47:45] 
   [GenomeSeqContainer Status: buf:(24:40941000-41557000) n=616, MaxSoFar=1465]
Switching to Chromosome: 25 [2023-02-16 13:47:56] ... 
   Skipping chrom "24" in genome fasta...
    found chrom 25 [2023-02-16 13:47:56]
..........[45000000 Read-Pairs processed] [Time: 2023-02-16 13:51:36] 
   [GenomeSeqContainer Status: buf:(25:19263000-19381000) n=118, MaxSoFar=1465]
......Switching to Chromosome: 2 [2023-02-16 13:53:59] ... 
   Skipping chrom "25" in genome fasta...
    found chrom 2 [2023-02-16 13:53:59]
....[46000000 Read-Pairs processed] [Time: 2023-02-16 13:55:24] 
   [GenomeSeqContainer Status: buf:(2:9982000-10172000) n=190, MaxSoFar=1465]
..........[47000000 Read-Pairs processed] [Time: 2023-02-16 13:59:19] 
   [GenomeSeqContainer Status: buf:(2:26953000-27126000) n=173, MaxSoFar=1465]
..........[48000000 Read-Pairs processed] [Time: 2023-02-16 14:03:09] 
   [GenomeSeqContainer Status: buf:(2:45628000-46188000) n=560, MaxSoFar=1465]
......Switching to Chromosome: 3 [2023-02-16 14:05:36] ... 
   Skipping chrom "2" in genome fasta...
    found chrom 3 [2023-02-16 14:05:36]
....[49000000 Read-Pairs processed] [Time: 2023-02-16 14:06:49] 
   [GenomeSeqContainer Status: buf:(3:7759000-8537000) n=778, MaxSoFar=1865]
..........[50000000 Read-Pairs processed] [Time: 2023-02-16 14:10:53] 
   [GenomeSeqContainer Status: buf:(3:20797000-21282000) n=485, MaxSoFar=1865]
..........[51000000 Read-Pairs processed] [Time: 2023-02-16 14:14:46] 
   [GenomeSeqContainer Status: buf:(3:29727000-29873000) n=146, MaxSoFar=1865]
..........[52000000 Read-Pairs processed] [Time: 2023-02-16 14:18:43] 
   [GenomeSeqContainer Status: buf:(3:35987000-36253000) n=266, MaxSoFar=1865]
..........[53000000 Read-Pairs processed] [Time: 2023-02-16 14:22:27] 
   [GenomeSeqContainer Status: buf:(3:48839000-49061000) n=222, MaxSoFar=1865]
.......Switching to Chromosome: 4 [2023-02-16 14:25:16] ... 
   Skipping chrom "3" in genome fasta...
    found chrom 4 [2023-02-16 14:25:16]
...[54000000 Read-Pairs processed] [Time: 2023-02-16 14:26:10] 
   [GenomeSeqContainer Status: buf:(4:963000-1375000) n=412, MaxSoFar=1865]
..........[55000000 Read-Pairs processed] [Time: 2023-02-16 14:29:58] 
   [GenomeSeqContainer Status: buf:(4:21918000-22076000) n=158, MaxSoFar=1865]
......Switching to Chromosome: 5 [2023-02-16 14:31:51] ... 
   Skipping chrom "4" in genome fasta...
    found chrom 5 [2023-02-16 14:31:51]
NOTE: Unmatched Read Buffer Size > 800000 [Mem usage:[37GB / 170GB]]
....[56000000 Read-Pairs processed] [Time: 2023-02-16 14:32:28] 
   [GenomeSeqContainer Status: buf:(5:816000-1083000) n=267, MaxSoFar=3118]
..........[57000000 Read-Pairs processed] [Time: 2023-02-16 14:35:34] 
   [GenomeSeqContainer Status: buf:(5:817000-1396000) n=579, MaxSoFar=3118]
..........[58000000 Read-Pairs processed] [Time: 2023-02-16 14:38:54] 
   [GenomeSeqContainer Status: buf:(5:1680000-2529000) n=849, MaxSoFar=3118]
..........[59000000 Read-Pairs processed] [Time: 2023-02-16 14:42:53] 
   [GenomeSeqContainer Status: buf:(5:22754000-23242000) n=488, MaxSoFar=3118]
..........[60000000 Read-Pairs processed] [Time: 2023-02-16 14:46:48] 
   [GenomeSeqContainer Status: buf:(5:28892000-29478000) n=586, MaxSoFar=3118]
..........[61000000 Read-Pairs processed] [Time: 2023-02-16 14:50:39] 
   [GenomeSeqContainer Status: buf:(5:38777000-39137000) n=360, MaxSoFar=3118]
..........[62000000 Read-Pairs processed] [Time: 2023-02-16 14:53:59] 
   [GenomeSeqContainer Status: buf:(5:64160000-64448000) n=288, MaxSoFar=3118]
..Switching to Chromosome: 6 [2023-02-16 14:54:54] ... 
   Skipping chrom "5" in genome fasta...
    found chrom 6 [2023-02-16 14:54:54]
........[63000000 Read-Pairs processed] [Time: 2023-02-16 14:57:44] 
   [GenomeSeqContainer Status: buf:(6:9770000-9961000) n=191, MaxSoFar=3118]
..........[64000000 Read-Pairs processed] [Time: 2023-02-16 15:01:36] 
   [GenomeSeqContainer Status: buf:(6:30857000-31275000) n=418, MaxSoFar=3118]
..........[65000000 Read-Pairs processed] [Time: 2023-02-16 15:05:28] 
   [GenomeSeqContainer Status: buf:(6:52225000-52533000) n=308, MaxSoFar=3118]
...Switching to Chromosome: 7 [2023-02-16 15:06:47] ... 
   Skipping chrom "6" in genome fasta...
    found chrom 7 [2023-02-16 15:06:47]
.......[66000000 Read-Pairs processed] [Time: 2023-02-16 15:09:16] 
   [GenomeSeqContainer Status: buf:(7:21487000-21862000) n=375, MaxSoFar=3118]
..........[67000000 Read-Pairs processed] [Time: 2023-02-16 15:13:10] 
   [GenomeSeqContainer Status: buf:(7:37461000-38022000) n=561, MaxSoFar=3118]
..........[68000000 Read-Pairs processed] [Time: 2023-02-16 15:17:13] 
   [GenomeSeqContainer Status: buf:(7:47164000-47369000) n=205, MaxSoFar=3118]
..........[69000000 Read-Pairs processed] [Time: 2023-02-16 15:21:14] 
   [GenomeSeqContainer Status: buf:(7:63686000-64101000) n=415, MaxSoFar=3118]
....Switching to Chromosome: 8 [2023-02-16 15:22:48] ... 
   Skipping chrom "7" in genome fasta...
    found chrom 8 [2023-02-16 15:22:48]
......[70000000 Read-Pairs processed] [Time: 2023-02-16 15:25:02] 
   [GenomeSeqContainer Status: buf:(8:15452000-15916000) n=464, MaxSoFar=3118]
..........[71000000 Read-Pairs processed] [Time: 2023-02-16 15:28:51] 
   [GenomeSeqContainer Status: buf:(8:39721000-40159000) n=438, MaxSoFar=3118]
.......Switching to Chromosome: 9 [2023-02-16 15:31:48] ... 
   Skipping chrom "8" in genome fasta...
    found chrom 9 [2023-02-16 15:31:48]
...[72000000 Read-Pairs processed] [Time: 2023-02-16 15:32:40] 
..........[73000000 Read-Pairs processed] [Time: 2023-02-16 15:36:23] 
   [GenomeSeqContainer Status: buf:(9:30925000-31051000) n=126, MaxSoFar=3118]
..........[74000000 Read-Pairs processed] [Time: 2023-02-16 15:40:20] 
   [GenomeSeqContainer Status: buf:(9:54602000-55141000) n=539, MaxSoFar=3118]
..Switching to Chromosome: MT [2023-02-16 15:41:14] ... 
   Skipping chrom "9" in genome fasta...
    found chrom MT [2023-02-16 15:41:14]
NOTE: Unmatched Read Buffer Size > 1600000 [Mem usage:[112GB / 120GB]]
NOTE: Unmatched Read Buffer Size > 3200000 [Mem usage:[14GB / 144GB]]
NOTE: Unmatched Read Buffer Size > 6400000 [Mem usage:[29GB / 144GB]]
NOTE: Unmatched Read Buffer Size > 12800000 [Mem usage:[54GB / 144GB]]
NOTE: Unsorted Read-PAIR-Buffer Size > 3200000 [Mem usage:[90GB / 144GB]]
  Currently searching for read: A01901:60:H37HJDRX2:2:2122:16514:22482 for 3145621 iterations.  Searching for read: A01901:60:H37HJDRX2:2:2122:16514:22482 MT:264-273 99
  Current unmatched-pair-buffer status: 16488350
NOTE: Unsorted Read-PAIR-Buffer Size > 6400000 [Mem usage:[107GB / 144GB]]
  Currently searching for read: A01901:60:H37HJDRX2:2:2122:16514:22482 for 6345621 iterations.  Searching for read: A01901:60:H37HJDRX2:2:2122:16514:22482 MT:264-273 99
  Current unmatched-pair-buffer status: 14583726
NOTE: Unsorted Read-PAIR-Buffer Size > 12800000 [Mem usage:[102GB / 241GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2165:21938:9706 for 4305562 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2165:21938:9706 MT:304-9811 99
  Current unmatched-pair-buffer status: 10315126
NOTE: Unsorted Read-PAIR-Buffer Size > 25600000 [Mem usage:[168GB / 241GB]]
  Currently searching for read: A01901:60:H37HJDRX2:1:2165:21938:9706 for 17105562 iterations.  Searching for read: A01901:60:H37HJDRX2:1:2165:21938:9706 MT:304-9811 99
  Current unmatched-pair-buffer status: 953517
........[75000000 Read-Pairs processed] [Time: 2023-02-16 16:33:07] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[76000000 Read-Pairs processed] [Time: 2023-02-16 16:36:42] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[77000000 Read-Pairs processed] [Time: 2023-02-16 16:40:28] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[78000000 Read-Pairs processed] [Time: 2023-02-16 16:44:30] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[79000000 Read-Pairs processed] [Time: 2023-02-16 16:48:28] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[80000000 Read-Pairs processed] [Time: 2023-02-16 16:52:01] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[81000000 Read-Pairs processed] [Time: 2023-02-16 16:55:51] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[82000000 Read-Pairs processed] [Time: 2023-02-16 16:59:38] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[83000000 Read-Pairs processed] [Time: 2023-02-16 17:02:51] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[84000000 Read-Pairs processed] [Time: 2023-02-16 17:06:19] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[85000000 Read-Pairs processed] [Time: 2023-02-16 17:09:45] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[86000000 Read-Pairs processed] [Time: 2023-02-16 17:12:50] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[87000000 Read-Pairs processed] [Time: 2023-02-16 17:16:03] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[88000000 Read-Pairs processed] [Time: 2023-02-16 17:19:28] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[89000000 Read-Pairs processed] [Time: 2023-02-16 17:22:43] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[90000000 Read-Pairs processed] [Time: 2023-02-16 17:26:13] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[91000000 Read-Pairs processed] [Time: 2023-02-16 17:29:43] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[92000000 Read-Pairs processed] [Time: 2023-02-16 17:33:28] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[93000000 Read-Pairs processed] [Time: 2023-02-16 17:36:38] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[94000000 Read-Pairs processed] [Time: 2023-02-16 17:40:01] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[95000000 Read-Pairs processed] [Time: 2023-02-16 17:43:42] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[96000000 Read-Pairs processed] [Time: 2023-02-16 17:47:15] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[97000000 Read-Pairs processed] [Time: 2023-02-16 17:50:49] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[98000000 Read-Pairs processed] [Time: 2023-02-16 17:54:29] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[99000000 Read-Pairs processed] [Time: 2023-02-16 17:58:11] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[100000000 Read-Pairs processed] [Time: 2023-02-16 18:02:05] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[101000000 Read-Pairs processed] [Time: 2023-02-16 18:05:32] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[102000000 Read-Pairs processed] [Time: 2023-02-16 18:09:00] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[103000000 Read-Pairs processed] [Time: 2023-02-16 18:12:20] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[104000000 Read-Pairs processed] [Time: 2023-02-16 18:15:41] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[105000000 Read-Pairs processed] [Time: 2023-02-16 18:19:18] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[106000000 Read-Pairs processed] [Time: 2023-02-16 18:22:51] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[107000000 Read-Pairs processed] [Time: 2023-02-16 18:26:28] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[108000000 Read-Pairs processed] [Time: 2023-02-16 18:29:51] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[109000000 Read-Pairs processed] [Time: 2023-02-16 18:33:04] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[110000000 Read-Pairs processed] [Time: 2023-02-16 18:36:24] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[111000000 Read-Pairs processed] [Time: 2023-02-16 18:39:50] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[112000000 Read-Pairs processed] [Time: 2023-02-16 18:43:33] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[113000000 Read-Pairs processed] [Time: 2023-02-16 18:46:49] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[114000000 Read-Pairs processed] [Time: 2023-02-16 18:50:18] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[115000000 Read-Pairs processed] [Time: 2023-02-16 18:53:55] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[116000000 Read-Pairs processed] [Time: 2023-02-16 18:57:02] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[117000000 Read-Pairs processed] [Time: 2023-02-16 19:00:49] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[118000000 Read-Pairs processed] [Time: 2023-02-16 19:04:08] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[119000000 Read-Pairs processed] [Time: 2023-02-16 19:07:09] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[120000000 Read-Pairs processed] [Time: 2023-02-16 19:10:30] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[121000000 Read-Pairs processed] [Time: 2023-02-16 19:13:32] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[122000000 Read-Pairs processed] [Time: 2023-02-16 19:16:52] 
   [GenomeSeqContainer Status: buf:(MT:0-17000) n=17, MaxSoFar=3118]
..........[123000000 Read-Pairs processed] [Time: 2023-02-16 19:20:14] 
   [GenomeSeqContainer Status: buf:(MT:2000-17000) n=15, MaxSoFar=3118]
..........[124000000 Read-Pairs processed] [Time: 2023-02-16 19:24:17] 
   [GenomeSeqContainer Status: buf:(MT:5000-17000) n=12, MaxSoFar=3118]
..........[125000000 Read-Pairs processed] [Time: 2023-02-16 19:27:49] 
   [GenomeSeqContainer Status: buf:(MT:5000-17000) n=12, MaxSoFar=3118]
..........[126000000 Read-Pairs processed] [Time: 2023-02-16 19:31:41] 
   [GenomeSeqContainer Status: buf:(MT:5000-17000) n=12, MaxSoFar=3118]
..........[127000000 Read-Pairs processed] [Time: 2023-02-16 19:35:36] 
   [GenomeSeqContainer Status: buf:(MT:6000-17000) n=11, MaxSoFar=3118]
.Switching to Chromosome: EGFP [2023-02-16 19:36:19] ... 
   Skipping chrom "MT" in genome fasta...
    found chrom EGFP [2023-02-16 19:36:20]
Switching to Chromosome: GAL4FF [2023-02-16 19:36:20] ... 
   Skipping chrom "EGFP" in genome fasta...
    found chrom GAL4FF [2023-02-16 19:36:20]
Finished reading SAM. Read: 127145524 reads/read-pairs.
Finished reading SAM. Used: 124692211 reads/read-pairs.
[Time: 2023-02-16 19:40:19] [Mem usage: [346GB / 447GB]] [Elapsed Time: 08:42:12.0840]
> Read Stats:
>   READ_PAIR_OK                   124692211
>   TOTAL_READ_PAIRS               127145524
>   DROPPED_NOT_PROPER_PAIR        0
>   DROPPED_READ_FAILS_VENDOR_QC   0
>   DROPPED_MARKED_NOT_VALID       0
>   DROPPED_CHROMS_MISMATCH        0
>   DROPPED_PAIR_STRANDS_MISMATCH  0
>   DROPPED_IGNORED_CHROMOSOME     0
>   DROPPED_NOT_UNIQUE_ALIGNMENT   2453313
>   DROPPED_NO_ALN_BLOCKS   0
>   DROPPED_NOT_MARKED_RG   -1
Pre-alignment read count unknown (Set --seqReadCt or --rawfastq)
Writing Output...
DEBUG NOTE: IncludeGenesSet.size: 27044
DEBUG NOTE: sortedReadCountSeq.size: 18899
DEBUG NOTE: coverageThresholds: 9449;14174;17009;18899
DEBUG NOTE: coverageSpans: [(0,9449);(9449,14174);(14174,17009);(17009,18899)]
DEBUG NOTE:	[1.bottomHalf][0.5] = [0,9449]
DEBUG NOTE:	[2.upperMidQuartile][0.75] = [9449,14174]
DEBUG NOTE:	[3.75to90][0.9] = [14174,17009]
DEBUG NOTE:	[4.high][1.0] = [17009,18899]
      (DEBUG) Generating Biotype Map [2023-02-16 19:40:29]
      (DEBUG) Extracted gene BioType using key "gene_biotype".
              Found 34 types: [TR_V_gene,unprocessed_pseudogene,protein_coding,IG_V_gene,TR_J_gene,Mt_tRNA,rRNA,TEC,miRNA,scaRNA,TR_D_gene,snRNA,TR_V_pseudogene,snoRNA,IG_J_pseudogene,processed_transcript,IG_V_pseudogene,IG_J_gene,processed_pseudogene,IG_C_pseudogene,sense_overlapping,transcribed_unprocessed_pseudogene,lincRNA,IG_C_gene,misc_RNA,ribozyme,polymorphic_pseudogene,User,antisense,Mt_rRNA,pseudogene,sRNA,IG_pseudogene,sense_intronic]
      (DEBUG) Finished Biotype Map [2023-02-16 19:42:20]
length of knownCountMap after run: 256778
WARNING: QoRTs is unable to infer the strandedness from the data!
         This isn't a problem per-se, since QoRTs requires that strandedness
         mode be set manually. However, it might be indicative that something
         is very wrong with your dataset and/or transcript annotation.
QoRTs completed WITH WARNINGS! See log for details.
Done.
Time spent on setup:           00:01:43.0189
Time spent on SAM iteration:   08:40:29.0662
                               (4.093603274098217 minutes per million read-pairs)
                               (4.174144713283923 minutes per million read-pairs used)
Time spent on file output:     00:02:23.0395
Total runtime:                 08:44:36.0246
Done. (Thu Feb 16 19:42:42 CET 2023)
End of Script. Script took 31541 seconds.
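
The per-million timing and the read accounting reported at the end of this log can be cross-checked with a few lines of arithmetic. This is only a sketch: the numbers are copied from the log above, the fractional part of the QoRTs timer is ignored, and the helper name `hms_to_minutes` is made up for illustration.

```python
def hms_to_minutes(hms: str) -> float:
    """Convert an 'HH:MM:SS' string (whole seconds) to minutes."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 60 + minutes + seconds / 60


# Values copied from the log above.
total_pairs = 127_145_524        # TOTAL_READ_PAIRS
used_pairs = 124_692_211         # READ_PAIR_OK
dropped_not_unique = 2_453_313   # DROPPED_NOT_UNIQUE_ALIGNMENT
sam_iteration = "08:40:29"       # "Time spent on SAM iteration", fraction dropped

minutes = hms_to_minutes(sam_iteration)
per_million_total = minutes / (total_pairs / 1e6)
per_million_used = minutes / (used_pairs / 1e6)
drop_fraction = dropped_not_unique / total_pairs

# The only non-zero DROPPED_* counter accounts for the whole difference.
accounting_ok = (total_pairs - used_pairs) == dropped_not_unique

print(f"{per_million_total:.4f} min per million read-pairs")
print(f"{per_million_used:.4f} min per million read-pairs used")
print(f"{drop_fraction:.2%} of pairs dropped as not uniquely aligned")
print("accounting consistent:", accounting_ok)
```

The two rates agree with the 4.09 and 4.17 minutes per million reported by QoRTs, and the drop in read-pairs is fully explained by the non-unique alignments.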

[Image: multiplot-sample-sub]

QoRTs multiplot on this sample. Many of the plots are empty.

@hartleys
Owner

Hmm. Can you give me an ls of the output dir?

And then check inside one of the files, say "QC.insert.size.txt.gz"?

How did you do the downsampling? It may have had problems if the majority of the reads did not have matched pairs.

Also: can you post the log?

@royfrancis
Author

royfrancis commented Feb 17, 2023

Output file list
QC.biotypeCounts.txt
QC.chromCount.txt
QC.cigarLoci.deletionCounts.all.txt
QC.cigarLoci.deletionCounts.highCoverage.txt
QC.cigarLoci.insertionCounts.all.txt
QC.cigarLoci.insertionCounts.highCoverage.txt
QC.cigarOpDistribution.byReadCycle.R1.txt
QC.cigarOpDistribution.byReadCycle.R2.txt
QC.cigarOpLengths.byOp.R1.txt
QC.cigarOpLengths.byOp.R2.txt
QC.exonCounts.formatted.for.DEXSeq.txt
QC.FTnRrt5rbVMr.log
QC.gc.byPair.txt
QC.gc.byRead.txt
QC.gc.byRead.vsBaseCt.txt
QC.gc.R1.txt
QC.gc.R2.txt
QC.geneBodyCoverage.byExpr.avgPct.txt
QC.geneBodyCoverage.by.expression.level.txt
QC.geneBodyCoverage.genewise.txt
QC.geneCounts.formatted.for.DESeq.txt
QC.geneCounts.txt
QC.insert.size.byReadLen.txt
QC.insert.size.debug.dropped.txt
QC.insert.size.debug.txt
QC.insert.size.txt
QC.mismatchSizeRates.txt
QC.mismatchSummary.txt
QC.NVC.lead.clip.R1.txt
QC.NVC.lead.clip.R2.txt
QC.NVC.minus.clipping.R1.txt
QC.NVC.minus.clipping.R2.txt
QC.NVC.raw.R1.txt
QC.NVC.raw.R2.txt
QC.NVC.tail.clip.R1.txt
QC.NVC.tail.clip.R2.txt
QC.orderedChromList.txt
QC.overlapCoverage.txt
QC.overlapMismatch.byBase.txt
QC.overlapMismatch.byRead.txt
QC.overlapMismatch.byScoreAndBP.txt
QC.overlapMismatch.byScore.txt
QC.overlapMismatch.txt
QC.QORTS_COMPLETED_OK
QC.QORTS_COMPLETED_WARN
QC.QORTS_RUNNING
QC.quals.r1.txt
QC.quals.r2.txt
QC.readLenDist.txt
QC.referenceMismatch.byScoreAndBP.txt
QC.referenceMismatch.byScore.txt
QC.referenceMismatchCounts.txt
QC.referenceMismatchRaw.byReadStrand.txt
QC.spliceJunctionAndExonCounts.forJunctionSeq.txt
QC.spliceJunctionCounts.knownSplices.txt
QC.spliceJunctionCounts.novelSplices.txt
QC.summary.txt
QC.yX9gr2Yu8Jsk.log

WARN file says this:

# Note: if this file EXISTS, then QoRTs QC completed WITH WARNINGS. Warning messages follow:
Warning: run-in-progress file "sample-sub-qorts/QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
WARNING: QoRTs is unable to infer the strandedness from the data!
         This isn't a problem per-se, since QoRTs requires that strandedness
         mode be set manually. However, it might be indicative that something
         is very wrong with your dataset and/or transcript annotation.
QoRTs completed WITH WARNINGS! See log for details.
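
The first warning above is triggered by a `QC.QORTS_RUNNING` file that was already in the output directory when the run started. A small check before launching a job can list which status files are present. This is a sketch, not part of QoRTs: the marker names are taken from the file list above, and the function name `marker_status` is hypothetical.

```python
import tempfile
from pathlib import Path

# Status file names as they appear in the output listing above.
MARKERS = ("QC.QORTS_RUNNING", "QC.QORTS_COMPLETED_OK", "QC.QORTS_COMPLETED_WARN")


def marker_status(outdir):
    """Return {marker name: True/False} for the status files in outdir."""
    outdir = Path(outdir)
    return {name: (outdir / name).is_file() for name in MARKERS}


# Demo on a throwaway directory that only contains a RUNNING marker.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "QC.QORTS_RUNNING").touch()
    status = marker_status(tmp)

for name, present in status.items():
    print(f"{name}: {'present' if present else 'absent'}")
```

If `QC.QORTS_RUNNING` is reported as present before a new run and no other QoRTs job is writing to that directory, the file is presumably left over from an earlier attempt.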

Contents of QC.insert.size.txt

$ head sample-sub-qorts/QC.insert.size.txt 
InsertSize	Ct
0	0
1	0
2	0
3	0
4	0
5	0
6	0
7	0
8	0

$ tail sample-sub-qorts/QC.insert.size.txt 
980064	1
1035994	1
1155833	1
1155846	3
1321162	1
1321165	2
1321172	2
1321176	1
1321183	1
1822321	1

Subsampling was done as follows:

module load samtools/1.3
samtools view -b -s 0.6 sample.bam > sample-sub.bam
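
On the question of matched pairs: according to the samtools manual, `-s 0.6` means seed 0 and fraction 0.6, and the subsampling decision is made per template, so both mates of a pair are kept or dropped together. The idea can be sketched as a decision that depends only on the read name. This is an illustration under assumptions: MD5 stands in for whatever hash samtools actually uses, and the read names are invented.

```python
import hashlib


def keep_template(read_name: str, fraction: float, seed: int = 0) -> bool:
    """Decide from the read name alone whether a template is kept.

    Illustrative only: MD5 is a stand-in, not the hash used by samtools.
    """
    digest = hashlib.md5(f"{seed}:{read_name}".encode()).digest()
    value = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return value < fraction


# Hypothetical paired-end records: (read name, mate number).
records = [(f"read{i}", mate) for i in range(10_000) for mate in (1, 2)]
kept = [(name, mate) for name, mate in records if keep_template(name, 0.6)]

kept_r1 = {name for name, mate in kept if mate == 1}
kept_r2 = {name for name, mate in kept if mate == 2}
kept_fraction = len(kept) / len(records)

print(f"kept fraction: {kept_fraction:.3f}")
print("mates kept together:", kept_r1 == kept_r2)
```

Because both mates share a read name, they always receive the same decision, so this kind of subsampling should not create orphaned mates by itself.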

I have two other samples (BAM files) that ran fine without downsampling or memory issues; they had about 20% fewer reads. However, they also produced the strandedness warning and many blank plots in the multiplot, so I am not sure that downsampling is the cause. It could be one of the other issues that you mentioned.

Here is the plotting script and log.

Plot log
library(QoRTs)
res <- read.qc.results.data(infile.dir="data/raw/zumis/qorts/", decoder.files = "data/raw/zumis/qorts/decoder.txt",autodetectMissingSamples=TRUE)

column 'qc.data.prefix' not found in the decoder, assuming qc.data.prefix = ""
Note: no input.read.pair.count column found. This column is optional, but without it mapping rates cannot be calculated.
Note: no multi.mapped.read.pair.count column found. This column is optional, but without it (depending on how your aligner implements multi-mapping) multi-mapping rates might not be plotted.
infile.dir = data/raw/zumis/qorts/
scalaqc_file = QC.summary.txt.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Autodetected Paired-End mode.
(File 1 of 43): QC.gc.byPair.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 2 of 43): QC.gc.byRead.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 3 of 43): QC.gc.byRead.vsBaseCt.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 4 of 43): QC.quals.r1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 5 of 43): QC.quals.r2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 6 of 43): QC.cigarOpDistribution.byReadCycle.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 7 of 43): QC.cigarOpDistribution.byReadCycle.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 8 of 43): QC.cigarOpLengths.byOp.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.02 secs]
(File 9 of 43): QC.cigarOpLengths.byOp.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.02 secs]
(File 10 of 43): QC.geneBodyCoverage.by.expression.level.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 11 of 43): QC.geneCounts.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.04 secs]
(File 12 of 43): QC.insert.size.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.05 secs]
(File 13 of 43): QC.NVC.raw.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 14 of 43): QC.NVC.raw.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 15 of 43): QC.NVC.lead.clip.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.02 secs]
(File 16 of 43): QC.NVC.lead.clip.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.04 secs]
(File 17 of 43): QC.NVC.tail.clip.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.03 secs]
(File 18 of 43): QC.NVC.tail.clip.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.03 secs]
(File 19 of 43): QC.NVC.minus.clipping.R1.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 20 of 43): QC.NVC.minus.clipping.R2.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 21 of 43): QC.chromCount.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 22 of 43): QC.biotypeCounts.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 23 of 43): QC.geneBodyCoverage.byExpr.avgPct.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 24 of 43): QC.overlapCoverage.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 25 of 43): QC.overlapMismatch.byRead.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 26 of 43): QC.overlapMismatch.byScore.txt.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 27 of 43): QC.overlapMismatch.byBase.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 28 of 43): QC.overlapMismatch.byScoreAndBP.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.03 secs]
(File 29 of 43): QC.readLenDist.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 30 of 43): QC.referenceMismatchCounts.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 31 of 43): QC.referenceMismatchRaw.byReadStrand.txt.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
(File 32 of 43): QC.referenceMismatch.byScore.txt.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 33 of 43): QC.referenceMismatch.byScoreAndBP.txt.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 34 of 43): QC.mismatchSizeRates.txt.gz.done.
   [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
(File 35 of 43): QC.FQ.gc.byRead.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.byRead.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 36 of 43): QC.FQ.gc.byPair.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.byPair.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 37 of 43): QC.FQ.gc.R1.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.R1.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 38 of 43): QC.FQ.gc.R2.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.R2.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 39 of 43): QC.FQ.NVC.R1.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.NVC.R1.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 40 of 43): QC.FQ.NVC.R2.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.NVC.R2.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 41 of 43): QC.FQ.quals.r1.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.quals.r1.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 42 of 43): QC.FQ.quals.r2.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.quals.r2.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
(File 43 of 43): QC.FQ.readLenDist.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.readLenDist.txt.gz. Skipping tests that use this data.
   [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
calculating secondary data:
Calculating Quality Score Rates...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating cumulative gene coverage, by replicate...done. [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
Calculating cumulative gene coverage, by sample...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating Mapping Rates...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
calculating normalization factors, by sample...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
calculating normalization factors, by replicate...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
calculating normalization factors, by sample/replicate...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating summary stats...done. [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
Calculating overlap mismatch-size rates...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating cumulative overlap mismatch-size rates...done. [time: 2023-02-17 11:27:09],[elapsed: 0.03 secs]
Calculating overlap coverage Rates...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap coverage Rates By Read...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating read length distribution...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap by AVG score...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap by MIN score...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Adding Min score error to summary tables...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap by R1 score...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap by R2 score...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating referenceMismatchCounts stats...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating referenceMismatch.byScore stats...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating referenceMismatchRaw.byReadStrand stats...done. [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
Calculating referenceMismatch.byScoreAndBP stats...done. [time: 2023-02-17 11:27:09],[elapsed: 0.01 secs]
Calculating summary table...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlap mismatch combos...Calculating mismatch combo rates:...done. [time: 2023-02-17 11:27:09],[elapsed: 0 secs]
Calculating overlapMismatch.byScoreAndBP stats...done. [time: 2023-02-17 11:27:10],[elapsed: 0.55 secs]
done. [time: 2023-02-17 11:27:10],[elapsed: 0.56 secs]
Calculating NVC rates...done. [time: 2023-02-17 11:27:10],[elapsed: 0.05 secs]
done.
[time: 2023-02-17 11:27:10],[elapsed: 0.69 secs]
Skipping: "onTarget.rates","onTarget.counts","overlap.mismatch.byAvgQual"
Rasterize large plots: FALSE
Rasterize medium plots: FALSE
Skipping due to missing data: "mapping.rates","norm.factors","norm.vs.TC"
Plotting to the currently-open device...
Plotting extended...
Starting compiled plot...
null device 
          1 
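
To see at a glance which inputs the loader could not find, the "Cannot find file" lines can be pulled out of the plot log. A minimal sketch, assuming the log has been saved as text. The regular expression is written against the line format shown above, the sample lines are copied from it, and `missing_files` is a hypothetical helper name.

```python
import re

# Matches e.g. "(File 35 of 43): NAMEFailed: Cannot find file: PATH. Skipping ..."
MISSING = re.compile(
    r"\(File (\d+) of (\d+)\): (\S+?)Failed: Cannot find file: (\S+)\."
)


def missing_files(log_text):
    """Return (file name, full path) for every file the loader reported missing."""
    found = []
    for line in log_text.splitlines():
        match = MISSING.search(line)
        if match:
            found.append((match.group(3), match.group(4)))
    return found


# Three lines copied from the plot log above.
sample_log = """\
(File 34 of 43): QC.mismatchSizeRates.txt.gz.done.
(File 35 of 43): QC.FQ.gc.byRead.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.byRead.txt.gz. Skipping tests that use this data.
(File 36 of 43): QC.FQ.gc.byPair.txt.gzFailed: Cannot find file: data/raw/zumis/qorts/30dpf-sub-qorts/QC.FQ.gc.byPair.txt.gz. Skipping tests that use this data.
"""

missing = missing_files(sample_log)
for name, path in missing:
    print(name, "->", path)
```

In the log above, every file reported as missing has the `QC.FQ.` prefix.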

@hartleys
Owner

hartleys commented Feb 20, 2023 via email

@royfrancis
Author

royfrancis commented Feb 25, 2023

500 lines of insert size

Insert Size
$ head -500 QC.insert.size.txt 
InsertSize	Ct
0	0
1	0
2	0
3	0
4	0
5	0
6	0
7	0
8	0
9	0
10	0
11	0
12	0
13	0
14	0
15	0
16	1
17	9
18	9
19	3
20	2
21	2
22	1
23	16
24	14
25	27
26	50
27	53
28	66
29	172
30	81
31	103
32	293
33	103
34	160
35	116
36	106
37	160
38	147
39	197
40	217
41	105
42	117
43	159
44	152
45	204
46	172
47	258
48	265
49	228
50	326
51	189
52	239
53	173
54	289
55	271
56	570
57	1240
58	59640
59	87598
60	157234
61	66937
62	71341
63	53061
64	27000
65	18377
66	19845
67	19960
68	18363
69	18983
70	23432
71	21124
72	20022
73	21429
74	18033
75	17383
76	18315
77	18698
78	20664
79	24500
80	68901
81	60539
82	64878
83	69174
84	66908
85	59720
86	56638
87	67972
88	63187
89	64589
90	64062
91	63578
92	64307
93	69494
94	63117
95	65724
96	62436
97	67277
98	63380
99	66513
100	66466
101	71515
102	69962
103	71410
104	73335
105	78041
106	76025
107	69028
108	69512
109	70052
110	70529
111	79839
112	72271
113	124319
114	122282
115	122449
116	124412
117	142020
118	166582
119	239607
120	372431
121	505082
122	240310
123	226321
124	233692
125	265236
126	475243
127	240638
128	263637
129	241704
130	279017
131	339567
132	256077
133	653230
134	268081
135	252661
136	384324
137	307867
138	352451
139	297923
140	398459
141	264757
142	268405
143	315080
144	288190
145	337896
146	276336
147	277009
148	462523
149	271486
150	295669
151	321542
152	283935
153	326496
154	288825
155	336537
156	281186
157	307664
158	312668
159	426211
160	305038
161	388685
162	323370
163	297768
164	316399
165	358222
166	382747
167	412732
168	338350
169	424533
170	374563
171	362098
172	344140
173	348566
174	552290
175	394478
176	321790
177	348267
178	374816
179	327981
180	367405
181	381411
182	441740
183	468710
184	368974
185	358864
186	379367
187	390232
188	452099
189	387880
190	436000
191	369812
192	378372
193	412208
194	450890
195	386683
196	373977
197	533712
198	405176
199	404649
200	406269
201	378857
202	425324
203	395905
204	597632
205	1301920
206	486930
207	447109
208	402321
209	387888
210	401512
211	781089
212	693961
213	466506
214	413397
215	437600
216	421373
217	520441
218	445268
219	532208
220	456514
221	433206
222	416687
223	488708
224	429191
225	434593
226	460544
227	430857
228	450473
229	436757
230	431798
231	486690
232	776357
233	422793
234	409099
235	416835
236	426599
237	715289
238	488193
239	442650
240	585901
241	435689
242	444152
243	436647
244	428768
245	1148477
246	399705
247	434309
248	455262
249	479593
250	436653
251	477488
252	528618
253	704487
254	479866
255	427654
256	412713
257	428468
258	430080
259	562194
260	464509
261	703258
262	418314
263	433300
264	373629
265	374279
266	404222
267	658972
268	372415
269	374427
270	370258
271	372728
272	389580
273	354233
274	404060
275	366348
276	425936
277	369730
278	353342
279	494877
280	344769
281	483421
282	354558
283	334601
284	442987
285	353446
286	336950
287	343088
288	368300
289	535209
290	371898
291	351777
292	345007
293	308206
294	393970
295	708242
296	346534
297	328592
298	296338
299	346341
300	295271
301	266166
302	312359
303	292345
304	357556
305	391358
306	323948
307	751195
308	293197
309	651166
310	299225
311	279659
312	339363
313	283095
314	620971
315	334242
316	257542
317	262985
318	260833
319	426052
320	308990
321	291909
322	246837
323	272518
324	233598
325	402850
326	260351
327	236414
328	237087
329	239225
330	262224
331	236702
332	265793
333	215020
334	271660
335	226407
336	414604
337	235193
338	501403
339	269455
340	220418
341	217617
342	215120
343	258349
344	231492
345	415804
346	200754
347	217006
348	258059
349	194916
350	259488
351	172721
352	193666
353	195597
354	185406
355	254261
356	168437
357	172563
358	165991
359	176199
360	442129
361	164411
362	159551
363	154943
364	167164
365	235471
366	167066
367	242742
368	266777
369	239879
370	237212
371	145163
372	143892
373	141814
374	227232
375	169118
376	167336
377	292432
378	178525
379	148721
380	139105
381	164141
382	151485
383	131033
384	119986
385	144885
386	121424
387	140399
388	277183
389	143188
390	163776
391	127696
392	130411
393	111648
394	166541
395	113544
396	131896
397	106142
398	107851
399	106086
400	119453
401	127097
402	122592
403	125669
404	104015
405	97465
406	96807
407	96373
408	98182
409	101707
410	103714
411	97016
412	94409
413	211853
414	109362
415	97031
416	97236
417	84272
418	83237
419	97841
420	97590
421	119733
422	89049
423	81981
424	84109
425	92483
426	156811
427	89280
428	84574
429	83699
430	111149
431	95117
432	77905
433	88704
434	83277
435	92809
436	112704
437	69045
438	66494
439	68579
440	72822
441	78703
442	68204
443	62768
444	66443
445	69627
446	65667
447	64578
448	64686
449	63022
450	71708
451	62043
452	60983
453	57432
454	56493
455	68109
456	55479
457	58225
458	54039
459	57088
460	66978
461	60150
462	49749
463	52728
464	50713
465	52951
466	56714
467	52996
468	48991
469	53235
470	51141
471	50091
472	50479
473	46312
474	51362
475	44767
476	44368
477	51617
478	50031
479	46308
480	48184
481	47648
482	44881
483	50809
484	52606
485	43196
486	44873
487	40805
488	44500
489	49127
490	40096
491	65080
492	40458
493	44779
494	44516
495	40870
496	49055
497	36519
498	36280
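
A histogram like this is easier to judge from a few summary numbers than from raw rows. The sketch below computes the total count, the modal insert size and the count-weighted median from a two-column table with the `InsertSize` and `Ct` header shown above. The function name is hypothetical, and the five sample rows (58 to 62) are copied from the listing, so the result describes only that excerpt and not the whole file.

```python
import csv
import io


def summarize_histogram(tsv_text):
    """Summarize a tab-separated InsertSize/Ct table as total, mode and median."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hist = sorted((int(r["InsertSize"]), int(r["Ct"])) for r in rows)
    total = sum(count for _, count in hist)
    if total == 0:
        return {"total": 0, "mode": None, "median": None}
    mode = max(hist, key=lambda item: item[1])[0]
    running = 0
    median = None
    for size, count in hist:
        running += count
        if running * 2 >= total:  # first size at which half the pairs are covered
            median = size
            break
    return {"total": total, "mode": mode, "median": median}


# Rows 58 to 62 copied from the table above.
excerpt = (
    "InsertSize\tCt\n"
    "58\t59640\n"
    "59\t87598\n"
    "60\t157234\n"
    "61\t66937\n"
    "62\t71341\n"
)

summary = summarize_histogram(excerpt)
print(summary)
```

To summarize the full distribution, pass the contents of `QC.insert.size.txt` instead of the excerpt.
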
QC.quals.r1
$ cat QC.quals.r1.txt
readLen	min	lowerQuartile	median	upperQuartile	max
0	3	38	38	38	38
1	3	38	38	38	38
2	3	38	38	38	38
3	3	38	38	38	38
4	12	38	38	38	38
5	3	38	38	38	38
6	12	38	38	38	38
7	3	38	38	38	38
8	3	38	38	38	38
9	12	38	38	38	38
10	12	38	38	38	38
11	3	38	38	38	38
12	3	38	38	38	38
13	12	38	38	38	38
14	3	38	38	38	38
15	12	38	38	38	38
16	3	38	38	38	38
17	12	38	38	38	38
18	3	38	38	38	38
19	3	38	38	38	38
20	3	38	38	38	38
21	3	38	38	38	38
22	3	38	38	38	38
23	3	38	38	38	38
24	3	38	38	38	38
25	3	38	38	38	38
26	12	38	38	38	38
27	12	38	38	38	38
28	12	38	38	38	38
29	3	38	38	38	38
30	3	38	38	38	38
31	3	38	38	38	38
32	12	38	38	38	38
33	3	38	38	38	38
34	3	38	38	38	38
35	3	38	38	38	38
36	3	38	38	38	38
37	3	38	38	38	38
38	3	38	38	38	38
39	3	38	38	38	38
40	3	38	38	38	38
41	3	38	38	38	38
42	3	38	38	38	38
43	12	38	38	38	38
44	3	38	38	38	38
45	3	38	38	38	38
46	3	38	38	38	38
47	3	38	38	38	38
48	3	38	38	38	38
49	12	38	38	38	38
50	3	38	38	38	38
51	3	38	38	38	38
52	3	38	38	38	38
53	3	38	38	38	38
54	12	38	38	38	38
55	12	38	38	38	38
56	3	38	38	38	38
57	3	38	38	38	38
58	3	38	38	38	38
59	3	38	38	38	38
60	3	38	38	38	38
61	3	38	38	38	38
62	12	38	38	38	38
63	12	38	38	38	38
64	3	38	38	38	38
65	12	38	38	38	38
66	12	38	38	38	38
67	3	38	38	38	38
68	3	38	38	38	38
69	12	38	38	38	38
70	3	38	38	38	38
71	12	38	38	38	38
72	3	38	38	38	38
73	3	38	38	38	38
74	3	38	38	38	38
75	3	38	38	38	38
76	12	38	38	38	38
77	12	38	38	38	38
78	3	38	38	38	38
79	3	38	38	38	38
80	3	38	38	38	38
81	3	38	38	38	38
82	3	38	38	38	38
83	3	38	38	38	38
84	12	38	38	38	38
85	-1	0	0	0	0
86	-1	0	0	0	0
87	-1	0	0	0	0
88	-1	0	0	0	0
89	-1	0	0	0	0
90	-1	0	0	0	0
91	-1	0	0	0	0
92	-1	0	0	0	0
93	-1	0	0	0	0
94	-1	0	0	0	0
95	-1	0	0	0	0
96	-1	0	0	0	0
97	-1	0	0	0	0
98	-1	0	0	0	0
99	-1	0	0	0	0
100	-1	0	0	0	0
101	-1	0	0	0	0
102	-1	0	0	0	0
103	-1	0	0	0	0
104	-1	0	0	0	0
105	-1	0	0	0	0
106	-1	0	0	0	0
107	-1	0	0	0	0
108	-1	0	0	0	0
109	-1	0	0	0	0
110	-1	0	0	0	0
111	-1	0	0	0	0
112	-1	0	0	0	0
113	-1	0	0	0	0
114	-1	0	0	0	0
115	-1	0	0	0	0
116	-1	0	0	0	0
117	-1	0	0	0	0
118	-1	0	0	0	0
119	-1	0	0	0	0
120	-1	0	0	0	0
121	-1	0	0	0	0
122	-1	0	0	0	0
123	-1	0	0	0	0
124	-1	0	0	0	0
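
In this table, rows 0 to 84 carry quality values, while rows 85 to 124 hold only the placeholder values -1 and 0. One reading, which is an assumption here rather than documented QoRTs behaviour, is that the first column is a 0-based read position and that `min = -1` marks positions with no data, which would correspond to 85-base reads padded out to the `--maxReadLength 125` setting. The sketch below finds the last populated position under that assumption. The function name is hypothetical and the four sample rows are copied from the table.

```python
def observed_read_length(tsv_text):
    """Return the number of populated positions in a QC.quals-style table.

    Assumes the first column is a 0-based position and that min == -1
    marks a position without data.
    """
    lines = tsv_text.strip().splitlines()
    header = lines[0].split("\t")
    min_col = header.index("min")
    populated = []
    for line in lines[1:]:
        fields = line.split("\t")
        if int(fields[min_col]) != -1:
            populated.append(int(fields[0]))
    return max(populated) + 1 if populated else 0


# Rows 83 to 86 copied from the table above.
excerpt = (
    "readLen\tmin\tlowerQuartile\tmedian\tupperQuartile\tmax\n"
    "83\t3\t38\t38\t38\t38\n"
    "84\t12\t38\t38\t38\t38\n"
    "85\t-1\t0\t0\t0\t0\n"
    "86\t-1\t0\t0\t0\t0\n"
)

read_length = observed_read_length(excerpt)
print("populated positions:", read_length)
```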

QoRTs/1.3.6
R/4.0.0
