QoRTs error after aligning with STAR #66

Open
ghost opened this issue Jul 2, 2018 · 7 comments

ghost commented Jul 2, 2018

Hello Hartley, I am getting this QoRTs error after cleaning the data with Trimmomatic and aligning the paired-end reads with STAR. Please let me know where I am going wrong. I really appreciate your help.
My command:

java -Xmx16G -jar /work/LAS/vollbrec-lab/sharmistha/QoRTs.jar QC --generatePlots --stranded --verbose --maxReadLength 90 --genomeFA /work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.fna.gz --rawfastq /work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R1_001.fastq.gz,/work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R2_001.fastq.gz --generateSeparatePlots --outfilePrefix ${output} --chromSizes /work/LAS/vollbrec-lab/sharmistha/star_files/GenomeDirectory/chrLength.txt --addFunctions mismatchEngine,annotatedSpliceExonCounts,FPKM,writeGeneBodyIv,fastqUtils,referenceMatch,writeDocs,makeJunctionBed,makeAllBrowserTracks,calcDetailedGeneCounts /work/LAS/vollbrec-lab/sharmistha/star_files/star_output_files/NB-rep-1_ATCACG_L001/${input1} /work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.gtf.gz /work/LAS/vollbrec-lab/sharmistha/QoRTs_files/${output}/

Starting QoRTs v1.3.0 (Compiled Fri Oct 20 11:56:37 EDT 2017)
Starting time: (Mon Jul 02 14:04:41 CDT 2018)
INPUT_COMMAND(QC)
INPUT_ARG(infile)=/work/LAS/vollbrec-lab/sharmistha/star_files/star_output_files/NB-rep-1_ATCACG_L001/NB-rep-1_ATCACG_L001Aligned.sortedByCoord.out.bam
INPUT_ARG(gtffile)=/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.gtf.gz
INPUT_ARG(outdir)=/work/LAS/vollbrec-lab/sharmistha/QoRTs_files/NB-rep-1_ATCACG_L001Aligned/
INPUT_ARG(generatePlots)=true
INPUT_ARG(stranded)=true
INPUT_ARG(verbose)=true
INPUT_ARG(maxReadLength)=Some(90)
INPUT_ARG(genomeFA)=Some(List(/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.fna.gz))
INPUT_ARG(rawfastq)=Some(List(/work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R1_001.fastq.gz, /work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R2_001.fastq.gz))
INPUT_ARG(generateSeparatePlots)=true
INPUT_ARG(outfilePrefix)=NB-rep-1_ATCACG_L001Aligned
INPUT_ARG(chromSizes)=Some(/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeDirectory/chrLength.txt)
INPUT_ARG(addFunctions)=List(mismatchEngine, annotatedSpliceExonCounts, FPKM, writeGeneBodyIv, fastqUtils, referenceMatch, writeDocs, makeJunctionBed, makeAllBrowserTracks, calcDetailedGeneCounts)
Created Log File: /work/LAS/vollbrec-lab/sharmistha/QoRTs_files/NB-rep-1_ATCACG_L001Aligned//NB-rep-1_ATCACG_L001AlignedQC.6JZ9kTs4cAFB.log
Starting QC
[Time: 2018-07-02 14:04:41] [Mem usage: [73MB / 2024MB]] [Elapsed Time: 00:00:00.0000]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
Chromosome size file added. Adding target wiggle plot generation.
Raw fastq files specified. Adding fastq testing.
Parameter --genomeFA found. Adding reference mismatch testing.
Running functions: CigarOpDistribution, FPKM, GCDistribution, GeneCalcs, InsertSize,
JunctionCalcs, NVC, QualityScoreDistribution, StrandCheck,
annotatedSpliceExonCounts, calcDetailedGeneCounts,
chromCounts, cigarLocusCounts, fastqUtils,
makeAllBrowserTracks, makeJunctionBed, makeWiggles,
mismatchEngine, overlapMatch, readLengthDistro,
referenceMatch, writeBiotypeCounts, writeClippedNVC,
writeDESeq, writeDEXSeq, writeDocs, writeGeneBody,
writeGeneBodyIv, writeGeneCounts, writeGenewiseGeneBody,
Checking first 10000 reads. Checking SAM file for formatting errors...
Stats on the first 10000 reads:
Num Reads Primary Map: 9103
Num Reads Paired-ended: 10000
Num Reads mapped pair: 9098
Num Pair names found: 4797
Num Pairs matched: 4301
Read Seq length: 36 to 90
Unclipped Read length: 36 to 90
Final maxReadLength: 90
maxPhredScore: 41
minPhredScore: 2
NOTE: Read length is not consistent.
In the first 10000 reads, read length varies from 36 to 90 (param maxReadLength=90)
Note that using data that is hard-clipped prior to alignment is NOT recommended, because this makes it difficult (or impossible) to determine the sequencer read-cycle of each nucleotide base. This may obfuscate cycle-specific artifacts, trends, or errors, the detection of which is one of the primary purposes of QoRTs!In addition, hard clipping (whether before or after alignment) removes quality score data, and thus quality score metrics may be misleadingly optimistic. A MUCH preferable method of removing undesired sequence is to replace such sequence with N's, which preserves the quality score and the sequencer cycle information.
Note: Data appears to be paired-ended.
Sorting Note: Reads are not sorted by name (This is OK).
Sorting Note: Reads are sorted by position (This is OK).
Done checking first 10000 reads. No major problems detected!
Starting getSRPairIterResorted...
SAMRecord Reader Generated. Read length: 90.
[Time: 2018-07-02 14:04:44] [Mem usage: [285MB / 2552MB]] [Elapsed Time: 00:00:03.0401]
Starting fastq readthrough.

Init FastqGC Utility
Init FastqQualityScore Utility
Init FastqNVC Utility
============================FATAL_ERROR============================
QoRTs encountered a FATAL ERROR. For general help, use command:
java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 100
at qcUtils.fqcGC.runOnReadPair(fqcGC.scala:43)
at qcUtils.runAllQC$.$anonfun$runOnSeqFile$5(runAllQC.scala:1187)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
at scala.collection.IterableLike.foreach(IterableLike.scala:71)
at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at qcUtils.runAllQC$.$anonfun$runOnSeqFile$4(runAllQC.scala:1187)
at qcUtils.runAllQC$.$anonfun$runOnSeqFile$4$adapted(runAllQC.scala:1186)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1186)
at qcUtils.runAllQC$.run(runAllQC.scala:960)
at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:672)
at runner.runner$.main(runner.scala:97)
at runner.runner.main(runner.scala)
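
Note on the trace above: the exception is raised in the FASTQ GC utility while reading the files given to --rawfastq, with --maxReadLength set to 90. One plausible reading (an assumption, not confirmed by the log) is that the raw FASTQ files still contain untrimmed reads longer than 90 bases, even though the trimmed reads in the BAM top out at 90. A minimal Python sketch for checking the read lengths in a FASTQ file before choosing --maxReadLength follows. The function name `fastq_length_range` is hypothetical and not part of QoRTs, and the sketch assumes plain four-line FASTQ records.

```python
import gzip
import itertools


def fastq_length_range(path, limit=None):
    """Return (shortest, longest) read length in a FASTQ file.

    Assumes four-line records (header, sequence, '+', qualities).
    `limit` optionally caps the number of reads inspected.
    Returns (None, None) if the file contains no reads.
    """
    # Pick the gzip reader for .gz files, the plain reader otherwise.
    opener = gzip.open if str(path).endswith(".gz") else open
    shortest, longest = None, None
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        seq_lines = itertools.islice(handle, 1, None, 4)
        for line in itertools.islice(seq_lines, limit):
            n = len(line.rstrip("\r\n"))
            shortest = n if shortest is None else min(shortest, n)
            longest = n if longest is None else max(longest, n)
    return shortest, longest
```

Calling `fastq_length_range(...)` on each of the two files passed to --rawfastq would show whether any read exceeds the value given to --maxReadLength.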

hartleys commented Jul 5, 2018 via email

ghost commented Jul 5, 2018

Dear Hartleys,
Thank you for your suggestion. It started working when I changed --maxReadLength to 100. However, it is giving a different error now.

Thank you for your help.
Sharmistha

Starting QoRTs v1.3.0 (Compiled Fri Oct 20 11:56:37 EDT 2017)
Starting time: (Thu Jul 05 15:42:34 CDT 2018)
INPUT_COMMAND(QC)
INPUT_ARG(infile)=/work/LAS/vollbrec-lab/sharmistha/star_files/star_output_files/coordinate-sorted_BAMfiles/NB-rep-1_ATCACG_L001Aligned.sortedByCoord.out.bam
INPUT_ARG(gtffile)=/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.gtf.gz
INPUT_ARG(outdir)=/work/LAS/vollbrec-lab/sharmistha/QoRTs_files/NB-rep-1_ATCACG_L001Aligned/
INPUT_ARG(generatePlots)=true
INPUT_ARG(stranded)=true
INPUT_ARG(verbose)=true
INPUT_ARG(maxReadLength)=Some(100)
INPUT_ARG(genomeFA)=Some(List(/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeFiles/GCA_000005005.6_B73_RefGen_v4_genomic.fna.gz))
INPUT_ARG(rawfastq)=Some(List(/work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R1_001.fastq.gz, /work/LAS/vollbrec-lab/sharmistha/NB-rep-1_ATCACG_L001_R2_001.fastq.gz))
INPUT_ARG(generateSeparatePlots)=true
INPUT_ARG(outfilePrefix)=NB-rep-1_ATCACG_L001Aligned
INPUT_ARG(chromSizes)=Some(/work/LAS/vollbrec-lab/sharmistha/star_files/GenomeDirectory/chrLength.txt)
INPUT_ARG(skipFunctions)=List(cigarLocusCounts)
INPUT_ARG(addFunctions)=List(mismatchEngine, annotatedSpliceExonCounts, FPKM, writeGeneBodyIv, fastqUtils, referenceMatch, writeDocs, makeJunctionBed, calcDetailedGeneCounts)
Created Log File: /work/LAS/vollbrec-lab/sharmistha/QoRTs_files/NB-rep-1_ATCACG_L001Aligned//NB-rep-1_ATCACG_L001AlignedQC.bdRqM78mge2M.log
Warning: run-in-progress file "/work/LAS/vollbrec-lab/sharmistha/QoRTs_files/NB-rep-1_ATCACG_L001Aligned//NB-rep-1_ATCACG_L001AlignedQC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2018-07-05 15:42:34] [Mem usage: [73MB / 2024MB]] [Elapsed Time: 00:00:00.0000]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
Chromosome size file added. Adding target wiggle plot generation.
Raw fastq files specified. Adding fastq testing.
Parameter --genomeFA found. Adding reference mismatch testing.
Running functions: CigarOpDistribution, FPKM, GCDistribution, GeneCalcs, InsertSize,
JunctionCalcs, NVC, QualityScoreDistribution, StrandCheck,
annotatedSpliceExonCounts, calcDetailedGeneCounts,
chromCounts, fastqUtils, makeJunctionBed, makeWiggles,
mismatchEngine, overlapMatch, readLengthDistro,
referenceMatch, writeBiotypeCounts, writeClippedNVC,
writeDESeq, writeDEXSeq, writeDocs, writeGeneBody,
writeGeneBodyIv, writeGeneCounts, writeGenewiseGeneBody,
writeJunctionSeqCounts, writeKnownSplices,
writeNovelSplices, writeSpliceExon
Checking first 10000 reads. Checking SAM file for formatting errors...
Stats on the first 10000 reads:
Num Reads Primary Map: 9103
Num Reads Paired-ended: 10000
Num Reads mapped pair: 9098
Num Pair names found: 4797
Num Pairs matched: 4301
Read Seq length: 36 to 90
Unclipped Read length: 36 to 90
Final maxReadLength: 100
maxPhredScore: 41
minPhredScore: 2
NOTE: Read length is not consistent.
In the first 10000 reads, read length varies from 36 to 90 (param maxReadLength=100)
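
Note on the read-length message: the first log in this thread explains that QoRTs prefers unwanted sequence to be replaced with N's rather than hard-clipped, so that read length, quality scores and sequencer cycle positions are preserved. A rough Python sketch of that idea for a single read is shown below. It is only an illustration: `mask_low_quality_tail`, the Phred threshold of 20 and the Phred+33 offset are assumptions, and this is not how Trimmomatic or QoRTs work internally.

```python
def mask_low_quality_tail(seq, qual, min_phred=20, offset=33):
    """Replace the low-quality 3' tail of a read with N's.

    The read keeps its original length and its quality string is
    returned unchanged, so cycle positions stay intact.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    keep = len(seq)
    # Walk back from the 3' end while base quality is below the threshold.
    while keep > 0 and ord(qual[keep - 1]) - offset < min_phred:
        keep -= 1
    masked = seq[:keep] + "N" * (len(seq) - keep)
    return masked, qual
```

For example, a read `ACGTACGT` with qualities `IIIII###` keeps its first five bases and has the last three replaced with N, while the quality string is left as it was.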

hartleys commented Jul 6, 2018 via email

ghost commented Jul 6, 2018 via email

hartleys commented Jul 6, 2018 via email

ghost commented Jul 6, 2018 via email

hartleys commented Jul 7, 2018 via email
