Writing to a BAM file with adamSAMSave consistently fails #721

danvk · 2015-07-01T21:55:53Z

I'm running this code on a YARN cluster. It filters a BAM file down to just those alignments that are either on chr22 or have a mate on chr22.

override def run(args: Arguments, sc: SparkContext): Unit = {
  val filterContig = args.filterContig
  val alignments = sc.loadAlignments(args.reads)
  val matchingAlignments = alignments.filter(matchesContig(_, filterContig))
  matchingAlignments.persist()
  println("Found " + matchingAlignments.count() + " alignments with one pair in " + filterContig)
  matchingAlignments.coalesce(10).adamSAMSave(args.outputPath, asSam = false)
}
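
The `matchesContig` helper isn't shown above. For readers following along, here is a minimal sketch of what such a predicate might look like. It assumes a simplified, hypothetical `Alignment` record with just two optional contig fields; the real ADAM record type has different field names and many more fields, so treat this as an illustration of the filter's logic and not the actual implementation.

```scala
// Hypothetical, simplified stand-in for an ADAM alignment record.
// The field names here are assumptions made for illustration only.
final case class Alignment(
  contig: Option[String],     // contig the read is aligned to, if mapped
  mateContig: Option[String]  // contig the mate is aligned to, if any
)

object ContigFilter {
  // True if either the read itself or its mate lies on the given contig.
  def matchesContig(a: Alignment, filterContig: String): Boolean =
    a.contig.exists(_ == filterContig) || a.mateContig.exists(_ == filterContig)
}

object ContigFilterExample {
  val reads: Seq[Alignment] = Seq(
    Alignment(Some("22"), Some("22")), // read and mate both on 22
    Alignment(Some("1"), Some("22")),  // only the mate is on 22
    Alignment(Some("1"), Some("2")),   // neither is on 22
    Alignment(None, None)              // unmapped read, no mate information
  )

  // Same shape as the filter call in run(), applied to a local Seq instead of an RDD.
  val matching: Seq[Alignment] = reads.filter(ContigFilter.matchesContig(_, "22"))
}
```

With a predicate of this shape, a read is kept when its mate is on the target contig even if the read itself is elsewhere, which matches the "on chr22 or have a mate on chr22" description.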

I'm consistently getting this error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 5.0 failed 4 times, most recent failure: Lost task 5.3 in stage 5.0 (TID 706, demeter-csmaz08-10.demeter.hpc.mssm.edu): java.lang.AssertionError: assertion failed: Cannot return header if not attached.

My command line is this:

spark-submit \
  --master yarn \
  --deploy-mode client \
  --executor-memory 16g \
  --driver-memory 10g \
  --num-executors 1000 \
  --executor-cores 1 \
  --driver-java-options "-Dyarn.resourcemanager.am.max-attempts=1 -Dlog4j.configuration=scripts/log4j.properties" \
  --class org.hammerlab.guacamole.Guacamole \
  --verbose \
  target/guacamole-with-dependencies-0.0.1-SNAPSHOT.jar \
  structural-variant \
  --reads hdfs:///datasets/dream/data/synthetic-challenge-4/synthetic.challenge.set4.tumour.bam \
  --filter-contig 22 \
  --out hdfs:///user/vanded03/synth4.tumor.chr22+mate.bam

(The input is from the DREAM challenge.)

Would this be expected to work? cc @ryan-williams

arahuja · 2015-07-06T14:52:43Z

I was seeing the same issue in #676, which was apparently fixed, but I haven't checked since.

danvk · 2015-07-06T17:39:58Z

@arahuja I believe that issue was specifically when you used .coalesce(1). I ran out of memory when I tried that, so I'm using .coalesce(10) and running into this issue.

ryan-williams · 2016-05-18T19:50:49Z

Closing as a ~dupe of #676

fnothaft mentioned this issue Jan 12, 2016: BAM header is not getting set on partition 0 with headerless BAM output format #916 (Closed)

ryan-williams mentioned this issue Jan 14, 2016: [ADAM-916] New strategy for writing header. #917 (Merged)

ryan-williams closed this as completed May 18, 2016
