DiscoverVariants runtime varies widely between repeated runs with the same args. #231

Open
xubo245 opened this issue Apr 15, 2017 · 4 comments

Comments

@xubo245

xubo245 commented Apr 15, 2017

I use DiscoverVariants (org.bdgenomics.avocado.cli.DiscoverVariants) to discover variants.
The data is 8 million PE reads (generated by wgsim).

However, the runtime differs greatly between repeated runs with the same args.

I have tried this many times.

code:

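    // Wall-clock timer covering context creation, variant discovery, and shutdown.
    val startTime = System.currentTimeMillis()
    val sam = args(0) // input alignments path
    val out = args(1) // output path for the discovered variants
    val appArgs = "sam:" + sam + "\tout:" + out
    val sc = new SparkContext(conf) // conf is a SparkConf defined elsewhere
    val ac = new ADAMContext(sc) // not used below
    DiscoverVariants(Array(sam, out)).run(sc)
    sc.stop()
    val stopTime = System.currentTimeMillis()
    // Report the elapsed time in seconds.
    println(appArgs + "\ttime:\t" + (stopTime - startTime) / 1000.0 + "\t")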

time:

hadoop@Master:~/disk2/xubo/project/callVariant/GCDSS$ tail -f discoverVariantDiffNumTesttime201704152123.txt 
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI6.adam	time:	902.211	
Apr 15, 2017 9:24:03 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:38:57 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI7.adam	time:	325.683	
Apr 15, 2017 9:39:08 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:44:26 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5

@fnothaft
Member

Hi @xubo245 !

Sorry for the slow reply; I hadn't seen this issue when it came in! Are you running this locally? If you run 5 times, what do the runtimes look like? It is possible that you're seeing a warmup phenomenon (e.g., file system buffering).
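
One way to check for a warmup effect is to time several back-to-back runs and see whether only the first is slow. The sketch below is a minimal illustration in plain Scala: `RepeatTiming`, `timeRuns`, `looksLikeWarmup`, and the factor of 2 are hypothetical names and thresholds chosen here, not part of avocado or Spark.

```scala
object RepeatTiming {
  /** Runs `body` n times (passing the 1-based run index) and returns each run's wall-clock seconds. */
  def timeRuns(n: Int)(body: Int => Unit): Seq[Double] =
    (1 to n).map { i =>
      val start = System.nanoTime()
      body(i)
      (System.nanoTime() - start) / 1e9
    }

  /**
   * Heuristic (an assumption, not a standard test): a warmup pattern means the
   * later runs cluster tightly while the first run is much slower than all of them.
   */
  def looksLikeWarmup(seconds: Seq[Double], factor: Double = 2.0): Boolean =
    if (seconds.size < 2) false
    else {
      val rest = seconds.tail
      val restStable = rest.max <= factor * rest.min // later runs agree with each other
      val firstSlow = seconds.head > factor * rest.min // first run stands out
      restStable && firstSlow
    }
}
```

In the driver code above, the body passed to `timeRuns` would wrap the `DiscoverVariants(...).run(sc)` call with a distinct output path per run. If the slow runs show up in the middle or at the end of the sequence rather than at the start, warmup alone is unlikely to explain them.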

@xubo245
Author

xubo245 commented Apr 23, 2017

I run avocado on a cluster with Spark standalone.

Run 5 times:

sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam      time:   308.092
Apr 15, 2017 9:24:20 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:29:19 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam      time:   295.197
Apr 15, 2017 9:29:30 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:34:17 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam      time:   296.947
Apr 15, 2017 9:34:29 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:39:17 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam      time:   301.25
Apr 15, 2017 9:39:29 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:44:22 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam      time:   935.559
Apr 15, 2017 9:44:33 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:00:01 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam      time:   1019.173
Apr 15, 2017 10:00:12 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:17:03 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam      time:   1043.837
Apr 15, 2017 10:17:14 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:34:30 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam      time:   331.459
Apr 15, 2017 10:34:41 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:40:04 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam      time:   1034.815
Apr 15, 2017 10:40:16 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:57:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam      time:   1034.985
Apr 15, 2017 10:57:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:14:41 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam      time:   1142.983
Apr 15, 2017 11:14:53 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:33:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam      time:   1148.81
Apr 15, 2017 11:33:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:52:59 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam      time:   1161.003
Apr 15, 2017 11:53:10 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 12:12:23 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam      time:   1122.803
Apr 15, 2017 12:12:35 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 12:31:09 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam       out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam      time:   1139.485
Apr 15, 2017 12:31:21 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32

@xubo245
Author

xubo245 commented Apr 23, 2017

I ran it 20 times:

sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI11.adam	time:	611.29	
Apr 16, 2017 12:07:07 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:17:09 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI12.adam	time:	1041.574	
Apr 16, 2017 12:17:21 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:34:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI13.adam	time:	1046.914	
Apr 16, 2017 12:34:45 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:52:04 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI14.adam	time:	978.584	
Apr 16, 2017 12:52:16 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:08:26 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI15.adam	time:	1017.985	
Apr 16, 2017 1:08:37 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:25:25 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI16.adam	time:	1037.953	
Apr 16, 2017 1:25:38 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:42:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI17.adam	time:	1026.294	
Apr 16, 2017 1:42:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:59:57 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI18.adam	time:	1000.112	
Apr 16, 2017 2:00:08 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:16:40 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI19.adam	time:	1031.265	
Apr 16, 2017 2:16:52 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:33:54 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI20.adam	time:	1033.526	
Apr 16, 2017 2:34:05 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:51:10 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI21.adam	time:	335.346	
Apr 16, 2017 2:51:21 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:56:48 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI22.adam	time:	333.994	
Apr 16, 2017 2:57:00 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:02:25 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI23.adam	time:	1011.96	
Apr 16, 2017 3:02:37 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:19:20 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI24.adam	time:	1006.177	
Apr 16, 2017 3:19:32 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:36:10 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI25.adam	time:	1038.076	
Apr 16, 2017 3:36:22 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:53:31 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI26.adam	time:	1030.243	
Apr 16, 2017 3:53:42 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:10:44 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI27.adam	time:	1033.402	
Apr 16, 2017 4:10:56 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:28:01 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI28.adam	time:	1017.483	
Apr 16, 2017 4:28:13 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:45:02 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI29.adam	time:	1007.373	
Apr 16, 2017 4:45:14 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:01:53 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI30.adam	time:	902.883	
Apr 16, 2017 5:02:04 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:16:59 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI11.adam	time:	1116.18	
Apr 16, 2017 5:17:11 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:35:38 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI12.adam	time:	1086.454	
Apr 16, 2017 5:35:49 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:53:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI13.adam	time:	1109.689	
Apr 16, 2017 5:53:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:12:20 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI14.adam	time:	1130.608	
Apr 16, 2017 6:12:31 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:31:14 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI15.adam	time:	1146.735	
Apr 16, 2017 6:31:25 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:50:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI16.adam	time:	1141.368	
Apr 16, 2017 6:50:35 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:09:28 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI17.adam	time:	1136.241	
Apr 16, 2017 7:09:39 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:28:27 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI18.adam	time:	1144.389	
Apr 16, 2017 7:28:38 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:47:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI19.adam	time:	1138.622	
Apr 16, 2017 7:47:46 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:06:36 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI20.adam	time:	1119.333	
Apr 16, 2017 8:06:48 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:25:18 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI21.adam	time:	360.353	
Apr 16, 2017 8:25:30 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:31:22 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI22.adam	time:	1101.976	
Apr 16, 2017 8:31:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:49:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI23.adam	time:	1183.02	
Apr 16, 2017 8:49:58 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:09:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI24.adam	time:	1088.011	
Apr 16, 2017 9:09:45 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:27:45 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI25.adam	time:	1115.471	
Apr 16, 2017 9:27:56 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:46:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI26.adam	time:	1134.819	
Apr 16, 2017 9:46:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:05:21 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI27.adam	time:	1127.239	
Apr 16, 2017 10:05:32 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:24:11 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI28.adam	time:	1122.376	
Apr 16, 2017 10:24:23 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:42:56 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI29.adam	time:	369.243	
Apr 16, 2017 10:43:08 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:49:09 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam	out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI30.adam	time:	1133.132	
Apr 16, 2017 10:49:20 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 11:08:05 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5

@fnothaft
Member

Oh yeah, those times are really all over the map! Do you have the job history server enabled on your Spark cluster? If so, I am wondering if you have any failed tasks during the slower runs?
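
To match slow runs against the job history, it may help to first pull the per-run times out of the log and list the outliers. The following is a rough sketch in plain Scala, assuming the `out:` and `time:` fields appear exactly as printed by the driver code in this thread; `RuntimeLog`, `parse`, `slowRuns`, and the factor of 2 are hypothetical.

```scala
object RuntimeLog {
  // Field patterns for the driver's summary line; separators may be tabs or spaces.
  private val OutField = """out:(\S+)""".r
  private val TimeField = """time:\s+([0-9]+(?:\.[0-9]+)?)""".r

  /** Extracts (output path, seconds) from every line that carries both fields. */
  def parse(lines: Seq[String]): Seq[(String, Double)] =
    lines.flatMap { line =>
      for {
        out <- OutField.findFirstMatchIn(line).map(_.group(1))
        secs <- TimeField.findFirstMatchIn(line).map(_.group(1).toDouble)
      } yield (out, secs)
    }

  /** Runs that took more than `factor` times as long as the fastest run. */
  def slowRuns(runs: Seq[(String, Double)], factor: Double = 2.0): Seq[(String, Double)] =
    if (runs.isEmpty) Seq.empty
    else {
      val fastest = runs.map(_._2).min
      runs.filter(_._2 > factor * fastest)
    }
}
```

Because each run writes to a distinct output path, the paths returned by `slowRuns` identify which applications to inspect for failed or retried tasks. Runs on different inputs should be grouped before comparison, since the inputs in this thread differ in size.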
