
Possible bug while running a lot of batch ingestions. InterruptedException: sleep interrupted #6081

Closed
fernandosoto138 opened this issue Jul 31, 2018 · 3 comments

@fernandosoto138

Hi, I'm trying to load 100 batch files of 1.5 GB each, with 20 million records in each one.

My setup is the following; each bullet is a different server:

  • ZooKeeper
  • Druid Broker
  • Druid Coordinator + Overlord
  • PostgreSQL + Historical + MiddleManager (with 3 workers); this one has 6 cores and 32 GB of RAM
  • Hadoop NameNode
  • Five Hadoop DataNodes with 200 GB each

Almost all of the tasks finish with SUCCESS, but some random tasks throw the following message:

task[HadoopIndexTask{id=index_hadoop_sales2_2018-07-30T20:35:48.876Z, type=index_hadoop, dataSource=sales2}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:222) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:238) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.1.jar:0.12.1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.1.jar:0.12.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:219) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	... 7 more
Caused by: java.lang.RuntimeException: java.lang.InterruptedException: sleep interrupted
	at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:229) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:293) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:219) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	... 7 more
Caused by: java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_181]
	at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1353) ~[?:?]
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1311) ~[?:?]
	at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:212) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:369) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:293) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_181]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:219) ~[druid-indexing-service-0.12.1.jar:0.12.1]
	... 7 more
2018-07-31T06:59:15,123 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_sales2_2018-07-30T20:35:48.876Z] status changed to [FAILED].
2018-07-31T06:59:15,139 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_sales2_2018-07-30T20:35:48.876Z",
  "status" : "FAILED",
  "duration" : 589783
}

Is this an issue caused by the load average? It's oscillating between 8 and 11.

If this is a bug, is there a way to re-run the failed tasks automatically?
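
As a workaround I could probably script the retry myself against the Overlord HTTP API. Here is a minimal sketch of what I have in mind, assuming the standard `/druid/indexer/v1/task` submission endpoint and `/druid/indexer/v1/task/{id}/status` status endpoint on the default Overlord port; the host name and the task spec file name are made up, and the status field name may differ between Druid versions:

```python
# Sketch of a submit-and-retry loop against the Druid Overlord HTTP API.
# Assumptions: Overlord reachable at OVERLORD below, standard
# /druid/indexer/v1/task endpoints, and a local JSON task spec file.
import json
import time
import urllib.request

OVERLORD = "http://overlord-host:8090"      # assumption: default Overlord port
TASK_SPEC_FILE = "hadoop_index_task.json"   # hypothetical path to the task spec
MAX_RETRIES = 3
POLL_SECONDS = 60


def submit_task(spec_bytes):
    """POST the task spec to the Overlord and return the new task id."""
    req = urllib.request.Request(
        OVERLORD + "/druid/indexer/v1/task",
        data=spec_bytes,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["task"]


def task_status(task_id):
    """Return the task state string (e.g. RUNNING, SUCCESS, FAILED)."""
    url = OVERLORD + "/druid/indexer/v1/task/" + task_id + "/status"
    with urllib.request.urlopen(url) as resp:
        status = json.load(resp)["status"]
        # 0.12.x reports the state under "status"; newer versions use "statusCode"
        return status.get("statusCode", status.get("status"))


def run_with_retries():
    spec = open(TASK_SPEC_FILE, "rb").read()
    for attempt in range(1, MAX_RETRIES + 1):
        task_id = submit_task(spec)
        print("attempt %d: submitted %s" % (attempt, task_id))
        while True:
            state = task_status(task_id)
            if state == "SUCCESS":
                return task_id
            if state == "FAILED":
                break                       # resubmit on the next attempt
            time.sleep(POLL_SECONDS)        # still running, keep polling
    raise RuntimeError("task still FAILED after %d attempts" % MAX_RETRIES)


if __name__ == "__main__":
    run_with_retries()
```

But it would be better to know whether Druid itself can do this, or whether these random failures point to a real bug.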

@stale

stale bot commented Jun 20, 2019

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale bot added the stale label Jun 20, 2019
@gianm removed the stale label Jul 4, 2019
@stale

stale bot commented Apr 9, 2020

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale bot added the stale label Apr 9, 2020
@stale

stale bot commented May 7, 2020

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
