[ML] DF Analytics times out after 30 minutes #45723

@dimitris-athanasiou

If the analytics process takes more than 30 minutes to complete, the task times out.

The logs show the following:

[2019-08-14T00:03:27,523][WARN ][o.e.x.m.d.p.AnalyticsResultProcessor] [reba.attlocal.net] [allstate-train] Timeout waiting for results processor to complete
[2019-08-14T00:03:27,527][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [reba.attlocal.net] [allstate-train] Result processor has completed
[2019-08-14T00:03:27,527][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [reba.attlocal.net] [allstate-train] Closing process
[2019-08-14T00:03:32,534][WARN ][o.e.x.m.p.AbstractNativeProcess] [reba.attlocal.net] [allstate-train] Exception closing the running analytics process
java.util.concurrent.TimeoutException: null
    at java.util.concurrent.FutureTask.get(FutureTask.java:204) ~[?:?]
    at org.elasticsearch.xpack.ml.process.AbstractNativeProcess.close(AbstractNativeProcess.java:163) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.closeProcess(AnalyticsProcessManager.java:195) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.processData(AnalyticsProcessManager.java:104) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.xpack.ml.dataframe.process.AnalyticsProcessManager.lambda$runJob$1(AnalyticsProcessManager.java:69) [x-pack-ml-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-08-14T00:03:32,550][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [reba.attlocal.net] [allstate-train] Closed process
[2019-08-14T00:03:32,550][INFO ][o.e.x.m.d.p.AnalyticsProcessManager] [reba.attlocal.net] [allstate-train] Marking task completed

The reason is a misplaced timeout in the results processor.
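
To illustrate how a timeout in the wrong place can produce this behaviour, here is a minimal sketch. It is not the actual Elasticsearch code: the class `ResultsWaitSketch` and all of its methods are hypothetical, and it only assumes that the caller waits on some completion signal from the results processor. If that wait is capped at a fixed duration, a job that legitimately runs longer is treated as finished and the process gets closed underneath it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of waiting for a results processor; names are illustrative only. */
public class ResultsWaitSketch {

    // Released once the (hypothetical) results-reading thread has seen the end of the output.
    private final CountDownLatch completionLatch = new CountDownLatch(1);

    /** Called by the results-reading thread when processing has finished. */
    public void markCompleted() {
        completionLatch.countDown();
    }

    /**
     * Problematic pattern: the wait is capped at a fixed duration, so the cap
     * effectively limits the total run time of the job.
     *
     * @return false if the timeout elapsed before completion was signalled
     */
    public boolean awaitWithFixedCap(long timeout, TimeUnit unit) throws InterruptedException {
        return completionLatch.await(timeout, unit);
    }

    /** Alternative: block until completion is signalled, however long the job runs. */
    public void awaitUntilDone() throws InterruptedException {
        completionLatch.await();
    }
}
```

With the capped variant, a caller that proceeds to close the process after a `false` return would match the sequence in the logs above: a timeout warning, followed by an attempt to close a process that is still running. Whether the actual fix removes the cap or moves it elsewhere is not stated in this issue.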

Metadata

Labels

:ml (Machine learning), >bug
