
Conversation

@droberts195

This change introduces a new setting,
xpack.ml.process_connect_timeout, which allows the
timeout for an external ML process to connect to
the ES JVM to be increased.

The default timeout of 10 seconds is the same
as the hardcoded timeout in previous versions.

The timeout may need to be increased if many
processes are being started simultaneously on
the same machine. This is unlikely in clusters
with many ML nodes, as we balance the processes
across the ML nodes, but can happen in clusters
with a single ML node and a high value for
xpack.ml.node_concurrent_job_allocations.
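For context, here is a minimal sketch of how a dynamic time setting like this is typically declared in the Elasticsearch codebase. The holder class and field name below are illustrative assumptions rather than the PR's actual code; only the setting key and the 10s default come from the description above.

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;
import org.elasticsearch.common.unit.TimeValue;

public final class MlProcessSettings {

    // Hypothetical holder class and field name (not the PR's actual code).
    // Dynamic   => the value can be updated at runtime via the cluster settings API.
    // NodeScope => the value is read from node/cluster settings, not index settings.
    public static final Setting<TimeValue> PROCESS_CONNECT_TIMEOUT =
        Setting.timeSetting(
            "xpack.ml.process_connect_timeout",
            TimeValue.timeValueSeconds(10),   // same default as the previously hardcoded timeout
            Property.Dynamic,
            Property.NodeScope);

    private MlProcessSettings() {}
}
```

Because the setting is dynamic, it could also be raised on a running cluster, for example with `PUT _cluster/settings` and a persistent value such as `"xpack.ml.process_connect_timeout": "20s"`, rather than editing elasticsearch.yml and restarting.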

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@droberts195
Author

droberts195 commented Jun 14, 2019

I deliberately didn't use the configurable timeout for the controller process: at the point that's started we cannot possibly be starting lots of other processes simultaneously, so contention should not be a problem, and it's best that we don't wait a long time before failing the startup of Elasticsearch.
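To make that distinction concrete, a rough sketch using the illustrative names from the earlier snippet (again, not the PR's actual classes or methods):

```java
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.unit.TimeValue;

final class ProcessConnectTimeouts {

    // The controller is started once, during node startup, when no other ML
    // processes can be starting at the same time, so a short fixed timeout
    // keeps a failed Elasticsearch startup from hanging for long.
    static final TimeValue CONTROLLER_CONNECT_TIMEOUT = TimeValue.timeValueSeconds(10);

    // Job processes may be started many at a time on one machine, so their
    // connect timeout comes from the configurable setting instead.
    static TimeValue jobProcessConnectTimeout(Settings settings) {
        return MlProcessSettings.PROCESS_CONNECT_TIMEOUT.get(settings);
    }
}
```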

such an external process.

`xpack.ml.process_connect_timeout` (<<cluster-update-settings,Dynamic>>)::
Some {ml} processing is done by processes that run separately to the {es} JVM.
Contributor


It would be nice to have a brief introductory explanation here, akin to what we have in other timeout definitions.
e.g.

The connect(ion?) timeout for {ml} processes that run separately from the {es} JVM. Defaults to 10s. When such processes...

Contributor

@dimitris-athanasiou left a comment


LGTM

@droberts195
Author

Jenkins run elasticsearch-ci/1

droberts195 merged commit 76ad7d8 into elastic:master Jun 25, 2019
droberts195 deleted the make_connect_timeout_a_setting branch June 25, 2019 15:36
droberts195 pushed a commit that referenced this pull request Jun 26, 2019
droberts195 pushed a commit that referenced this pull request Jun 26, 2019
droberts195 pushed a commit that referenced this pull request Jun 26, 2019
