This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Workaround required for unreachable resident tasks #5284

Closed
timcharper opened this issue Mar 1, 2017 · 2 comments

@timcharper
Contributor

Presently, when an agent reboots, it comes up with a new ID. This is a problem because Mesos will never be able to report a LOST task as gone, so the instance stays in the Unreachable state forever, even though its associated reservation is reoffered. We expect this to be fixed by https://issues.apache.org/jira/browse/MESOS-6223, which is slated for release with Mesos 1.3.0.

Until then, we plan to monitor the offer stream, watch for reservations, and check the state of the associated instance. If the instance state is Unreachable, we will emit a terminal TASK_GONE Mesos update so that the task can transition appropriately.
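For illustration only, the check described above could be sketched roughly like this. This is not Marathon's actual code: the `Offer`, `Reservation`, `StatusUpdate`, and `InstanceState` types and the `synthesize_gone_updates` function are hypothetical names, and the sketch assumes each reservation carries the ID of the instance that owns it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class InstanceState(Enum):
    # Hypothetical subset of instance states, enough for this sketch.
    RUNNING = "Running"
    UNREACHABLE = "Unreachable"


@dataclass(frozen=True)
class Reservation:
    # Assumption: the reservation is labeled with the owning instance's ID.
    instance_id: str


@dataclass(frozen=True)
class Offer:
    offer_id: str
    agent_id: str
    reservations: List[Reservation]


@dataclass(frozen=True)
class StatusUpdate:
    instance_id: str
    state: str


def synthesize_gone_updates(
    offers: Iterable[Offer],
    instance_states: Dict[str, InstanceState],
) -> List[StatusUpdate]:
    """Return one terminal TASK_GONE update per Unreachable instance
    whose reservation shows up again in the offer stream."""
    updates: List[StatusUpdate] = []
    already_handled = set()
    for offer in offers:
        for reservation in offer.reservations:
            instance_id = reservation.instance_id
            if instance_id in already_handled:
                continue  # emit at most one update per instance
            # A reoffered reservation means the old task cannot still be
            # running; only act if the instance is stuck in Unreachable.
            if instance_states.get(instance_id) is InstanceState.UNREACHABLE:
                already_handled.add(instance_id)
                updates.append(StatusUpdate(instance_id, "TASK_GONE"))
    return updates
```

Instances in any other state, or reservations with no known instance, are left alone, so the synthesized update only affects tasks that are stuck in Unreachable.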

@timcharper
Contributor Author

https://phabricator.mesosphere.com/D566 submitted; I've manually confirmed that it works in a local cluster setup.

@meichstedt
Contributor

Note: This issue has been migrated to https://jira.mesosphere.com/browse/MARATHON-2311. For more information see https://groups.google.com/forum/#!topic/marathon-framework/khtvf-ifnp8.

@d2iq-archive d2iq-archive locked and limited conversation to collaborators Mar 27, 2017