Parent issue: Marathon does not re-use reserved resources for which a lost task is associated #4137
Comments
I suspect #4118 may partially or wholly fix this issue, but the disregard for max-over-capacity is still a problem.
@timcharper thanks for reporting and providing the screencast. tl;dr: top priority on our tech-debt list for 1.2. We're aware of this, and the next work item for @unterstein and me is to provide an implementation for specifying and fixing the task-lost behavior, both for tasks using persistent volumes and for normal tasks. #4118 will not fix this issue on its own, but it contains necessary prerequisites to allow for a clean implementation. The underlying problem is that
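To make "specifying the task-lost behavior" slightly more concrete, here is a minimal Scala sketch of what such a setting could look like; the names and structure are assumptions for illustration, not Marathon's actual API or configuration.

```scala
import scala.concurrent.duration.FiniteDuration

// Hypothetical sketch: illustrates a configurable task-lost behavior.
// These names are assumptions, not Marathon's real types.
sealed trait TaskLostBehavior

object TaskLostBehavior {
  // Keep waiting for the task to come back; the natural choice for resident
  // tasks, whose reservations and persistent volumes should be reused.
  case object WaitForever extends TaskLostBehavior

  // Give up after a timeout and relaunch elsewhere; the natural choice for
  // stateless tasks that hold no reservations or volumes.
  final case class RelaunchAfterTimeout(timeout: FiniteDuration) extends TaskLostBehavior
}
```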
Just in case you're wondering: we're currently organizing part of our work in a closed tracker, which is why there hasn't been an issue for this in GH. We'll change that short-term.
@timcharper is this still an issue or can we close?
@jasongilanfarr it's still an issue.
I can reproduce the issue in 1.4.0-rc1 and will post a video documenting it.
From @meichstedt:
So, consider updating reserved as in
Decided that this is not a release blocker, but it should be fixed soon after the blockers.
Can confirm that
I still need to verify whether this is enough, or whether I still need @meichstedt's patch.
Found and proposed a solution for #5142 while working on this.
Another bug found: #5155
Found and fixed this: #5163
Another one: #5165
Cherry-picked and rebased @meichstedt's patch; it still doesn't work. Will look more tomorrow. https://phabricator.mesosphere.com/D488 should be ready to land.
Summary: Require disabled for resident tasks. Fixes #5163. Partially addresses #4137.
Test Plan: Create a resident task. Make it get lost. Ensure that it doesn't go inactive.
Reviewers: aquamatthias, jdef, meichstedt, jenkins
Reviewed By: aquamatthias, jdef, meichstedt, jenkins
Subscribers: jdef, marathon-team
Differential Revision: https://phabricator.mesosphere.com/D488
Found another one: #5207. The solution proposed here will at least give operators a manual way to recover lost tasks.
With the fix for #5207, operators at least have a valid work-around. The primary (only?) cause of this issue will go away with Mesos 1.2.0, slated for release in a few months, which fixes the problem of agents being assigned a new agent ID on host reboot; this lets Mesos officially declare a task as GONE, which we interpret as a terminal state and which therefore prompts a re-launch. Given the decreased severity thanks to the other fixes, a valid work-around to get resident tasks running, and a planned fix in Mesos 1.2.0, I'm inclined to let this ticket simply be fixed by Mesos 1.2.0.
The kill-while-unreachable approach was ultimately too complex. We tried modifying reconciliation to reconcile with the agent ID, and this did not help. We're going to monitor the offer stream, watch for reservations belonging to unreachable tasks, and map them into terminal Mesos updates.
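A minimal sketch of that idea follows; all type and method names are simplified assumptions, not Marathon's actual internals. The point is only the shape of the logic: when an offer carries a reservation for a task we still consider unreachable, synthesize a terminal state for that task so its reservation and volume become reusable.

```scala
// Simplified sketch of the "watch the offer stream" idea.
// All names here are assumptions, not Marathon's real types or APIs.
final case class Reservation(taskId: String)
final case class Offer(agentId: String, reservations: Seq[Reservation])

trait InstanceTracker {
  def isUnreachable(taskId: String): Boolean
  // Record a synthetic terminal state so the scheduler may reuse the reservation.
  def markTerminal(taskId: String, reason: String): Unit
}

object UnreachableReservationWatcher {
  // If an offer carries a reservation for a task we still consider unreachable,
  // the reservation has evidently outlived its task: treat the task as gone.
  def handleOffer(offer: Offer, tracker: InstanceTracker): Unit =
    offer.reservations
      .map(_.taskId)
      .filter(tracker.isUnreachable)
      .foreach { taskId =>
        tracker.markTerminal(taskId, s"reservation re-offered by agent ${offer.agentId}")
      }
}
```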
Note: This issue has been migrated to https://jira.mesosphere.com/browse/MARATHON-1713. For more information see https://groups.google.com/forum/#!topic/marathon-framework/khtvf-ifnp8.
This is a parent issue to aggregate the handful of sub-issues related to resident tasks.
(A check mark indicates the fix is merged to master. Please see #5206 for the status of the backport to 1.4.)
-- original --
I've recorded a video to show the problem:
http://screencast.com/t/Lkgdi6tIEG6
In effect, Mesos tells Marathon during a reconciliation that a task was lost (this can happen for a variety of reasons; in the occurrence demonstrated here, the mesos-slave ID was forcibly changed and a new ID came up on the same mesos-slave IP address). Marathon responds by reserving a new set of resources and a new persistent volume, and launching a new task.
The expected behavior is that Marathon reuses the reserved resources, which it currently can't because it thinks there is a task running there (status.state == Unknown, judging from the protobuf hexdump in ZooKeeper). And if it can't reuse the reserved resources because it thinks something might still be running, then it should not create additional persistent volumes: when push comes to shove, if it can't satisfy both the 0% over-capacity and 0% under-capacity thresholds, it should heed the 0% over-capacity limit.
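A rough sketch of the decision order being argued for here; all names and the capacity arithmetic are illustrative assumptions, not Marathon's actual scheduler logic.

```scala
// Rough sketch of the expected decision order. All names are assumptions.
final case class AppState(
  targetInstances: Int,
  runningInstances: Int,   // tasks confirmed running
  unknownInstances: Int,   // tasks in state Unknown: might still be running
  idleReservations: Int,   // reservations whose task is known to be terminal
  maxOverCapacity: Double  // e.g. 0.0 for resident tasks
)

sealed trait Action
case object ReuseReservation extends Action       // launch on an existing reservation + volume
case object ReserveAndCreateVolume extends Action // reserve new resources and a new volume
case object DoNothing extends Action

object ResidentTaskScaling {
  def nextAction(app: AppState): Action = {
    // Everything that may still occupy resources counts against the limit.
    val occupied = app.runningInstances + app.unknownInstances
    val limit = math.floor(app.targetInstances * (1 + app.maxOverCapacity)).toInt
    if (app.runningInstances >= app.targetInstances) DoNothing
    else if (app.idleReservations > 0) ReuseReservation // prefer the existing reservation + volume
    else if (occupied < limit) ReserveAndCreateVolume
    else DoNothing // with 0% over capacity, don't stack a new volume next to an Unknown task
  }
}
```

Under this reading, a task whose state is merely Unknown blocks the creation of a new reservation instead of triggering a duplicate volume, and reuse happens only once the old task is confirmed terminal.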