-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix leaking tasks.db #2938
Fix leaking tasks.db #2938
Conversation
/cc @cpuguy83 |
This seems like it makes sense. Has this been tested in the case that a worker is promoted to manager? |
Just saw that failure. Can't reproduce it locally, so I'm not sure what's going on. There shouldn't be any difference in a worker promoted to a manager. The databases for each component are completely separate. |
If I am correct, this pull request won't shrink a huge tasks.db from previous versions. It will just prevent the file from continue growing, is this correct? |
@Thomas131 That is correct. |
cb48309
to
af58e4d
Compare
@Thomas131 I'm not Officially Endorsing this approach, but I've been told that Nothing Breaks if you delete tasks.db. Of course, the Official Advice would be to leave the cluster, delete tasks.db, and then rejoin the cluster as a "new" node. |
af58e4d
to
4b3e5a7
Compare
Codecov Report
@@ Coverage Diff @@
## master #2938 +/- ##
==========================================
+ Coverage 61.71% 61.78% +0.07%
==========================================
Files 142 142
Lines 22994 22998 +4
==========================================
+ Hits 14190 14210 +20
+ Misses 7309 7297 -12
+ Partials 1495 1491 -4 |
I found the race condition. I "fixed" it, but I'm not powerful enough to actually fix it the right way without the whole house of cards tumbling down. |
4b3e5a7
to
b4dd558
Compare
b4dd558
to
8372b96
Compare
8372b96
to
3b3a2ea
Compare
For a long time, it's been a known fault that tasks.db grows out of control. The reason is that this database is only cleaned up, and old tasks removed, during initialization. When an assignment to a worker is removed, previously, the assignment was just marked as removed in the tasks database. However, an assignment is only removed from a worker when the task is removed from the manager. The worker does not need to continue to keep track of the task. Instead of marking a task as no longer assigned, when a task is removed as an assignment, we'll simply delete it from the database. I'm not 100% sure of what all the task database is responsible for, or why it needs to be persisted, so this change is targeted to have the minimal impact on the system. Signed-off-by: Drew Erny <derny@mirantis.com>
3b3a2ea
to
585521d
Compare
One failure; is it a known flaky, or actual related?
|
Known flaky. There are two or three like that. |
Bumps swarmkit vendoring. Includes moby/swarmkit#2938, which fixes tasks.db growing out of control on worker nodes. Signed-off-by: Drew Erny <derny@mirantis.com>
Bumps swarmkit vendoring. Includes moby/swarmkit#2938, which fixes tasks.db growing out of control on worker nodes. Signed-off-by: Drew Erny <derny@mirantis.com> Upstream-commit: 1dbf34f3aab290cb5a9246f54d66d84d59a27cb6 Component: engine
- What I did
For a long time, it's been a known fault that tasks.db grows out of control. The reason is that this database is only cleaned up, and old tasks removed, during initialization.
When an assignment to a worker is removed, previously, the assignment was just marked as removed in the tasks database. However, an assignment is only removed from a worker when the task is removed from the manager. The worker does not need to continue to keep track of the task.
- How I did it
Instead of marking a task as no longer assigned, when a task is removed as an assignment, we'll simply delete it from the database.
I'm not 100% sure of what all the task database is responsible for, or why it needs to be persisted, so this change is targeted to have the minimal impact on the system.
This is sort of like @olljanat's fix in #2917, but instead of running a routine in a loop, we just delete tasks on demand.
- How to test it
Updates the worker test to reflect the new behavior.
- Description for the changelog
Fix longstanding issue with tasks.db growing out of control.