This repository has been archived by the owner on Oct 13, 2023. It is now read-only.
forked from moby/moby
[18.09] backport "daemon.ContainerLogs(): fix resource leak on follow" #48
Merged
andrewhsu
merged 7 commits into
docker-archive:18.09
from
kolyshkin:18.09-backport-logs-follow
Sep 13, 2018
In case of errors, the file descriptor is never closed. Fix it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 88bcf15)
There is no need to wait for up to 200ms in order to close the file descriptor once the chClose is received.

This commit might reduce the chances for occasional "The process cannot access the file because it is being used by another process" error on Windows, where an opened file can't be removed.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit dfbb64e)
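The idea of reacting to chClose immediately can be sketched as follows. This is a minimal illustration, not the poller's actual code: `pollUntilClosed`, `closesPromptly`, and `pollInterval` are hypothetical names, and the only detail taken from the commit message is the 200ms figure. The loop waits on a timer and on the close channel in the same `select`, so a close request is not delayed by the remainder of the polling interval.

```go
package main

import (
	"fmt"
	"time"
)

// pollInterval mirrors the 200ms figure from the commit message.
const pollInterval = 200 * time.Millisecond

// pollUntilClosed is a hypothetical polling loop. It calls poll once per
// interval and returns as soon as chClose is closed, without first
// sleeping out the rest of the interval.
func pollUntilClosed(chClose <-chan struct{}, poll func()) {
	timer := time.NewTimer(pollInterval)
	defer timer.Stop()
	for {
		select {
		case <-timer.C:
			poll()
			timer.Reset(pollInterval)
		case <-chClose:
			// Any cleanup (such as closing a file) can run right away.
			return
		}
	}
}

// closesPromptly reports whether the loop returned in less than one
// polling interval after chClose was closed.
func closesPromptly() bool {
	chClose := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		pollUntilClosed(chClose, func() {})
	}()
	start := time.Now()
	close(chClose)
	<-done
	return time.Since(start) < pollInterval
}

func main() {
	fmt.Println("returned within one interval:", closesPromptly())
}
```

A loop that sleeps first and checks the close channel afterwards would hold the file open for up to one extra interval, which is the delay the commit message describes.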
The code in Close() that removes the watches was not working, because it first sets `w.closed = true` and then calls w.close(), which starts with

```
if w.closed {
	return errPollerClosed
}
```

Fix by setting w.closed only after calling w.remove() for all the files being watched.

While at it, remove the duplicated `delete(w.watches, name)` code.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit fffa895)
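The ordering problem can be reproduced with a stripped-down stand-in for the poller. This is a sketch under assumptions: the `poller` type, `newPoller`, and `closeBuggy` are hypothetical, locking is omitted, and the helper is called `remove` following the fix description above. Only the `closed` flag, the `watches` map, and `errPollerClosed` are names taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

var errPollerClosed = errors.New("poller is closed")

// poller is a simplified, hypothetical stand-in for the file poller.
type poller struct {
	watches map[string]chan struct{}
	closed  bool
}

func newPoller(names ...string) *poller {
	w := &poller{watches: make(map[string]chan struct{})}
	for _, n := range names {
		w.watches[n] = make(chan struct{})
	}
	return w
}

// remove stops watching name. It refuses to work on a closed poller.
func (w *poller) remove(name string) error {
	if w.closed {
		return errPollerClosed
	}
	ch, ok := w.watches[name]
	if !ok {
		return fmt.Errorf("no watch for %s", name)
	}
	close(ch)
	delete(w.watches, name)
	return nil
}

// closeBuggy sets the flag first, so every remove call bails out early
// and no watch is actually removed.
func (w *poller) closeBuggy() {
	w.closed = true
	for name := range w.watches {
		_ = w.remove(name) // always errPollerClosed
	}
}

// Close removes all watches first and only then marks the poller closed.
func (w *poller) Close() error {
	if w.closed {
		return nil
	}
	for name := range w.watches {
		_ = w.remove(name)
	}
	w.closed = true
	return nil
}

func main() {
	buggy := newPoller("a.log", "b.log")
	buggy.closeBuggy()
	fmt.Println("buggy order, watches left:", len(buggy.watches)) // 2

	fixed := newPoller("a.log", "b.log")
	fixed.Close()
	fmt.Println("fixed order, watches left:", len(fixed.watches)) // 0
}
```

Because `remove` already deletes the map entry, `Close` needs no `delete` call of its own, which matches the removal of the duplicated `delete(w.watches, name)` mentioned above.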
This code has many return statements; for some of them the "end logs" or "end stream" message was not printed, giving the impression that this "for" loop never ended.

Make sure that "begin logs" is always followed by "end logs".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 2e4c2a6)
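One common way to pair a "begin" message with an "end" message across many return paths is a `defer`. The sketch below illustrates that pattern only; whether the commit uses `defer` or adds the message to each return is not stated here, and `streamLogs` and `trace` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// streamLogs is a hypothetical stand-in for a streaming loop with
// several exits. The deferred call runs on every return path.
func streamLogs(msgs []string, logf func(string)) error {
	logf("begin logs")
	defer logf("end logs")

	for _, m := range msgs {
		if m == "" {
			return errors.New("empty log message") // early exit
		}
		logf("msg: " + m)
	}
	return nil // normal exit
}

// trace runs streamLogs and returns the debug lines it emitted.
func trace(msgs []string) []string {
	var lines []string
	_ = streamLogs(msgs, func(s string) { lines = append(lines, s) })
	return lines
}

func main() {
	// The second message triggers the early return, yet "end logs"
	// is still the last line printed.
	for _, l := range trace([]string{"one", ""}) {
		fmt.Println(l)
	}
}
```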
This test case checks that followLogs() exits once the reader is gone. Currently it does not (i.e. this test is supposed to fail) due to moby#37391.

[kolyshkin@: test case by Brian Goff, changelog and all bugs are by me]

Source: https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit d37a11b)
When daemon.ContainerLogs() is called with options.follow=true (as in "docker logs --follow"), the "loggerutils.followLogs()" function never returns (even when the logs consumer is gone). As a result, all the resources associated with it (including an opened file descriptor for the log file being read, two FDs for a pipe, and two FDs for inotify watch) are never released.

If this is repeated (such as by running "docker logs --follow" and pressing Ctrl-C a few times), this results in DoS caused by either hitting the limit of inotify watches, or the limit of opened files. The only cure is daemon restart.

Apparently, what happens is:

1. Logs producer (a container) is gone, calling (*LogWatcher).Close() for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).

2. WatchClose() is properly handled by a dedicated goroutine in followLogs(), cancelling the context.

3. Upon receiving the ctx.Done(), the code in followLogs() (daemon/logger/loggerutils/logfile.go#L626-L638) keeps sending messages _synchronously_ (which is OK for now).

4. Logs consumer is gone (Ctrl-C is pressed on a terminal running "docker logs --follow"). Method (*LogWatcher).Close() is properly called (see daemon/logs.go:114). Since it was called before, and due to once.Do(), nothing happens (which is kinda good, as otherwise it would panic on closing a closed channel).

5. A goroutine (see item 3 above) keeps sending log messages synchronously to the logWatcher.Msg channel. Since the channel reader is gone, the channel send operation blocks forever, and resource cleanup set up in defer statements at the beginning of followLogs() never happens.

Alas, the fix is somewhat complicated:

1. Distinguish between close from logs producer and logs consumer. To that effect,
   - yet another channel is added to LogWatcher();
   - {Watch,}Close() are renamed to {Watch,}ProducerGone();
   - {Watch,}ConsumerGone() are added;

   *NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed in order to stop ConsumerLogs(follow=true) when a container is stopped; otherwise we're not interested in it. In other words, we're only using it in followLogs().

2. Code that was doing (*LogWatcher).Close() is modified to either call ProducerGone() or ConsumerGone(), depending on the context.

3. Code that was waiting for WatchClose() is modified to wait for either ConsumerGone() or ProducerGone(), or both, depending on the context.

4. followLogs() is modified accordingly:
   - context cancellation is happening on WatchProducerGone(), and once it's received the FileWatcher is closed and waitRead() returns errDone on EOF (i.e. log rotation handling logic is disabled);
   - due to this, code that was writing synchronously to logWatcher.Msg can be and is removed as the code above it handles this case;
   - function returns once ConsumerGone is received, freeing all the resources -- this is the bugfix itself.

While at it,

1. Let's also remove the ctx usage to simplify the code a bit. It was introduced by commit a69a59f ("Decouple removing the fileWatcher from reading") in order to fix a bug. The bug was actually a deadlock in fsnotify, and the fix was just a workaround. Since then the fsnotify bug has been fixed, and a new fsnotify was vendored in. For more details, please see moby#27782 (comment)

2. Since `(*filePoller).Close()` is fixed to remove all the files being watched, there is no need to explicitly call fileWatcher.Remove(name) anymore, so get rid of the extra code.

Should fix moby#37391

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 916eabd)
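The core of the fix, separate "gone" signals plus a send that can be abandoned, can be sketched roughly as below. This is an illustration under assumptions rather than the patched code: `logWatcher`, `follow`, and `deliveredBeforeConsumerLeft` are hypothetical simplifications, messages are plain strings, and the real followLogs() also handles file watching and rotation. The method names follow the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// logWatcher is a simplified, hypothetical stand-in for LogWatcher with
// one "gone" channel per side, each guarded by its own sync.Once.
type logWatcher struct {
	Msg          chan string
	producerOnce sync.Once
	producerGone chan struct{}
	consumerOnce sync.Once
	consumerGone chan struct{}
}

func newLogWatcher() *logWatcher {
	return &logWatcher{
		Msg:          make(chan string),
		producerGone: make(chan struct{}),
		consumerGone: make(chan struct{}),
	}
}

// ProducerGone signals that the container (log producer) is gone.
func (w *logWatcher) ProducerGone() {
	w.producerOnce.Do(func() { close(w.producerGone) })
}

func (w *logWatcher) WatchProducerGone() <-chan struct{} { return w.producerGone }

// ConsumerGone signals that the log reader is gone.
func (w *logWatcher) ConsumerGone() {
	w.consumerOnce.Do(func() { close(w.consumerGone) })
}

func (w *logWatcher) WatchConsumerGone() <-chan struct{} { return w.consumerGone }

// follow delivers msgs one by one and returns the number delivered.
// Each send races against WatchConsumerGone, so it cannot block forever
// once the reader has left, and the caller's deferred cleanup can run.
func follow(w *logWatcher, msgs []string) int {
	sent := 0
	for _, m := range msgs {
		select {
		case w.Msg <- m:
			sent++
		case <-w.WatchConsumerGone():
			return sent
		}
	}
	return sent
}

// deliveredBeforeConsumerLeft reads one message, then signals that the
// consumer is gone (think Ctrl-C), and reports how many were delivered.
func deliveredBeforeConsumerLeft() int {
	w := newLogWatcher()
	result := make(chan int)
	go func() { result <- follow(w, []string{"a", "b", "c"}) }()
	<-w.Msg
	w.ConsumerGone()
	w.ConsumerGone() // a second call is a no-op thanks to sync.Once
	return <-result
}

func main() {
	fmt.Println("delivered:", deliveredBeforeConsumerLeft()) // 1
}
```

With a single shared close channel, the second Close() call in step 4 of the failure scenario is swallowed by once.Do() and the sender never learns that the reader left; giving each side its own channel removes that ambiguity.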
This should test that
 - all the messages produced are delivered (i.e. not lost)
 - followLogs() exits

Loosely based on the test having the same name by Brian Goff, see https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f845d76)
thaJeztah
approved these changes
Sep 10, 2018
LGTM
andrewhsu
approved these changes
Sep 13, 2018
LGTM
Backports of moby#37734 and moby#37576
Clean cherry-pick, no issues.