
Overly-aggressive mutex locks #1122

Closed

markphelps opened this issue Mar 30, 2020 · 5 comments

Comments

@markphelps
Collaborator

markphelps commented Mar 30, 2020

Mutex locking added in #1047 is too aggressive.

Re this comment:

This lock is far too aggressive. It locks the entire logger, rather than a single entry as I believe it should.

Originally posted by @flimzy in #1047 (comment)

More context:

I looked at it briefly before posting my comment, and the proper solution wasn't immediately obvious. It looks like we need to add a mutex to the Entry type and lock it in WithFields and in fireHooks, but my first attempt led to a deadlock.

While working on that, I ran into what may be other overly-aggressive locks that could be relaxed too, but it's tricky to do that with confidence without tests covering all the possible races (which are incredibly hard to test).

@xialu4820723

xialu4820723 commented Mar 30, 2020

@flimzy is right. I just updated dependencies, and then my service got blocked during startup. It turns out that updating the entry in a hook causes a deadlock.

(screenshot: WXWorkCapture_15855788086642)
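For reference, a minimal sketch of the reported pattern (the hook type and field names here are hypothetical, not taken from CRI-O's actual hook): a custom hook touches the entry from inside Fire. Under the logger-wide locking added in #1047 (v1.5.0), the lock is reportedly still held while hooks fire, so the call back into the entry blocks; on v1.4.2 the same program completes.

```go
package main

import "github.com/sirupsen/logrus"

// entryUpdatingHook is a hypothetical hook that mirrors the reported
// pattern: it derives/updates the log entry from inside Fire.
type entryUpdatingHook struct{}

func (h *entryUpdatingHook) Levels() []logrus.Level {
	return logrus.AllLevels
}

func (h *entryUpdatingHook) Fire(entry *logrus.Entry) error {
	// If the logger-wide mutex from #1047 is still held while hooks are
	// being fired (the reported v1.5.0 behaviour), calling back into the
	// entry here blocks on that same lock.
	_ = entry.WithField("source", "hook")
	return nil
}

func main() {
	logger := logrus.New()
	logger.AddHook(&entryUpdatingHook{})

	// Reported to hang at startup on logrus v1.5.0; completes on v1.4.2.
	logger.Info("service starting")
}
```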

@saschagrunert

I can confirm this. We updated logrus to 1.5.0 and hit a deadlock. Commenting out these lines resolves the issue:
https://github.com/cri-o/cri-o/blob/5716cbc3a2ae08ab221402fbd0c3f20ebf61d416/internal/log/hook_filename.go#L50-L58

@markphelps
Collaborator Author

OK, I see two main options. Please let me know what you all think, or if there are better options:

  1. Revert "fix race conditions on entry" (#1047), which introduced the aggressive mutex locks, and cut a new release, v1.5.1. However, this would reintroduce the bug it fixed: "Race condition in entry.Data on calling WithFields() method" (#1046).

  2. Fix forward by introducing a mutex on the logger.Entry itself (sketched below). However, @flimzy mentioned that this led to deadlocks when he tried it. @flimzy, could you provide some sample code that reproduces the deadlock when moving the mutex to the Entry?

Thoughts?
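For discussion, here is a rough, self-contained sketch of what option 2 could look like, using minimal stand-ins for the logrus types (the names mirror logrus, but the dataMu field and this WithFields body are illustrative assumptions, not the actual implementation): Data is guarded by a per-entry mutex, and the logger-wide lock is no longer taken in WithFields or around hook firing.

```go
package sketch

import "sync"

// Minimal stand-ins for the types involved; this is an illustration of
// option 2, not the real logrus implementation.
type Fields map[string]interface{}

type Logger struct{} // formatter, output, hooks, etc. elided

type Entry struct {
	Logger *Logger
	Data   Fields

	// dataMu guards this entry's Data only; the logger-wide lock from
	// #1047 would no longer be taken in WithFields or while hooks fire.
	dataMu sync.Mutex
}

// WithFields copies the entry's data under the per-entry lock and returns
// a derived entry with its own zero-value mutex, so a hook that derives a
// further entry while this one is being written does not block on the
// parent's lock.
func (entry *Entry) WithFields(fields Fields) *Entry {
	entry.dataMu.Lock()
	data := make(Fields, len(entry.Data)+len(fields))
	for k, v := range entry.Data {
		data[k] = v
	}
	entry.dataMu.Unlock()

	for k, v := range fields {
		data[k] = v
	}
	return &Entry{Logger: entry.Logger, Data: data}
}
```

Whether a per-entry mutex alone actually covers the race reported in #1046 (concurrent WithFields calls against the same entry while it is being formatted) would still need to be verified with the race detector and targeted tests.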

umohnani8 added a commit to umohnani8/image that referenced this issue Apr 3, 2020
The upgrade of logrus forces to update the dependency in go module
consumers as well.

The new locking in logrus 1.5.0 introduces regressions in conjunction to
deadlocks when using custom hooks, which is the case in CRI-O for
example.

Having it point to v1.5.0 here is forcing cri-o to upgrade it as well
when we update c/image to 5.3.1.

Let's downgrade logrus till sirupsen/logrus#1122
is fixed.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
@saschagrunert

I'm in favor of moving forward, since I think the intention of the original PR was the right one. 👍

umohnani8 added a commit to umohnani8/buildah that referenced this issue Apr 3, 2020
The upgrade of logrus forces to update the dependency in go module
consumers as well.

The new locking in logrus 1.5.0 introduces regressions in conjunction to
deadlocks when using custom hooks, which is the case in CRI-O for
example.
Having it point to v1.5.0 here is forcing cri-o to upgrade it as well
when we update c/buildah to v1.14.6.

Let's downgrade logrus till sirupsen/logrus#1122
is fixed.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
bors bot added a commit to containers/buildah that referenced this issue Apr 6, 2020
2274: Downgrade siruspen/logrus from 1.4.2 r=rhatdan a=umohnani8

The upgrade of logrus forces to update the dependency in go module
consumers as well.

The new locking in logrus 1.5.0 introduces regressions in conjunction to
deadlocks when using custom hooks, which is the case in CRI-O for
example.
Having it point to v1.5.0 here is forcing cri-o to upgrade it as well
when we update c/buildah to v1.14.6.

Let's downgrade logrus till sirupsen/logrus#1122
is fixed.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>

Co-authored-by: Urvashi Mohnani <umohnani@redhat.com>
@markphelps
Collaborator Author

I chose to revert, since no forward fix has been proposed and several users have reported deadlocks since updating to 1.5.0.

If anyone has a potential fix for #1046, please open a PR against it.
