Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in TestIntegrations - Disconnection #11737

Merged
merged 3 commits into from
Apr 6, 2022

Conversation

jakule
Copy link
Contributor

@jakule jakule commented Apr 5, 2022

Add mutex to session.broadcastResult() to fix race condition reported by Go thread sanitizer.

WARNING: DATA RACE
Write at 0x00c0071c46c0 by goroutine 1037:
  runtime.mapdelete_faststr()
      /opt/go/src/runtime/map_faststr.go:300 +0x0
  github.com/gravitational/teleport/lib/srv.(*session).removePartyMember()
      /workspace/lib/srv/sess.go:1286 +0x204
  github.com/gravitational/teleport/lib/srv.(*session).removeParty()
      /workspace/lib/srv/sess.go:1298 +0x19d
  github.com/gravitational/teleport/lib/srv.(*SessionRegistry).leaveSession()
      /workspace/lib/srv/sess.go:355 +0x4e
  github.com/gravitational/teleport/lib/srv.(*party).Close.func1()
      /workspace/lib/srv/sess.go:1683 +0x110
  sync.(*Once).doSlow()
      /opt/go/src/sync/once.go:68 +0x127
  sync.(*Once).Do()
      /opt/go/src/sync/once.go:59 +0x46
  github.com/gravitational/teleport/lib/srv.(*party).Close()
      /workspace/lib/srv/sess.go:1681 +0x6e
  github.com/gravitational/teleport/lib/srv.closeAll()
      /workspace/lib/srv/ctx.go:1071 +0x101
  github.com/gravitational/teleport/lib/srv.(*ServerContext).Close()
      /workspace/lib/srv/ctx.go:834 +0x14a
  github.com/gravitational/teleport/lib/srv/forward.(*Server).handleSessionChannel·dwrap·15()
      /workspace/lib/srv/forward/sshserver.go:804 +0x39
  github.com/gravitational/teleport/lib/srv/forward.(*Server).handleSessionChannel()
      /workspace/lib/srv/forward/sshserver.go:865 +0x12e7
  github.com/gravitational/teleport/lib/srv/forward.(*Server).handleChannel·dwrap·7()
      /workspace/lib/srv/forward/sshserver.go:672 +0x71

Previous read at 0x00c0071c46c0 by goroutine 1399:
  runtime.mapiternext()
      /opt/go/src/runtime/map.go:851 +0x0
  github.com/gravitational/teleport/lib/srv.(*session).broadcastResult()
      /workspace/lib/srv/sess.go:1274 +0x104
  github.com/gravitational/teleport/lib/srv.(*SessionRegistry).broadcastResult()
      /workspace/lib/srv/sess.go:524 +0x144
  github.com/gravitational/teleport/lib/srv.(*session).launch.func4()
      /workspace/lib/srv/sess.go:919 +0x611

Goroutine 1037 (running) created at:
  github.com/gravitational/teleport/lib/srv/forward.(*Server).handleChannel()
      /workspace/lib/srv/forward/sshserver.go:672 +0x1bd
  github.com/gravitational/teleport/lib/srv/forward.(*Server).handleConnection·dwrap·6()
      /workspace/lib/srv/forward/sshserver.go:621 +0x71

Goroutine 1399 (running) created at:
  github.com/gravitational/teleport/lib/srv.(*session).launch()
      /workspace/lib/srv/sess.go:890 +0x717
  github.com/gravitational/teleport/lib/srv.(*session).addParty.func2()
      /workspace/lib/srv/sess.go:1551 +0x64

@@ -1271,6 +1271,9 @@ func sessionsStreamingUploadDir(ctx *ServerContext) string {
}

func (s *session) broadcastResult(r ExecResult) {
s.mu.Lock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can deadlock.

(*SessionRegistry).broadcastResult() acquires the lock and then calls (*session).boradcastResult(), which will attempt to acquire a lock that it already holds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zmb3 I don't think this is the case. The fist lock as you pointed out is in SessionRegistry and the one that I've added is in session

teleport/lib/srv/sess.go

Lines 67 to 68 in 7c7bb75

type SessionRegistry struct {
mu sync.Mutex

teleport/lib/srv/sess.go

Lines 530 to 531 in 7c7bb75

type session struct {
mu sync.RWMutex

Alternative solution would be to move the SessionRegistry lock inside (*SessionRegistry).findSessionLocked() as I don't think that is needed during (*session).broadcastResult() call and leave the extra lock that I've added, but I'm not familiar with this part of code.

@jakule jakule enabled auto-merge (squash) April 6, 2022 17:02
@jakule jakule merged commit e2b1e9f into master Apr 6, 2022
@jakule jakule deleted the jakule/fix-race-disconnection-test branch April 6, 2022 18:22
jakule added a commit that referenced this pull request Apr 8, 2022
Fix race condition reported by TestIntegrations - Disconnection
jakule added a commit that referenced this pull request Apr 14, 2022
Fix race condition reported by TestIntegrations - Disconnection

Backport of #11737
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants