Skip to content
This repository has been archived by the owner on May 26, 2022. It is now read-only.

likely data race in NewSwarm's use of context #189

Closed
mvdan opened this issue Mar 31, 2020 · 3 comments · Fixed by #190
Closed

likely data race in NewSwarm's use of context #189

mvdan opened this issue Mar 31, 2020 · 3 comments · Fixed by #190
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@mvdan
Copy link

mvdan commented Mar 31, 2020

WARNING: DATA RACE
Read at 0x00c000a32748 by goroutine 137:
  github.com/libp2p/go-libp2p-swarm.(*Swarm).teardown()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p-swarm@v0.2.2/swarm.go:122 +0x5c
  github.com/libp2p/go-libp2p-swarm.(*Swarm).teardown-fm()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p-swarm@v0.2.2/swarm.go:118 +0x41
  github.com/jbenet/goprocess.(*process).doClose()
      /home/mvdan/go/pkg/mod/github.com/jbenet/goprocess@v0.1.3/impl-mutex.go:233 +0x40e
  github.com/jbenet/goprocess.(*process).Close()
      /home/mvdan/go/pkg/mod/github.com/jbenet/goprocess@v0.1.3/impl-mutex.go:175 +0x114
  github.com/jbenet/goprocess/context.CloseAfterContext.func1()
      /home/mvdan/go/pkg/mod/github.com/jbenet/goprocess@v0.1.3/context/context.go:67 +0x14d

Previous write at 0x00c000a32748 by goroutine 17:
  github.com/libp2p/go-libp2p-swarm.NewSwarm()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p-swarm@v0.2.2/swarm.go:113 +0x734
  github.com/libp2p/go-libp2p/config.(*Config).NewNode()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p@v0.6.0/config/config.go:118 +0x360
[... our test and code that calls NewNode synchronously]
  testing.tRunner()
      /home/mvdan/tip/src/testing/testing.go:992 +0x1eb

Goroutine 137 (running) created at:
  github.com/jbenet/goprocess/context.CloseAfterContext()
      /home/mvdan/go/pkg/mod/github.com/jbenet/goprocess@v0.1.3/context/context.go:64 +0xb9
  github.com/jbenet/goprocess/context.WithContextAndTeardown()
      /home/mvdan/go/pkg/mod/github.com/jbenet/goprocess@v0.1.3/context/context.go:28 +0x208
  github.com/libp2p/go-libp2p-swarm.NewSwarm()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p-swarm@v0.2.2/swarm.go:112 +0x648
  github.com/libp2p/go-libp2p/config.(*Config).NewNode()
      /home/mvdan/go/pkg/mod/github.com/libp2p/go-libp2p@v0.6.0/config/config.go:118 +0x360
[... our test and code that calls NewNode synchronously; same line as before]
  testing.tRunner()
      /home/mvdan/tip/src/testing/testing.go:992 +0x1eb

Goroutine 17 (running) created at:
  testing.(*T).Run()
      /home/mvdan/tip/src/testing/testing.go:1043 +0x660
  testing.runTests.func1()
      /home/mvdan/tip/src/testing/testing.go:1312 +0xa6
  testing.tRunner()
      /home/mvdan/tip/src/testing/testing.go:992 +0x1eb
  testing.runTests()
      /home/mvdan/tip/src/testing/testing.go:1310 +0x57f
  testing.(*M).Run()
      /home/mvdan/tip/src/testing/testing.go:1220 +0x3a6
  main.main()
      _testmain.go:43 +0x219

This is a partial trace, because I assume it's enough for you to realise what's going on. My assumption is that, in the lines below in swarm.go, it's possible to read s.ctx at the same time as it's being set:

	s.proc = goprocessctx.WithContextAndTeardown(ctx, s.teardown)
	s.ctx = goprocessctx.OnClosingContext(s.proc) // NOTE: racy write

	return s
}

func (s *Swarm) teardown() error {
	// Wait for the context to be canceled.
	// This allows other parts of the swarm to detect that we're shutting
	// down.
	<-s.ctx.Done() // NOTE: racy read

I haven't been able to craft a main.go that reproduces the race reliably, unfortunately. This only happens as part of a non-trivial test. I can provide more details privately, if you prefer.

Continuing with my assumptions - I assume we should be setting s.ctx before it's ever used in any way. I don't know the goprocessctx package well, though. At the end of the test, we call host.Close and cancel on the context used in libp2p.Config.NewNode, in that order. Perhaps that's wrong, given #59.

@mvdan
Copy link
Author

mvdan commented Mar 31, 2020

After using context.Background() in the constructor, in favor of the Close method alone for cancellation, the race seems to have gone away. Perhaps this falls under "working as intended", but it still got me stuck for a while.

@Stebalien Stebalien added the kind/bug A bug in existing code (including security flaws) label Mar 31, 2020
Stebalien added a commit that referenced this issue Mar 31, 2020
Otherwise, we can modify the context after/while the process is shutting down.

fixes #189
@mvdan
Copy link
Author

mvdan commented Mar 31, 2020

Thanks for the quick fix! You might want to check the other libp2p repositories for similar lines via grep, because other similar data races might be lurking.

@Stebalien
Copy link
Member

Yeah, it looks like we have this issue in go-libp2p itself as well.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants