daemon deadlock w/ containerd communication #35825

Closed
cpuguy83 opened this issue Dec 18, 2017 · 5 comments

@cpuguy83
Member

CI is failing a lot with this:

15:31:46 ----------------------------------------------------------------------
15:31:46 FAIL: check_test.go:366: DockerSwarmSuite.TearDownTest
15:31:46 
15:31:46 check_test.go:371:
15:31:46     d.Stop(c)
15:31:46 daemon/daemon.go:395:
15:31:46     t.Fatalf("Error while stopping the daemon %s : %v", d.id, err)
15:31:46 ... Error: Error while stopping the daemon d2305ce671729 : exit status 130
15:31:46 
15:31:46 
15:31:46 ----------------------------------------------------------------------
15:31:46 PANIC: docker_api_swarm_service_test.go:201: DockerSwarmSuite.TestAPISwarmServicesUpdateStartFirst
15:31:46 
15:31:46 [d2305ce671729] waiting for daemon to start
15:31:46 [d2305ce671729] daemon started
15:31:46 
15:31:46 [d2305ce671729] daemon started
15:31:46 Attempt #2: daemon is still running with pid 11196
15:31:46 Attempt #3: daemon is still running with pid 11196
15:31:46 Attempt #4: daemon is still running with pid 11196
15:31:46 [d2305ce671729] exiting daemon
15:31:46 ... Panic: Fixture has panicked (see related PANIC)

Here is the relevant daemon log: docker.log

The deadlock appears to be in communicating with containerd.
This could be due to changes from #35809

@cpuguy83
Member Author

Here's an example failed CI run: https://jenkins.dockerproject.org/job/Docker-PRs-experimental/38454/console

It is failing in other CI runs as well.

@cpuguy83
Member Author

The IsRunning() check is blocked here, waiting on the container state mutex:

goroutine 3544 [semacquire]:
sync.runtime_SemacquireMutex(0xc421b20b04, 0xc421932000)
	/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc421b20b00)
	/usr/local/go/src/sync/mutex.go:134 +0xee
github.com/docker/docker/container.(*State).IsRunning(0xc421b20b00, 0xc421a16cc0)
	/go/src/github.com/docker/docker/container/state.go:250 +0x2d
github.com/docker/docker/daemon.(*Daemon).ContainerStop(0xc420408000, 0xc421a16cc0, 0x26, 0x0, 0x14cbf78, 0x0)
	/go/src/github.com/docker/docker/daemon/stop.go:23 +0x84
github.com/docker/docker/daemon/cluster/executor/container.(*containerAdapter).shutdown(0xc421bf7290, 0x7fe898c190b0, 0xc4212dd740, 0x2, 0xc4202d7690)
	/go/src/github.com/docker/docker/daemon/cluster/executor/container/adapter.go:356 +0x9d
github.com/docker/docker/daemon/cluster/executor/container.(*controller).Shutdown(0xc4216334a0, 0x7fe898c190b0, 0xc4212dd740, 0xc420e42400, 0xc4202d7730)
	/go/src/github.com/docker/docker/daemon/cluster/executor/container/controller.go:348 +0xa2
github.com/docker/docker/daemon/cluster/executor/container.(*controller).Remove(0xc4216334a0, 0x7fe898c190b0, 0xc4212dd740, 0x7fe898c190b0, 0xc4212dd740)
	/go/src/github.com/docker/docker/daemon/cluster/executor/container/controller.go:391 +0x77
github.com/docker/docker/vendor/github.com/docker/swarmkit/agent.reconcileTaskState.func1(0xc421b14c60)
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/worker.go:267 +0xa7
created by github.com/docker/docker/vendor/github.com/docker/swarmkit/agent.reconcileTaskState
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/worker.go:295 +0x1105

Which is being blocked by this goroutine, itself waiting on a containerd Kill call:

goroutine 3068 [select]:
github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*Stream).Header(0xc4216ff440, 0x1de0978, 0xc421c790e0, 0x2bc33a0)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:256 +0x146
github.com/docker/docker/vendor/google.golang.org/grpc.recvResponse(0x7fe898c190b0, 0xc4217012c0, 0xc4200d3d20, 0xc4200d3d40, 0x2bb9b60, 0x2c409c0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:64 +0xac
github.com/docker/docker/vendor/google.golang.org/grpc.invoke(0x7fe898c190b0, 0xc4217012c0, 0x1d8b18a, 0x28, 0x1bf64e0, 0xc421701200, 0x1c580a0, 0x2c409c0, 0xc420419ba0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:279 +0xcfe
github.com/docker/docker/vendor/github.com/containerd/containerd.namespaceInterceptor.unary(0x1d4609e, 0x4, 0x7fe898c15030, 0xc420018028, 0x1d8b18a, 0x28, 0x1bf64e0, 0xc421701200, 0x1c580a0, 0x2c409c0, ...)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/grpc.go:18 +0xf4
github.com/docker/docker/vendor/github.com/containerd/containerd.(namespaceInterceptor).(github.com/docker/docker/vendor/github.com/containerd/containerd.unary)-fm(0x7fe898c15030, 0xc420018028, 0x1d8b18a, 0x28, 0x1bf64e0, 0xc421701200, 0x1c580a0, 0x2c409c0, 0xc420419ba0, 0x1de0950, ...)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/grpc.go:34 +0xf4
github.com/docker/docker/vendor/google.golang.org/grpc.Invoke(0x7fe898c15030, 0xc420018028, 0x1d8b18a, 0x28, 0x1bf64e0, 0xc421701200, 0x1c580a0, 0x2c409c0, 0xc420419ba0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:141 +0xdd
github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/tasks/v1.(*tasksClient).Kill(0xc420f0b970, 0x7fe898c15030, 0xc420018028, 0xc421701200, 0x0, 0x0, 0x0, 0xc421857470, 0xc42162e870, 0x10)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/tasks/v1/tasks.pb.go:468 +0xd2
github.com/docker/docker/vendor/github.com/containerd/containerd.(*task).Kill(0xc421857470, 0x2bba360, 0xc420018028, 0xf, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/task.go:180 +0x206
github.com/docker/docker/libcontainerd.(*client).SignalProcess(0xc420142380, 0x2bba360, 0xc420018028, 0xc42158d100, 0x40, 0x1d45f8e, 0x4, 0xf, 0xc421cd6120, 0xc420e66f00)
	/go/src/github.com/docker/docker/libcontainerd/client_daemon.go:361 +0xd6
github.com/docker/docker/daemon.(*Daemon).kill(0xc420408000, 0xc421c0b8c0, 0xf, 0x2, 0x2)
	/go/src/github.com/docker/docker/daemon/kill.go:179 +0x8c
github.com/docker/docker/daemon.(*Daemon).killWithSignal(0xc420408000, 0xc421c0b8c0, 0xf, 0x0, 0x0)
	/go/src/github.com/docker/docker/daemon/kill.go:99 +0x243
github.com/docker/docker/daemon.(*Daemon).killPossiblyDeadProcess(0xc420408000, 0xc421c0b8c0, 0xf, 0x27, 0xc42158d100)
	/go/src/github.com/docker/docker/daemon/kill.go:169 +0x4c
github.com/docker/docker/daemon.(*Daemon).containerStop(0xc420408000, 0xc421c0b8c0, 0xa, 0x0, 0x0)
	/go/src/github.com/docker/docker/daemon/stop.go:48 +0xaa
github.com/docker/docker/daemon.(*Daemon).ContainerStop(0xc420408000, 0xc421a16e70, 0x26, 0x0, 0x14cbf78, 0x0)
	/go/src/github.com/docker/docker/daemon/stop.go:30 +0xc5
github.com/docker/docker/daemon/cluster/executor/container.(*containerAdapter).shutdown(0xc421bf7290, 0x7fe898c191e0, 0xc4212bbc80, 0x7fe89a46f000, 0x0)
	/go/src/github.com/docker/docker/daemon/cluster/executor/container/adapter.go:356 +0x9d
github.com/docker/docker/daemon/cluster/executor/container.(*controller).Shutdown(0xc4216334a0, 0x7fe898c191e0, 0xc4212bbc80, 0xc420f0b960, 0x2bc2980)
	/go/src/github.com/docker/docker/daemon/cluster/executor/container/controller.go:348 +0xa2
github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/exec.Do(0x7fe898c191e0, 0xc4212bbc80, 0xc421bf9c20, 0x2bc2980, 0xc4216334a0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/exec/controller.go:298 +0x34f
github.com/docker/docker/vendor/github.com/docker/swarmkit/agent.(*taskManager).run.func2(0x7fe898c190b0, 0xc421c0cf30, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/task.go:134 +0xe1
github.com/docker/docker/vendor/github.com/docker/swarmkit/agent.runctx(0x7fe898c190b0, 0xc421c0cf30, 0xc421631380, 0xc420ec5e00, 0xc4212bbd80)
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/helpers.go:9 +0x55
created by github.com/docker/docker/vendor/github.com/docker/swarmkit/agent.(*taskManager).run
	/go/src/github.com/docker/docker/vendor/github.com/docker/swarmkit/agent/task.go:122 +0xf9d

@cpuguy83
Member Author

And another blocked goroutine:

goroutine 3061 [select]:
github.com/docker/docker/vendor/google.golang.org/grpc/transport.(*Stream).Header(0xc4216ff560, 0x1de0978, 0xc421643030, 0x2bc33a0)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/transport/transport.go:256 +0x146
github.com/docker/docker/vendor/google.golang.org/grpc.recvResponse(0x7fe898c190b0, 0xc421701440, 0xc4200d3d20, 0xc4200d3d40, 0x2bb9b60, 0x2c409c0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:64 +0xac
github.com/docker/docker/vendor/google.golang.org/grpc.invoke(0x7fe898c190b0, 0xc421701440, 0x1d88db7, 0x27, 0x1bf6320, 0xc421410020, 0x1bf6400, 0xc420f0ba48, 0xc420419ba0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:279 +0xcfe
github.com/docker/docker/vendor/github.com/containerd/containerd.namespaceInterceptor.unary(0x1d4609e, 0x4, 0x7fe898c15030, 0xc420018028, 0x1d88db7, 0x27, 0x1bf6320, 0xc421410020, 0x1bf6400, 0xc420f0ba48, ...)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/grpc.go:18 +0xf4
github.com/docker/docker/vendor/github.com/containerd/containerd.(namespaceInterceptor).(github.com/docker/docker/vendor/github.com/containerd/containerd.unary)-fm(0x7fe898c15030, 0xc420018028, 0x1d88db7, 0x27, 0x1bf6320, 0xc421410020, 0x1bf6400, 0xc420f0ba48, 0xc420419ba0, 0x1de0950, ...)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/grpc.go:34 +0xf4
github.com/docker/docker/vendor/google.golang.org/grpc.Invoke(0x7fe898c15030, 0xc420018028, 0x1d88db7, 0x27, 0x1bf6320, 0xc421410020, 0x1bf6400, 0xc420f0ba48, 0xc420419ba0, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/google.golang.org/grpc/call.go:141 +0xdd
github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/tasks/v1.(*tasksClient).Get(0xc420f0ba40, 0x7fe898c15030, 0xc420018028, 0xc421410020, 0x0, 0x0, 0x0, 0xc4216436e8, 0x64ebf7, 0x1ba3740)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/api/services/tasks/v1/tasks.pb.go:450 +0xd2
github.com/docker/docker/vendor/github.com/containerd/containerd.(*process).Status(0xc4218a0cf0, 0x2bba360, 0xc420018028, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc421410000, ...)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/process.go:201 +0x16d
github.com/docker/docker/vendor/github.com/containerd/containerd.(*process).Delete(0xc4218a0cf0, 0x2bba360, 0xc420018028, 0x0, 0x0, 0x0, 0xc420e1cf60, 0x2bc39a0, 0xc4218a0cf0)
	/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/process.go:178 +0x131
github.com/docker/docker/libcontainerd.(*client).Exec(0xc420142380, 0x2bba3a0, 0xc421cd6000, 0xc42158d100, 0x40, 0xc42163a800, 0x40, 0xc4219e7e10, 0x0, 0xc420e55f40, ...)
	/go/src/github.com/docker/docker/libcontainerd/client_daemon.go:348 +0x5fa
github.com/docker/docker/daemon.(*Daemon).ContainerExecStart(0xc420408000, 0x7fe898c19220, 0xc421cd6000, 0xc42163a800, 0x40, 0x0, 0x0, 0x2b98fa0, 0xc421942c00, 0x2b98fa0, ...)
	/go/src/github.com/docker/docker/daemon/exec.go:245 +0x9ca
github.com/docker/docker/daemon.(*cmdProbe).run(0xc421acaa29, 0x7fe898c19220, 0xc421cd6000, 0xc420408000, 0xc421c0b8c0, 0xc4209f66f8, 0x4a1f76, 0x5a37df48)
	/go/src/github.com/docker/docker/daemon/health.go:94 +0x580
github.com/docker/docker/daemon.monitor.func1(0x2b98f20, 0xc421acaa29, 0x7fe898c19220, 0xc421cd6000, 0xc420408000, 0xc421c0b8c0, 0xc421cd6060, 0xbe8615b230aea5b9, 0x41bc965b6, 0x2c17280)
	/go/src/github.com/docker/docker/daemon/health.go:199 +0xb0
created by github.com/docker/docker/daemon.monitor
	/go/src/github.com/docker/docker/daemon/health.go:197 +0x336

@thaJeztah
Member

Looks like a duplicate of #35775 😇

@euank
Contributor

euank commented Apr 12, 2018

If this issue is a dupe of the above-referenced containerd issue, I think it can be closed now that recent docker-ce releases include a new enough containerd/runc version to avoid this.

Based on my reading of the stack trace here, it looks like it is that issue.
