Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: NPE when call DockerContainer.State() #540

Closed
kkdeok opened this issue Sep 30, 2022 · 4 comments · Fixed by #635
Closed

[Bug]: NPE when call DockerContainer.State() #540

kkdeok opened this issue Sep 30, 2022 · 4 comments · Fixed by #635
Labels
bug An issue with the library

Comments

@kkdeok
Copy link

kkdeok commented Sep 30, 2022

Testcontainers version

0.13.0

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host Arch

x86

Go Version

1.18

Docker version

Client: Docker Engine - Community
 Version:           20.10.18
 API version:       1.41
 Go version:        go1.18.6
 Git commit:        b40c2f6
 Built:             Thu Sep  8 23:11:45 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.18
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.6
  Git commit:       e42327a
  Built:            Thu Sep  8 23:09:37 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.8
  GitCommit:        9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
Client:
 Context:    default
 Debug Mode: false
WARNING: No swap limit support
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 88
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1029-aws
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.08GiB
 Name: ip-10-128-46-212
 ID: ZN3Z:JLTC:AMIW:LQ26:7Z6S:IZMY:TDAJ:ELML:JK6W:RPTQ:BTGO:KFMB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: ***
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 6
  Running: 5
  Paused: 0
  Stopped: 1
 Images: 131
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
WARNING: No swap limit support
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1029-aws
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.18GiB
 Name: ip-10-128-78-126
 ID: 4YMG:OD3V:ZOHD:ZCIX:C4WR:KNIQ:W2X7:GLTU:P3OY:CGJD:XUDC:MUHY
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: ***
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

When we call DockerContainer.State() in our test code, the NPE happened. IMHO, It seems because member variable raw is called without initialization and it is possible if inspectRawContainer, which initialize the variable 'raw', return error.

Relevant log output

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xcdacd6]

goroutine 50 [running]:
testing.tRunner.func1.2({0x11cf000, 0x1c42850})
	/home/ubuntu/actions-runner/_work/_tool/go/1.18.0/x64/src/testing/testing.go:1389 +0x24e
testing.tRunner.func1()
	/home/ubuntu/actions-runner/_work/_tool/go/1.18.0/x64/src/testing/testing.go:1392 +0x39f
panic({0x11cf000, 0x1c42850})
	/home/ubuntu/actions-runner/_work/_tool/go/1.18.0/x64/src/runtime/panic.go:838 +0x207
github.com/testcontainers/testcontainers-go.(*DockerContainer).State(0xc00054f040, {0x156e6d0?, 0xc00010e5a0?})
	/root/go/pkg/mod/github.com/testcontainers/testcontainers-go@v0.13.0/docker.go:294 +0x36

Additional Information

No response

@mdelapenya
Copy link
Collaborator

mdelapenya commented Oct 3, 2022

@doubleknd26 I'm not able to reproduce this with the current tests. In #543 I'm directly calling the State func of the container, and I double checked that this new code, but applied on top of v0.13.0, behaves exactly the same.

But looking at the code, the first call to inspectRawContainer that errors, and consequently sets raw to nil should cause that the next call to the State func must panic as you described in this issue:

// update container raw info
func (c *DockerContainer) inspectRawContainer(ctx context.Context) (*types.ContainerJSON, error) {
	inspect, err := c.provider.client.ContainerInspect(ctx, c.ID)
	if err != nil {
		return nil, err
	}
	c.raw = &inspect
	return c.raw, nil
}

// State returns container's running state
func (c *DockerContainer) State(ctx context.Context) (*types.ContainerState, error) {
	inspect, err := c.inspectRawContainer(ctx)
	if err != nil {
		return c.raw.State, err
	}
	return inspect.State, nil
}

I'm tracing back to the root cause, and noticed that the return of the raw value was changed in #271, from nil > c.raw.State. I'm currently trying to identify why that need, and why there has no other complain about that very suspicious condition.

Could you share more about the tests where you found the panic?

@kkdeok
Copy link
Author

kkdeok commented Oct 3, 2022

Could you share more about the tests where you found the panic?

@mdelapenya Hi. It happened intermittently on our Github Action server managed by our company and unfortunately I cannot access the server. But, I'm sure the server doesn't have enough resource like cpu, mem. So I think It is because the container has been deleted unexpectedly.

I tested on my local to check the case. Here is a sample test code. I hope it makes sense to you.

  1. run the test code using debug mode and catch it by break point at L733.
    image

  2. go to State() -> nspectRawContainer() -> provider.client.ContainerInspect() and stop at 18L.
    image

  3. delete container.
    Screen Shot 2022-10-04 at 0 34 37

  4. Step over and confirm error is returned.
    Screen Shot 2022-10-04 at 0 35 04

  5. go back to State() and confirm c.raw is called without init.
    Screen Shot 2022-10-04 at 0 36 11

  6. NPE
    Screen Shot 2022-10-04 at 0 36 50

@mdelapenya
Copy link
Collaborator

Thanks for such detailed report!

That seems correct: at the moment the container exits for any reasons (resource limits?) before the inspectRaw func is called, then any call to State should fail.

I'm still in here:

I'm tracing back to the root cause, and noticed that the return of the raw value was changed in #271, from nil > c.raw.State. I'm currently trying to identify why that need, and why there has no other complain about that very suspicious condition.

It seems that returning nil on error would make sense, although I want to check all possible the implications from that issue.

@settings settings bot removed the type/bug label Oct 26, 2022
@mdelapenya mdelapenya added the bug An issue with the library label Oct 26, 2022
@mdelapenya
Copy link
Collaborator

@doubleknd26 I'm able to reproduce it with this simple test:

func TestContainerStateAfterTermination(t *testing.T) {
	ctx := context.Background()

	nginxA, err := GenericContainer(ctx, GenericContainerRequest{
		ProviderType: providerType,
		ContainerRequest: ContainerRequest{
			Image: nginxAlpineImage,
			ExposedPorts: []string{
				nginxDefaultPort,
			},
		},
		Started: true,
	})
	if err != nil {
		t.Fatal(err)
	}

	err = nginxA.Terminate(ctx)
	if err != nil {
		t.Fatal(err)
	}
	state, err := nginxA.State(ctx)
	assert.Nil(t, state, "expected nil container inspect.")
	assert.Error(t, err, "expected error from container inspect.")
}

The error comes because the State func is called after the container is terminated (probably something the user should know), but the library should be resilient enough to not panic and return a nil JSON state.

I'm elaborating a fix at this moment. Thanks again for the report!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An issue with the library
Projects
None yet
2 participants