Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit test failure in dev branch #669

Closed
nmeyerhans opened this issue Jan 17, 2017 · 4 comments
Closed

Unit test failure in dev branch #669

nmeyerhans opened this issue Jan 17, 2017 · 4 comments
Assignees

Comments

@nmeyerhans
Copy link
Contributor

Appveyor is reporting the following error. I can reproduce this with make test-in-docker targeting the dev branch:

panic: Fail in goroutine after TestTaskTransitionWhenStopContainerReturnsUnretriableError has completed
goroutine 1277 [running]:
panic(0x81ef00, 0xc0424024f0)
C:/go/src/runtime/panic.go:500 +0x1af
testing.(*common).Fail(0xc042770240)
C:/go/src/testing/testing.go:412 +0x126
testing.(*common).FailNow(0xc042770240)
C:/go/src/testing/testing.go:431 +0x32
testing.(*common).Fatalf(0xc042770240, 0x8cfed3, 0x24, 0xc0423de1b0, 0x3, 0x3)
C:/go/src/testing/testing.go:496 +0x8a
github.com/aws/amazon-ecs-agent/agent/vendor/github.com/golang/mock/gomock.(*Controller).Call(0xc0420afe40, 0x8a0b60, 0xc04275f2e0, 0x8c34a4, 0xd, 0xc0424d7ca0, 0x2, 0x2, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/vendor/github.com/golang/mock/gomock/controller.go:113 +0x459
github.com/aws/amazon-ecs-agent/agent/engine.(*MockDockerClient).StopContainer(0xc04275f2e0, 0x8c25dc, 0xb, 0x6fc23ac00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/engine_mocks.go:308 +0x184
github.com/aws/amazon-ecs-agent/agent/engine.(*DockerTaskEngine).stopContainer(0xc0421ca7e0, 0xc04275d0e0, 0xc042776000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/docker_task_engine.go:626 +0x293
github.com/aws/amazon-ecs-agent/agent/engine.(*DockerTaskEngine).(github.com/aws/amazon-ecs-agent/agent/engine.stopContainer)-fm(0xc04275d0e0, 0xc042776000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/docker_task_engine.go:675 +0x87
github.com/aws/amazon-ecs-agent/agent/engine.tryApplyTransition(0xc04275d0e0, 0xc042776000, 0x4, 0xc042402380, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/docker_task_engine.go:462 +0x86
github.com/aws/amazon-ecs-agent/agent/engine.(*DockerTaskEngine).applyContainerState(0xc0421ca7e0, 0xc04275d0e0, 0xc042776000, 0xc000000004, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/docker_task_engine.go:688 +0x257
github.com/aws/amazon-ecs-agent/agent/engine.(*DockerTaskEngine).transitionContainer(0xc0421ca7e0, 0xc04275d0e0, 0xc042776000, 0x4)
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/docker_task_engine.go:701 +0x6f
created by github.com/aws/amazon-ecs-agent/agent/engine.(*managedTask).handleContainerChange.func1
c:/gopath/src/github.com/aws/amazon-ecs-agent/agent/engine/task_manager.go:232 +0xbc
FAIL github.com/aws/amazon-ecs-agent/agent/engine 1.292s

@aaithal
Copy link
Contributor

aaithal commented Jan 17, 2017

I was able to repro this from your branch. However, when I modified the target to only run tests in the engine package, it succeeded. So I think this test is exposing a latent bug in our test code.

$ git diff
diff --git a/Makefile b/Makefile
index 8e4ba2f..ea04986 100644
--- a/Makefile
+++ b/Makefile
@@ -63,7 +63,7 @@ misc/certs/ca-certificates.crt:
        docker run "amazon/amazon-ecs-agent-cert-source:make" cat /etc/ssl/certs/ca-certificates.crt > misc/certs/ca-certificates.crt

 test:
-       . ./scripts/shared_env && go test -timeout=25s -v -cover $(shell go list ./agent/... | grep -v /vendor/)
+       . ./scripts/shared_env && go test -timeout=25s -v -cover $(shell go list ./agent/engine/... | grep -v /vendor/)

 benchmark-test:
        . ./scripts/shared_env && go test -run=XX -bench=. $(shell go list ./agent/... | grep -v /vendor/)

nmeyerhans pushed a commit to nmeyerhans/amazon-ecs-agent that referenced this issue Jan 17, 2017
nmeyerhans pushed a commit to nmeyerhans/amazon-ecs-agent that referenced this issue Jan 17, 2017
@aaithal
Copy link
Contributor

aaithal commented Jan 17, 2017

Added some logs and it seems like there's a race condition in the event stream interaction in this test, specifically in https://github.com/aws/amazon-ecs-agent/blob/dev/agent/engine/docker_task_engine_test.go#L852 I'll keep debugging this more. @nmeyerhans, don't consider this as a blocker for #622, this can be chalked to a flaky unit test.

petderek pushed a commit to petderek/amazon-ecs-agent that referenced this issue Jan 17, 2017
@aaithal
Copy link
Contributor

aaithal commented Jan 17, 2017

I think this is indeed a "bug" in the this test. I've added more logs and the panic shows up due an additional/unexpected StopContianer call during the execution of the test. Working on a fix for this.

aaithal added a commit to aaithal/amazon-ecs-agent that referenced this issue Jan 17, 2017
I think this originated from copying test code from a unit test
that already had this bug because of the race condition. The deleted
line of code used to result in race condition and it would trigger
the task engine to do additional work after the test would complete
leading to test failures.

Should fix aws#669
aaithal added a commit to aaithal/amazon-ecs-agent that referenced this issue Jan 18, 2017
I think this originated from copying test code from a unit test
that already had this bug because of the race condition. The deleted
line of code used to result in race condition and it would trigger
the task engine to do additional work after the test would complete
leading to test failures.

Should fix aws#669
aaithal added a commit that referenced this issue Jan 25, 2017
* Fix event state race condition in task engine unit tests.

I think this originated from copying test code from a unit test
that already had this bug because of the race condition. The deleted
line of code used to result in race condition and it would trigger
the task engine to do additional work after the test would complete
leading to test failures.

Should fix #669

* Vendor gomock with support for MinTimes

* Add documentation for using MinTimes in TestTaskTransitionWhenStopContainerReturnsUnretriableError test
@aaithal
Copy link
Contributor

aaithal commented Feb 2, 2017

Resolving as #673 has been merged.

@aaithal aaithal closed this as completed Feb 2, 2017
jwerak pushed a commit to appuri/amazon-ecs-agent that referenced this issue Jun 8, 2017
* Fix event state race condition in task engine unit tests.

I think this originated from copying test code from a unit test
that already had this bug because of the race condition. The deleted
line of code used to result in race condition and it would trigger
the task engine to do additional work after the test would complete
leading to test failures.

Should fix aws#669

* Vendor gomock with support for MinTimes

* Add documentation for using MinTimes in TestTaskTransitionWhenStopContainerReturnsUnretriableError test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants