Intermittent failure in CI environment #479
Comments
Hi @stuarthendren, Please check this page: https://www.testcontainers.org/usage/inside_docker.html The error is a pre-flight check and verifies that your environment is correct, nothing special to debug :) |
Thank you for the suggestion but I had already done that. |
@stuarthendren Could you provide your Jenkins pipeline configuration? So you are mounting the Docker socket? |
leaving out the notify function - just a slackSend call. |
Hi, I'm running into exactly the same problem and, unfortunately, I'm also still looking for a solution ... Jochen |
@stuarthendren Also we use a build image with installed Docker client in our CI. |
I can only access some old logs from Jenkins at the moment, and neither of those classes logged anything. I can look at the log level settings and rerun over the weekend. |
Should be loglevel info, see this logback config for sensible defaults: |
Hi @kiview , I'm facing the same problem, so I wanted to report my logs. I added the Logback library on my classpath:
I added the following XML configuration:
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="info">
    <appender-ref ref="STDOUT"/>
  </root>
  <logger name="org.testcontainers" level="DEBUG"/>
</configuration>
But when I run my tests, I get this logging:
|
Ok, I managed to fix the logging, this is the outcome:
So although our Jenkins slave is a Docker container (it's running in an ECS cluster on AWS), the testcontainers code doesn't recognize that it is running inside an existing Docker container, and it tries to reach out to 'localhost' (which is not valid!). Is it possible there's a problem with the recognition of 'RunInsideDocker'? |
@jochenhebbrecht it seems so. Could you please share your ECS task def? |
@jochenhebbrecht may I kindly ask you to share a JSON definition as text instead? :) You can get it by clicking on "JSON" tab |
When running inside Docker, the Docker host IP address should ideally be something like |
Ha, sorry :) ... Didn't notice the "JSON" tab. Yes, of course :-)
{
  "executionRoleArn": null,
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": null,
      "entryPoint": null,
      "portMappings": [],
      "command": null,
      "linuxParameters": null,
      "cpu": 1,
      "environment": [
        {
          "name": "DOCKER_API_VERSION",
          "value": "1.27"
        },
        {
          "name": "TRUSTED_SSH_HOSTS",
          "value": "bitbucket.MYCOMPANY.com:7999 svn.MYCOMPANY.com subversion.MYCOMPANY.com"
        }
      ],
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [
        {
          "readOnly": false,
          "containerPath": "/run/docker.sock",
          "sourceVolume": "DockerSocket"
        },
        {
          "readOnly": false,
          "containerPath": "/home/jenkins/.m2",
          "sourceVolume": "MavenRepository"
        }
      ],
      "workingDirectory": null,
      "dockerSecurityOptions": null,
      "memory": 2850,
      "memoryReservation": null,
      "volumesFrom": [],
      "image": "MYREPOID.dkr.ecr.eu-west-1.amazonaws.com/jenkins-cloud-slave:4f4538e",
      "disableNetworking": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "privileged": false,
      "name": "ecs-cloud-jenkins-cloud-slave-3GB",
      "expanded": true
    }
  ],
  "placementConstraints": [],
  "memory": null,
  "taskRoleArn": null,
  "compatibilities": [
    "EC2"
  ],
  "taskDefinitionArn": "arn:aws:ecs:eu-west-1:MYREPOID:task-definition/ecs-cloud-jenkins-cloud-slave-3GB:12",
  "family": "ecs-cloud-jenkins-cloud-slave-3GB",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    }
  ],
  "requiresCompatibilities": null,
  "networkMode": null,
  "cpu": null,
  "revision": 12,
  "status": "ACTIVE",
  "volumes": [
    {
      "name": "DockerSocket",
      "host": {
        "sourcePath": "/var/run/docker.sock"
      }
    },
    {
      "name": "MavenRepository",
      "host": {
        "sourcePath": "/data/nfs/maven"
      }
    }
  ]
}
|
@kiview yes, I know, but apparently, the testcontainers code doesn't do that (and returns 'localhost'), although it's pretty unclear to me why ... |
Some extra information: I've added more logging in the DockerClientConfigUtils class:
... and this results in the following logging ...
So the shell command I executed the command
|
@jochenhebbrecht oh, nice, thanks! Thanks for debugging it! Also, if you found how to fix that - that would be highly appreciated (not even talking about the PR :D ) |
I created a shell script on the Jenkins Cloud Slave which ran the 'ip route' command in a for loop about 1,000,000 times, and it always returned a valid value. So I don't think it's related to ECS-native networking. I found out that the LogToStringContainerCallback class doesn't receive the output of the Docker container. I'm afraid there's something wrong with the capturing of the stdout messages (probably in combination with ECS). |
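For readers following along, here is a minimal sketch of why an empty capture would end in 'localhost'. It assumes the gateway is taken from the `default via ...` line of `ip route` output and that 'localhost' is the fallback when nothing can be parsed. The class and method names (`GatewayParser`, `defaultGateway`, `dockerHost`) are hypothetical and this is not the Testcontainers source.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not Testcontainers code.
class GatewayParser {

    // Extracts the gateway from a line shaped like "default via <gateway> dev <iface>".
    static Optional<String> defaultGateway(String ipRouteOutput) {
        if (ipRouteOutput == null) {
            return Optional.empty();
        }
        for (String line : ipRouteOutput.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 3 && fields[0].equals("default") && fields[1].equals("via")) {
                return Optional.of(fields[2]);
            }
        }
        return Optional.empty();
    }

    // Assumed fallback: with no parsable gateway, only "localhost" is left.
    static String dockerHost(String ipRouteOutput) {
        return defaultGateway(ipRouteOutput).orElse("localhost");
    }

    public static void main(String[] args) {
        String sample = "default via 172.17.0.1 dev eth0\n172.17.0.0/16 dev eth0 scope link";
        System.out.println(dockerHost(sample)); // prints 172.17.0.1
        System.out.println(dockerHost(""));     // prints localhost
    }
}
```

Under these assumptions, the command itself can succeed every time while the host still resolves to 'localhost', because the parser only ever sees the (empty) captured output.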
@jochenhebbrecht wow, thanks! Yes, we observed some flakiness with |
@jochenhebbrecht please try |
Thanks, but meanwhile, I could solve the issue myself. This is what I've changed:
I'm now always retrieving the output of the 'ip route' command. I think it's related to the waiting time now being added. |
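The effect of waiting can be illustrated with a small, self-contained sketch. It does not use docker-java or Testcontainers; `OutputCollector` and `startFakeCommand` are hypothetical names, and the "container" is simulated by a thread that delivers its output after a delay. The point is only that output delivered asynchronously should be read after completion has been awaited.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a log callback; not the docker-java API.
class OutputCollector {
    private final StringBuffer buffer = new StringBuffer();
    private final CountDownLatch done = new CountDownLatch(1);

    // Called by the producer for every chunk of output.
    void onFrame(String frame) {
        buffer.append(frame);
    }

    // Called by the producer once the stream has ended.
    void onComplete() {
        done.countDown();
    }

    // Blocks until onComplete() was called or the timeout elapsed.
    boolean awaitCompletion(long timeout, TimeUnit unit) {
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    String output() {
        return buffer.toString();
    }

    // Simulates a one-off command whose output arrives after delayMillis.
    static OutputCollector startFakeCommand(String stdout, long delayMillis) {
        OutputCollector collector = new OutputCollector();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            collector.onFrame(stdout);
            collector.onComplete();
        });
        producer.setDaemon(true);
        producer.start();
        return collector;
    }

    public static void main(String[] args) {
        OutputCollector collector = startFakeCommand("172.17.0.1\n", 200);
        // Reading here races with the producer and will usually see nothing yet.
        System.out.println("before await: '" + collector.output().trim() + "'");
        if (collector.awaitCompletion(5, TimeUnit.SECONDS)) {
            System.out.println("after await: '" + collector.output().trim() + "'");
        }
    }
}
```

A read that races with the producer may or may not see data, which would be consistent with a failure that shows up only intermittently and more often on a loaded build server.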
@jochenhebbrecht that is also a good solution 👍 probably even better than mine :) The key here is Do you want to submit a PR? :) |
Yes, I'll make a PR. I'll keep you updated. |
PR created. All tests are green. |
…ing inside docker (#541) * fix: introduce 'withFollowStream' and 'awaitCompletion' methods to await the completion of one-off command * Fixed retrieval of Docker host IP when running inside Docker (#479) * fix: withStdErr() is not needed - withTailAll() is not needed (default: tail=all) * Update CHANGELOG.md
I updated to 1.5.1 and was not able to reproduce the error. Happy for you to close it. |
... which is weird, because they still need to release version 1.5.2 which will contain the fix? :-) ... |
Just as an update - the error did occur again, but a lot less often than before.
This was just to update, but hopefully the changes in 1.5.2 will fix it anyway. |
We have similar issues on 1.4.3. I've noticed that it happens when there are 2 or more parallel builds on our build server. I've never had this issue when running the builds locally. |
@agibalov 1.4.3 is old, the latest is 1.6.0, please update |
@bsideup upgrading to 1.6.0 fixed the issue for us. Thanks! |
@agibalov errr... FYI the latest now is 1.7.0 😂 |
I have a simple Redis integration test:
This passes every time running locally, however, when running in Jenkins (BlueOcean) it intermittently fails with the exception:
I've tried larger timeout values, but that doesn't seem to help. The docker sock port is exposed to the machine, the logs report lots of activity and the fact it sometimes passes must imply it's not simply that it can't interact with Docker.
I realize that this must be something to do with the environment, maybe as the build is already run in a Docker Container, so I don't expect you to have a fix for this, however, my question is, how do I start to debug such an issue? Maybe there is a way to see the logs from the container?