weird behavior during log wait condition during container start #344
i think this problem was caused by a corrupt … i do, however, have a different problem now related to stopping the container.
and this is what i get when it hangs:
i haven't figured out if this is a docker issue or a plugin issue. initially i thought it was a problem w/ it taking too long for docker to kill the container, so i added … thoughts?
Sorry, no idea here so far. Could you create a thread dump when it hangs, in order to see where the plugin currently is?
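In case attaching an external tool is awkward in the Jenkins environment, here is a minimal, hypothetical helper (not part of the plugin) for printing a thread dump from inside the JVM; the usual route would be running the JDK's jstack tool against the Maven process:

```java
import java.util.Map;

// Minimal sketch: print a thread dump from inside the running JVM.
// This is only an in-process alternative to `jstack <pid>`.
public class ThreadDumper {
    public static void dump() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            Thread t = entry.getKey();
            System.err.printf("\"%s\" state=%s%n", t.getName(), t.getState());
            for (StackTraceElement frame : entry.getValue()) {
                System.err.println("    at " + frame);
            }
            System.err.println();
        }
    }
}
```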
maybe - let me see if i can get it to hang outside of jenkins and i can hit it w/ …
it looks like …
i wonder if there's a way to force …
this is the output from a successful …
so my current thought is the container is taking too long to shut down and the … i'm still digging deeper.
I'm running into the same httpclient issue too. In my case, it's stuck on a GET to unix://127.0.0.1:1/v1.18/containers/23fb3c95d18c/json
This is running on an EC2 instance, Ubuntu 14.04. Please let me know if there is any other information which will help.
Here is the tail end of the maven debug logs I got from a different repro.
What's interesting to me is that the request is made in one thread, and received in another thread (in this case http-outgoing-2 vs. http-outgoing-1). I can see it is also the case with the OP's example of a failed POST, where the GET is made on http-outgoing-35 and the 200 OK is received on http-outgoing-34. In the OP's example of a successful POST, the same thread makes the request and receives the response. Not sure if this is pertinent information...
I think it is normal behaviour for Apache's HcClient to send the request on one thread and receive it on another. At least this is what I always see for every request.
What's interesting in your logs, though, is that the server responds with a content-length of 1739 but then doesn't seem to continue sending anything. This could have two reasons: either the client is not reading (but from the stacktrace it is blocking on this read, so I don't think that is the case) or the server is not sending (which I think is more likely). IMO this is an issue on the Docker daemon side; could you check the logs there, too?
What's strange here is that the response stream still contains some input from a log-reading connection before the HTTP response header is received. This could be the reason for the hang, since the http client is still waiting for a valid response, I guess. So the problem seems to be that connections are reused too early by HcClient (i.e. while still in the middle of processing log output)? You probably don't have the logs anymore, but when this happens the next time, could you check in the log where the blocking connection is also used (here it would be …)
@brian98point6 Sorry, I misunderstood the logs. What you see are not threads, but connection objects which are potentially reused because the plugin uses a pooling connection manager.
If someone has a full log with debug enabled for the blocking case, that would help me quite a bit. The sequence of connection leasing / using / releasing can best be seen with …
There seem to be issues with long-running HTTP requests, like fetching logs with chunked encoding, combined with pooled connections. It seems that connections get released too soon and then reused in a different context while there is still log data to read. Now logging uses a non-pooled, single-connection client for async log fetching, which hopefully resolves the issues reported in #344 and #259.
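As a rough illustration of that change (a sketch of the idea only, not the plugin's actual code, assuming Apache HttpClient 4.x), regular API calls keep using a pooled connection manager while the long-running, chunked log stream gets its own single-connection client, so its connection can never be handed out to another request mid-stream:

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.BasicHttpClientConnectionManager;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Sketch of the idea behind the 0.15.1 change, not the plugin's actual code.
public class DockerClients {

    // Shared client for ordinary API requests: connections are pooled and reused.
    public static CloseableHttpClient apiClient() {
        PoolingHttpClientConnectionManager pool = new PoolingHttpClientConnectionManager();
        pool.setMaxTotal(20);
        return HttpClients.custom().setConnectionManager(pool).build();
    }

    // Dedicated client for the async log fetch: exactly one connection, never pooled,
    // so a half-read log stream cannot leak into an unrelated request.
    public static CloseableHttpClient logStreamClient() {
        BasicHttpClientConnectionManager single = new BasicHttpClientConnectionManager();
        return HttpClients.custom().setConnectionManager(single).build();
    }
}
```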
Thanks for clarifying threads vs. connection objects. You can tell I'm just guessing around :) Output of grep " Connection" on the debug log I saved last week: … I noticed that there was a "route allocated: 0" and found that http-outgoing-1 was closed when my postgres container was ready, and then used again during the db-migrate container run. Is that odd?
@brian98point6 Thanks a lot ;-). BTW, could you try 0.15.1, released just today? I changed the log handling quite a bit to use a dedicated HttpClient, so it shouldn't interfere with the rest anymore. From my feeling, this should fix a lot of the issues we had. Would be happy if you could give it a try ;-)
i haven't seen this problem happen for a little while. @brian98point6, if you still see this problem happening, can you try upgrading your version of docker? unfortunately i didn't record the version i was using when i filed this issue so it's also possible some of this issue was on docker's end.
@rhuss I tried v0.15.1 with the older docker version. It's successful 5 for 5, looking good. Thanks!! @jgangemi I tried v0.15.0 after upgrading docker from 1.7.0 to 1.9.1. It's still failing about half the time. Are you on the latest version of docker now? Meanwhile, I'll use 0.15.1 in our CI pipeline and confirm that the issue is resolved by tomorrow.
I've run maybe another 20 times. No hanging. So I think the issue is resolved with v0.15.1. Thank you so much!
@brian98point6 @jgangemi I found some issues, probably within the Unix socket implementation. How do you connect to the Docker daemon, via TCP or via Unix socket?
both - on my mac it's tcp, on linux it's the unix socket.
Ditto, but we haven't been able to reproduce this issue on mac/tcp, only on linux/unix sockets. BTW, I'm on the same team as @thomasvandoren. Mentioning this because I assume this is related to #436 (comment)
i'm going to try and see if i can verify if this is a unix socket vs tcp thing myself right now b/c this issue has crept back up for us.
ok - tcp works fine, unix socket does not; however, where it decides to hang in our rather large reactor build is non-deterministic.
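For anyone who wants to poke at the unix socket path outside of maven/jenkins, here is a rough sketch (assuming the jnr-unixsocket library; the two jnr class names below come from that library, everything else is hypothetical) that issues a plain HTTP /_ping request directly against the daemon socket:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.nio.channels.Channels;
import java.nio.charset.StandardCharsets;

import jnr.unixsocket.UnixSocketAddress;
import jnr.unixsocket.UnixSocketChannel;

// Rough sketch: talk HTTP to the Docker daemon's unix socket directly,
// bypassing the plugin, to see whether the socket itself behaves.
public class UnixSocketPing {
    public static void main(String[] args) throws Exception {
        UnixSocketAddress address = new UnixSocketAddress(new File("/var/run/docker.sock"));
        try (UnixSocketChannel channel = UnixSocketChannel.open(address)) {
            PrintWriter out = new PrintWriter(Channels.newOutputStream(channel), true);
            out.print("GET /_ping HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n");
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(Channels.newInputStream(channel), StandardCharsets.UTF_8));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // expect "HTTP/1.1 200 OK" and a body of "OK"
            }
        }
    }
}
```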
In my analysis it looked like the unix socket access sometimes gets confused when accessed from multiple threads. We use a second thread for checking the log wait condition asynchronously, so a proper solution would be to avoid this second thread and do the check on the main thread, which also prints the logging. Currently I don't see another solution except fixing the jnr-unixsocket library.
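Very roughly, that single-thread idea would look like the sketch below: the calling thread itself reads log lines until the wait pattern shows up, instead of handing the stream to a background reader. All names here are hypothetical; this is not the plugin's actual API:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical sketch of doing the log wait on the main thread, so that only
// one thread ever touches the stream backed by the unix socket.
public class LogWait {

    /** Reads container log output on the calling thread until the pattern matches
     *  or the deadline passes. Returns true if the pattern was seen. */
    public static boolean waitForLog(InputStream logStream, Pattern pattern, long timeoutMillis)
            throws IOException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(logStream, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);         // still print the log as it arrives
            if (pattern.matcher(line).find()) {
                return true;                  // wait condition satisfied
            }
            if (System.currentTimeMillis() > deadline) {
                return false;                 // give up instead of hanging forever
            }
            // note: readLine() itself can still block between checks; a real
            // implementation would also need a read timeout on the socket.
        }
        return false;                         // stream ended without a match
    }
}
```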
is there another unix socket library we could try? |
not that I'm aware of. |
from a quick google search... https://github.com/mcfunley/juds
We are still seeing this problem with 0.15.9.
I tried junixsocket, but it had even more severe issues. I really think the single-thread solution is the way to go, but it's not a small task. Hopefully we can tackle that soon, but we also love contributions ;-)
the last reason of … in the meantime, using the …
so this is a fun one...
it seems like every so often, something is interrupting the stream returned from docker while waiting on a log condition. the annoying thing is, i have no real test case that can easily reproduce this. it seems to occur at random during a rather large monolithic build. each project that needs the container starts one, which in turn causes a bunch of sql scripts to run which take ~30s or so.
basically the read call is blocking on data it expects to be there, but isn't. i'm kind of at a loss as to what could be causing this b/c it just fails at random times on an operation that has been successful before. when running w/ the -X option, it's also random as to where in the logs the stream cuts out.
any thoughts on this? i'm gonna play around in the logging code tomorrow and turn up docker's logging to see if it will tell me anything useful.