Netty connector hang up after repeated buffer overflow errors when writing data #5753
Comments
I am running and the program finishes normally. Could you provide a thread dump when it hangs?
Did you inject the IOExceptions in the JerseyChunkedInput class as I suggested? Because I don't see that output in your test above; with the exceptions injected, I see this:
At that point the test hung for me. Here also is a thread dump from when it occurs. At least part of the problem is that thread nioEventLoopGroup-2-4 is in an infinite loop within the doFlush() method. As long as the channel is writable, the method keeps trying to read from the JerseyChunkedInput queue that had the IOException. Though pulling from the queue times out every 2 seconds, the logic in doFlush() is such that the loop doesn't exit. I would guess (but I'm quite unfamiliar with the code) that the channel ought to have been closed (and hence made unwritable) as part of the IOException handling.
exceptions seen in production logs -
I am reproducing it now. It works if
…iting data eclipse-ee4j#5753 Signed-off-by: Jorge Bescos Gascon <jorge.bescos.gascon@oracle.com>
Closing, as the fix was merged.
We have some code that repeatedly makes async JAX-RS requests using the Netty Connector; the basic call is quite simple:
The attached example program executes these calls within a loop, for a given number of threads. Under normal circumstances, this works fine: all the requests go through and get processed, and the test program exits. Sometimes, however, the test program will hang: it will simply cease processing the remaining results.
We have tracked this down to a problem when the writes to the remote system are buffered within JerseyChunkedInput.write(), which puts the data on a queue of size 8 (by default). If that queue is full, the write() method throws an IOException. For a single request, this IOException is propagated up correctly; the CompletableFuture reports the exception and processing can continue. After some number of these exceptions, however, the program hangs: the Netty stack has somehow lost track of its callbacks/promises.
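To make the failure mode concrete, here is a minimal stand-alone model of a bounded chunk queue whose write() fails once the queue is full. It is a sketch, not the Jersey source: the class names, the timed offer, and the exception message are assumptions; only the capacity of 8 and the IOException on a full queue come from the description above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** Simplified model of a bounded chunk queue (hypothetical; not the Jersey class). */
class BoundedChunkQueue {
    private final LinkedBlockingDeque<ByteBuffer> queue;
    private final long offerTimeoutMillis; // assumption: writer waits briefly for a free slot

    BoundedChunkQueue(int capacity, long offerTimeoutMillis) {
        this.queue = new LinkedBlockingDeque<>(capacity);
        this.offerTimeoutMillis = offerTimeoutMillis;
    }

    /** Enqueues one chunk; throws IOException if no slot frees up within the timeout. */
    void write(byte[] data) throws IOException {
        try {
            boolean queued = queue.offer(ByteBuffer.wrap(data),
                    offerTimeoutMillis, TimeUnit.MILLISECONDS);
            if (!queued) {
                throw new IOException("chunk queue full");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IOException(e);
        }
    }

    int size() {
        return queue.size();
    }
}

class BoundedChunkQueueDemo {
    public static void main(String[] args) {
        // Capacity 8 as in the issue; nothing drains the queue, so the 9th write fails.
        BoundedChunkQueue q = new BoundedChunkQueue(8, 50);
        for (int i = 1; i <= 9; i++) {
            try {
                q.write(new byte[] {(byte) i});
                System.out.println("write " + i + " queued");
            } catch (IOException e) {
                System.out.println("write " + i + " failed: " + e.getMessage());
            }
        }
    }
}
```

In this model nothing consumes the queue, so the ninth write always fails. In the real connector the consumer is the Netty event loop, so whether a write fails depends on how quickly that loop drains the queue.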
The easiest way to reproduce this is to modify the JerseyChunkedInput.write() method to periodically throw an IOException, something like this:
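For illustration only (this is not the patch used above), a periodic failure could be injected along these lines. The helper class, its name, and the period are all hypothetical; the idea is to call maybeFail() at the top of the write() method so that every Nth call throws.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical fault injector: throws an IOException on every Nth call. */
class PeriodicFaultInjector {
    private final AtomicLong calls = new AtomicLong();
    private final int period;

    PeriodicFaultInjector(int period) {
        if (period <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        this.period = period;
    }

    /** Counts the call and fails when the count is a multiple of the period. */
    void maybeFail() throws IOException {
        long n = calls.incrementAndGet();
        if (n % period == 0) {
            throw new IOException("injected failure on call " + n);
        }
    }
}

class PeriodicFaultInjectorDemo {
    public static void main(String[] args) {
        // Assumed period of 3: calls 3 and 6 fail, the rest pass.
        PeriodicFaultInjector injector = new PeriodicFaultInjector(3);
        for (int i = 1; i <= 6; i++) {
            try {
                injector.maybeFail();
                System.out.println("call " + i + " ok");
            } catch (IOException e) {
                System.out.println("call " + i + ": " + e.getMessage());
            }
        }
    }
}
```

The AtomicLong counter keeps the count consistent when several client threads write concurrently, which matches how the test program drives the connector.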
With that in place, the attached test program will run for a while. After about 28 calls, it will get quite sluggish, and after about 34 calls it will hang altogether. (Those numbers will likely be different on other systems.)
To run the attached program, the pom dependency is
Then it needs three arguments: the URL to call, the number of threads, and the number of times each thread should make the call, so something like:
java <cp> com.oracle.psr.nettybug.NettyBug http://100.105.9.29:7001/console/login/LoginForm.jsp 5 10
nettybug.tar.gz