Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gracefully stopped Jetty not flushing all response data #4047

Closed
tsurovec opened this issue Sep 2, 2019 · 10 comments
Closed

Gracefully stopped Jetty not flushing all response data #4047

tsurovec opened this issue Sep 2, 2019 · 10 comments

Comments

@tsurovec
Copy link

tsurovec commented Sep 2, 2019

import org.eclipse.jetty.server.Handler;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnectionStatistics;
import org.eclipse.jetty.server.handler.AbstractHandler;
import org.eclipse.jetty.server.handler.StatisticsHandler;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class NotGraceful extends AbstractHandler {
	public static void main(String... argv) throws Exception {
		Server server = new Server(8080);
		server.setStopAtShutdown(true);
		server.setStopTimeout(30_000);
		StatisticsHandler statisticsHandler = new StatisticsHandler();

		Handler handler = new NotGraceful();

		statisticsHandler.setHandler(handler);
		server.setHandler(statisticsHandler);
		ServerConnectionStatistics.addToAllConnectors(server);

		server.start();
		server.join();
	}

	@Override
	public void handle(String s, Request baseRequest, HttpServletRequest request, HttpServletResponse response) throws IOException {
		response.setContentLength(2);

		response.getOutputStream().write("a".getBytes());

		try {
			Thread.sleep(10_000);
		}
		catch (InterruptedException e)
		{
			e.printStackTrace();
		}

		response.getOutputStream().write("b".getBytes());
	}
}

This should run an embedded Jetty server with configured graceful shutdown and a simulated long running request.

Reproduction (with Jetty 9.4.14.v20181114):

  1. curl -vvv http://localhost:8080
  2. invoke shutdown while the request from step 1 is still in flight: jps | grep NotGraceful | cut -d' ' -f1 | xargs kill

Expected:
Jetty will let the request finish gracefully, with "ab" being sent to the client.

Actual:

*   Trying ::1:8080...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.65.3
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 02 Sep 2019 09:50:27 GMT
< Content-Length: 2
< Connection: close
< Server: Jetty(9.4.14.v20181114)
< 
* transfer closed with 1 bytes remaining to read
* Closing connection 0
curl: (18) transfer closed with 1 bytes remaining to read

The second byte isn't sent to the client.

The processing of the response hit the HttpGenerator.Result result = _generator.generateResponse(_info, _head, _header, chunk, _content, _lastContent); line in org.eclipse.jetty.server.HttpConnection.SendCallback#process 5 times - once with NEED_HEADER result and then twice with FLUSH and DONE results. The first DONE then signaled that the socket can be closed:

case DONE:                                                                                                                                                                             
{                                                                                                                                                                                       
// If shutdown after commit, we can still close here.                                                                                                                                   
if (getConnector().isShutdown())                                                                                                                                                        
    _shutdownOut=true;                                                                                                                                                                  
                                                                                                                                                                                        
return Action.SUCCEEDED;                                                                                                                                                                
}

so the second flush with the remaining data wasn't able to send it.

@sbordet
Copy link
Contributor

sbordet commented Sep 16, 2019

This issue is similar to #2788 for what the user seems to would like (i.e. having the server wait for all the in-flight requests to finish before closing all connections).

However, we don't have such feature yet although a few bits and pieces are there - but we need to come up with a clear API and do all the things that needs to be done for a graceful shutdown.

  • Stop the acceptor only (no more new connections).
  • Trigger stop of the various higher level components that handle applications and requests, and wait for a configurable timeout that they finish.
  • If they don't finish, hard stop.
  • Make sure requests on existing connections are rejected.
  • Stop the lower level components such as SelectorManager and ThreadPool.
  • Be notified with a Callback when the graceful stop is finished (possibly a failure when the timeout expires).

@joakime
Copy link
Contributor

joakime commented Sep 16, 2019

Stopping the acceptor would result in clients able to start the connect, bu not actually succeed to connect, right?

We've had others ask that all new requests during shutdown actually be responded to with 503 Unavailable during shutdown.

@sbordet
Copy link
Contributor

sbordet commented Sep 16, 2019

Not only not accept new connections, but also rejecting requests on existing connections.

@tsurovec
Copy link
Author

tsurovec commented Sep 16, 2019

In this case the connection is accepted before the shutdown starts and the request is finished gracefully from the application's point of view, so no problem there.
Correction: the second write throws org.eclipse.jetty.io.EofException; but otherwise Jetty seems to let the request finish

The problem is that not all data make it to the client.

One observation: when content length is set AFTER the data is written, the problem disappears:

...
@Override
public void handle(String s, Request baseRequest, HttpServletRequest request, HttpServletResponse response) throws IOException {
		response.getOutputStream().write("a".getBytes());

		try {
			Thread.sleep(10_000);
		}
		catch (InterruptedException e)
		{
			e.printStackTrace();
		}

		response.getOutputStream().write("b".getBytes());
		response.setContentLength(2);
	}
...

@gregw
Copy link
Contributor

gregw commented Sep 16, 2019

@sbordet The intention of the graceful stop of Statistics handler was that it would work with the Server.stop to provide more or less the sequence you describe above. This is a feature we do have and was working. If it is not working now it is because it is broken, not because it is not implemented.

@gregw
Copy link
Contributor

gregw commented Sep 17, 2019

It looks like we are being overzealous with our async close handling. When the handler tries to write "b" it is getting a:

org.eclipse.jetty.io.EofException
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:283)
	at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
	at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
	at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
	at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:814)
	at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
	at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
	at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:539)
	at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:842)
	at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:914)
	at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:259)
	at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:235)
	at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:641)
	at java.base/java.io.OutputStream.write(OutputStream.java:122)
	at org.eclipse.jetty.embedded.Graceful.handle(Graceful.java:35)
	at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:173)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
	at org.eclipse.jetty.server.Server.handle(Server.java:500)
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:386)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:560)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:378)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:783)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:919)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.nio.channels.AsynchronousCloseException
	at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:261)

@gregw
Copy link
Contributor

gregw commented Sep 17, 2019

Interesting observation about the content-length. Having that set before writing "b" causes the response to be committed within the write! I note that if I don't set the content length at all and let the commit happen after the return, it also works?

interesting!

@gregw
Copy link
Contributor

gregw commented Sep 17, 2019

There appears to be a problem in our aggregation / generation!

When the "b" is written, we see that it is the last byte of the content and thus trigger a flush. But in the aggregate buffer we only have the "a". So we flush HEADERS+"a" and then the generator says DONE! only after that do we flush the "b" in a separate write... which should not work... but does somehow because normally we are persistent... but when doing a graceful shutdown we do:

                    case DONE:
                    {
                        // If shutdown after commit, we can still close here.
                        if (getConnector().isShutdown())
                            _shutdownOut = true;

                        return Action.SUCCEEDED;
                    }

Thus the late write of "b" fails because we have shutdown the output!!!

So why are we signalling DONE before we have written the "b" ?
Why aren't we aggregating the "b" with the "a".. and if not, why are we saying it is the last write?

@gregw
Copy link
Contributor

gregw commented Sep 17, 2019

I think we broke this in #2772

gregw added a commit that referenced this issue Sep 17, 2019
Added test to reproduce issue
Fixed bug from #2772 where output was shutdown on DONE without checking for END.
Fixed aggregation logic to aggregate last write if aggregation already started.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
gregw added a commit that referenced this issue Sep 25, 2019
Added test to reproduce issue
Fixed bug from #2772 where output was shutdown on DONE without checking for END.
Fixed aggregation logic to aggregate last write if aggregation already started
Improved comments and clarify conditions

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@joakime
Copy link
Contributor

joakime commented Sep 25, 2019

PR #4094 merged.

@joakime joakime closed this as completed Sep 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants