
WebSocket FragmentExtension can produce an invalid stream of frames #2491

Closed · jeremystone opened this issue Apr 26, 2018 · 15 comments

jeremystone commented Apr 26, 2018

When a number of large fragmented messages interspersed with smaller non-fragmented messages are sent over a websocket to a slow client (or over a slow network), the client sometimes sees one of the smaller messages before the final fragment of the previously sent larger message.

The stack trace on the client is:

org.eclipse.jetty.websocket.api.ProtocolException: Unexpected TEXT frame, was expecting CONTINUATION
	at org.eclipse.jetty.websocket.common.Parser.parseFrame(Parser.java:329)
	at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:241)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:560)
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:391)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:319)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:175)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:133)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:754)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:672)
	at java.lang.Thread.run(Thread.java:748)

Note that we first noticed this with a Jetty 9.4.9 server and a Netty-based client (and Netty detects the same thing), so the problem is likely in the fragment creation (in this case on the server end) rather than in the fragment re-assembly on the client.

I have created this gist, a JUnit test case showing that this seems to happen only with 'slow' readers. (Our original Netty/Jetty setup required a moderately slow internet link or something like NetLimiter to reliably reproduce it.)

Please let me know if you need any further information.

joakime commented Apr 26, 2018

Bizarre.

The stacktrace shows a parser exception.
That's single-threaded and reads/parses the stream; I can't see how we could screw that up.
We've been (since 2009) constantly testing our implementations against autobahn (the de facto websocket standard testing lib) and pywebsocket (the lib maintained by Google/Chrome).
This kind of bug would be surprising.

joakime self-assigned this Apr 26, 2018

jeremystone (Author) commented:

I added some logging to the test and grepped everything out except the 'nextOutgoingFrame' debug logging in FragmentExtension (see the log for the failing test here).

It appears as if everything initially happens on the 'server response' thread (which is what I called the single thread that does the send), neatly passing on 10 fragmented frames and then a short full-message frame. But after a while some of the work happens on a 'qtp523691575-17' thread.

Not sure if this is relevant(?)

joakime commented Apr 26, 2018

From the Wireshark capture, it's the sending side that's messing up.
There are overlapping messages seen on the wire.

What the testcase seems to be doing (see the sketch after this list):

  • Start a client socket with knowledge of the testcase scenario (size of message, and how much time to sleep on the read side between whole messages).
  • Connect to the server.
  • In client.onConnect(), send a message.
  • On server.onMessage(), start a new thread that uses sendString() to send messages to the remote.
    • The server will send the large message, then a short message, in a loop until the predetermined number of messages have been sent.
  • On client.onMessage(), simulate slowness with Thread.sleep() on each whole-message receive.
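
For orientation, here is a rough sketch of what those two endpoints could look like with the Jetty 9.4 annotated API. This is not the actual gist; the class names, constants, and payloads are made up for illustration.

    import java.io.IOException;

    import org.eclipse.jetty.websocket.api.Session;
    import org.eclipse.jetty.websocket.api.annotations.OnWebSocketConnect;
    import org.eclipse.jetty.websocket.api.annotations.OnWebSocketMessage;
    import org.eclipse.jetty.websocket.api.annotations.WebSocket;

    public class FragmentationScenario {
        static final int MESSAGE_COUNT = 10;   // hypothetical
        static final long READ_DELAY_MS = 100; // hypothetical slow-reader delay
        static final String LARGE = new String(new char[64 * 1024]).replace('\0', 'x'); // big enough to be fragmented

        @WebSocket
        public static class ScenarioServerSocket {
            @OnWebSocketMessage
            public void onMessage(Session session, String trigger) {
                // "server response" thread: push messages without waiting for the remote.
                new Thread(() -> {
                    try {
                        for (int i = 0; i < MESSAGE_COUNT; i++) {
                            session.getRemote().sendString(LARGE);   // fragmented by FragmentExtension
                            session.getRemote().sendString("short"); // small, sent as a single frame
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }, "server response").start();
            }
        }

        @WebSocket
        public static class SlowClientSocket {
            @OnWebSocketConnect
            public void onConnect(Session session) {
                try {
                    session.getRemote().sendString("start"); // kicks off the server loop
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

            @OnWebSocketMessage
            public void onMessage(String msg) {
                try {
                    Thread.sleep(READ_DELAY_MS); // simulate the slow reader
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }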

Initial blush: There's no regard for backpressure on the write side of this use case.

Initial guess (needs more testing):

Using sendString() from another thread isn't a great idea.
You'll need to use sendStringByFuture() or sendString(msg, callback) and check that the message has left the Java network stack before you send another.
This is the only way you'll know about backpressure from the remote endpoint when sending from another thread.
I'll try working that change in and see what happens.
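
For illustration, a minimal sketch of that suggestion, assuming Jetty 9.4's RemoteEndpoint API (the class name, message source, and timeout here are invented):

    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;

    import org.eclipse.jetty.websocket.api.RemoteEndpoint;
    import org.eclipse.jetty.websocket.api.Session;

    // Sends each message only after the previous write has completed, so the
    // sending thread observes backpressure from the remote endpoint.
    public class BackpressureAwareSender {
        public void sendAll(Session session, Iterable<String> messages) throws Exception {
            RemoteEndpoint remote = session.getRemote();
            for (String msg : messages) {
                Future<Void> write = remote.sendStringByFuture(msg);
                write.get(30, TimeUnit.SECONDS); // block until this message has left the local stack
            }
        }
    }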

joakime commented Apr 26, 2018

You can follow along with the test project at https://github.com/joakime/websocket-issue-2491

jeremystone (Author) commented:

OK. Thanks. Glad you're seeing it too.

joakime commented Apr 26, 2018

The backpressure behaviors are not the cause.

Need to dig into WebSocketRemoteEndpoint and FrameFlusher to see if the fault lies there.

joakime commented Apr 26, 2018

WebSocketRemoteEndpoint and FrameFlusher are not at fault.

Taking a closer look at the FragmentExtension now.

joakime commented Apr 26, 2018

Note: we are planning on eliminating the FragmentExtension, as it's incompatible with the WebSocket over HTTP/2 drafts, which enforce fragmentation anyway.

joakime added a commit that referenced this issue Apr 26, 2018
+ Adding testcase for RemoteEndpoint
+ Adding testcase for FrameFlusher

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>

joakime commented Apr 26, 2018

I've been able to narrow down the fault to the FragmentExtension. I now have a testcase that replicates the issue in isolation from the rest of the components.

joakime added a commit that referenced this issue Apr 26, 2018
+ Adding testcase to ensure no regression
+ All data frames that arrive, are now sent through the
  IteratingCallback to ensure that that input frame order
  is preserved.

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
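
For context, here is a minimal sketch of the IteratingCallback pattern that commit message refers to. It is not the actual FragmentExtension flusher; the queue, enqueue(), and writeAsync() names are invented for illustration.

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    import org.eclipse.jetty.util.IteratingCallback;

    // Every outgoing frame is appended to one queue and drained by a single
    // iterating loop, so frames go out in exactly the order they were handed in.
    public class OrderedFrameFlusher extends IteratingCallback {
        private final Queue<String> queue = new ConcurrentLinkedQueue<>();

        // Hypothetical entry point: all data frames go through the queue, none bypass it.
        public void enqueue(String frame) {
            queue.offer(frame);
            iterate(); // wakes the flusher if it is currently idle
        }

        @Override
        protected Action process() {
            String frame = queue.poll();
            if (frame == null)
                return Action.IDLE;  // nothing to do until enqueue() calls iterate() again

            writeAsync(frame);       // must call succeeded() (or failed()) when the write completes
            return Action.SCHEDULED; // process() runs again once succeeded() has been called
        }

        // Stand-in for the real asynchronous network write.
        private void writeAsync(String frame) {
            System.out.println("WIRE: " + frame);
            succeeded();
        }
    }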

joakime commented Apr 26, 2018

Fixed in commit c596fca on branch jetty-9.4.x-issue-2491-ws-fragmentextension

Here's what was happening (a simplified sketch follows the list):

  1. The FragmentExtension would enqueue data frames (TEXT/BINARY/CONTINUATION) if they were over the configured maxLength.
  2. The FragmentExtension had an internal flusher that drained that queue, sending large payloads out as fragments with support for backpressure.
  3. We had a large frame in the queue, being processed.
  4. A smaller frame arrived that was under the configured maxLength; it skipped the queue and was sent immediately. (a bug that was just fixed)
  5. The callbacks related to the original frame were being notified of success on every fragment produced. (a bug that was just fixed)
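
Here is a simplified, runnable model of that interleaving (not the actual FragmentExtension code; the class, payloads, and maxLength value are made up):

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Frames over maxLength are queued and drained by a flusher paced by
    // backpressure; smaller frames skipped the queue entirely (the pre-fix bug).
    public class FragmentInterleaveSketch {
        static final int MAX_LENGTH = 4; // hypothetical fragmentation threshold
        final Queue<String> pendingFragments = new ArrayDeque<>();

        void outgoingFrame(String message) {
            if (message.length() > MAX_LENGTH) {
                // Split into fragments and enqueue them for the flusher.
                for (int i = 0; i < message.length(); i += MAX_LENGTH) {
                    boolean first = (i == 0);
                    boolean last = (i + MAX_LENGTH >= message.length());
                    String opcode = first ? "TEXT(fin=false)" : last ? "CONTINUATION(fin=true)" : "CONTINUATION(fin=false)";
                    pendingFragments.add(opcode + " \"" + message.substring(i, Math.min(i + MAX_LENGTH, message.length())) + "\"");
                }
            } else {
                // BUG (pre-fix): small frames bypassed the queue and were written immediately.
                System.out.println("WIRE: TEXT(fin=true) \"" + message + "\"");
            }
        }

        void flush(int budget) {
            // The real flusher drains asynchronously; 'budget' stands in for how far
            // backpressure lets it get before the next frame arrives.
            while (budget-- > 0 && !pendingFragments.isEmpty())
                System.out.println("WIRE: " + pendingFragments.poll());
        }

        public static void main(String[] args) {
            FragmentInterleaveSketch ext = new FragmentInterleaveSketch();
            ext.outgoingFrame("a large fragmented message");
            ext.flush(2);                 // slow client: only the first fragments go out
            ext.outgoingFrame("hi");      // small frame jumps the queue mid-message
            ext.flush(Integer.MAX_VALUE); // the remaining CONTINUATION frames arrive too late
        }
    }

Running main() prints a standalone TEXT frame between the fragments of the first message, which is exactly the invalid sequence the client parser rejects.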

jeremystone commented Apr 26, 2018 via email

joakime added a commit that referenced this issue Apr 26, 2018

joakime commented Apr 26, 2018

9.4.10 final release should occur sometime within the next 2 weeks.
Once I get CI verification of this fix, I'll at least push a SNAPSHOT (for you).

joakime added a commit that referenced this issue Apr 26, 2018
…entextension

Issue #2491 WebSocket FragmentExtension produces out of order Frames
joakime changed the title from "Fragmented websocket frames sometimes received out of order for slow clients" to "WebSocket FragmentExtension can produce an invalid stream of frames" Apr 26, 2018

joakime commented Apr 27, 2018

@jeremystone please try Jetty 9.4.10.RC1 (currently on maven central)

jeremystone (Author) commented:

9.4.10.RC1 looks good. Not been able to reproduce with this. Thanks again.

joakime commented Apr 27, 2018

No, thank you for the issue report with reproducer! (so important to get to the bottom of the issue)
