SSE for Rocket 0.4.x #1365

Closed · wants to merge 7 commits
Conversation

@ijackson (Contributor) commented Jul 5, 2020

Hi. I'm writing a Rocket application which uses JS SSE. I had some difficulties getting my events to get through in a timely fashion. After some investigation, I came up with this MR.

With this MR, I am able to use SSE with Rocket as expected. I implement a Read which blocks and dribbles out data as needed. (I think I will need to increase my thread pool size and I will probably have to do the multiple-domains-SSE trick to avoid the SSE connection limit bug.)
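For concreteness, a minimal sketch of the shape of such a handler in Rocket 0.4 (`Ticker` is an invented example reader, not the application's actual code); without the flushing change proposed here, its output tends to linger in Rocket's and hyper's buffers rather than reach the client promptly:

```rust
#![feature(proc_macro_hygiene, decl_macro)]
#[macro_use] extern crate rocket;

use std::io::{self, Read};
use std::{thread, time::Duration};

use rocket::http::ContentType;
use rocket::response::{content::Content, Stream};

// Invented example: a blocking reader that emits one SSE frame per second.
struct Ticker { n: u64 }

impl Read for Ticker {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Block until the next event is due, then write one SSE frame.
        thread::sleep(Duration::from_secs(1));
        self.n += 1;
        let frame = format!("data: tick {}\n\n", self.n);
        // (A real implementation must carry over any bytes that don't fit.)
        let len = frame.len().min(buf.len());
        buf[..len].copy_from_slice(&frame.as_bytes()[..len]);
        Ok(len)
    }
}

#[get("/events")]
fn events() -> Content<Stream<Ticker>> {
    // SSE responses use the text/event-stream content type.
    Content(ContentType::new("text", "event-stream"),
            Stream::from(Ticker { n: 0 }))
}

fn main() {
    rocket::ignite().mount("/", routes![events]).launch();
}
```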

Please see the commit message of the 2nd commit for lots of discussion about my API design strategy. The API is certainly not as pretty as it might be in some alternative universe, but I think it's about as good as we're going to get for this one. At least it doesn't get in the way of 'normal' use of Rocket.

To save you digging it out of the GitHub UI, here it is:

Stream: Provide a way to flush chunks, to support SSE

Problem:

To support Server-Sent Events (SSE, aka JS EventSource) it is
necessary for the server to keep open an HTTP request and dribble out
data (the event stream) as it is generated.

Currently, Rocket handles this poorly.  It tries to accumulate
complete chunks before writing anything.  Also there is no way to
force a flush of the underlying stream: in particular, there is a
BufWriter underneath hyper.  hyper would honour a flush request, but
there is no way to send one.

Options:

Ideally the code which is producing the data would be able to
explicitly designate when a flush should occur.  Certainly it would
not be acceptable to flush all the time for all readers.

1. Invent a new kind of Body (UnbufferedChunked) which translates the
data from each Read::read() call into a single call to the stream
write, and which always flushes.  This would be a seriously invasive
change.  And it would mean that SSE systems with fast event streams
might work poorly.

2. Invent a new kind of Body which doesn't use Read at all, and
instead has a more sophisticated API.  This would be super-invasive
and heavyweight.

3. Find a way to encode the necessary information in the Read trait
protocol.

Chosen solution:

It turns out that option 3 is quite easy.  The read() call can return
an io::Error.  There are at least some errors that clearly ought not
to occur here.  An obvious one is ErrorKind::WouldBlock.

Rocket expects the reader to block.  WouldBlock is only applicable to
nonblocking objects.  And indeed the reader will generally want to
return it (once) when it is about to block.

We have the Stream treat io::Error with ErrorKind::WouldBlock, from
its reader, as a request to flush.  There are two effects: we stop
trying to accumulate a full chunk, and we issue a flush call to the
underlying writer (which, eventually, makes it all the way down into
hyper and BufWriter).
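For illustration, a reader following this convention might look like the following sketch (`EventReader` and its mpsc channel are invented for the example): drain any buffered data, return `WouldBlock` exactly once when the buffer empties, and only then actually block.

```rust
use std::io::{self, ErrorKind, Read};
use std::sync::mpsc::{Receiver, TryRecvError};

// Invented example type: events arrive on a channel from elsewhere in the app.
struct EventReader {
    events: Receiver<Vec<u8>>,
    pending: Vec<u8>,
    flush_requested: bool,
}

impl Read for EventReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        loop {
            // First, hand out any data we already have.
            if !self.pending.is_empty() {
                let n = self.pending.len().min(buf.len());
                buf[..n].copy_from_slice(&self.pending[..n]);
                self.pending.drain(..n);
                self.flush_requested = false;
                return Ok(n);
            }
            match self.events.try_recv() {
                Ok(event) => self.pending = event,
                Err(TryRecvError::Empty) if !self.flush_requested => {
                    // About to block: ask the server (once) to flush the
                    // partial chunk accumulated so far.
                    self.flush_requested = true;
                    return Err(ErrorKind::WouldBlock.into());
                }
                Err(TryRecvError::Empty) => match self.events.recv() {
                    // Flush already requested, so now genuinely block.
                    Ok(event) => self.pending = event,
                    Err(_) => return Ok(0), // sender gone: end of stream
                },
                Err(TryRecvError::Disconnected) => return Ok(0), // end of stream
            }
        }
    }
}
```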

Implementation:

We provide a method ReadExt::read_max_wfs which is like read_max but
which handles the WouldBlock case specially.  It tells its caller
whether a flush was wanted.

This is implemented by adding a new code path to read_max_internal,
with a boolean to control it.  This seemed better than inventing a
trait or something.  (The other read_max call site is reading http
headers in data.rs, and I think it wants to treat WouldBlock as an
error.)
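The shape of that helper, as a sketch (illustrative only, not Rocket's actual internals): it returns both the byte count and a flag saying whether the reader requested a flush.

```rust
use std::io::{self, ErrorKind, Read};

// Sketch: like read_max, but "with flush support". Fills `buf` as far as
// possible; a WouldBlock from the reader ends the chunk early and is
// reported as "flush wanted" rather than as an error.
fn read_max_wfs<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<(usize, bool)> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF: the chunk is as full as it will get
            Ok(n) => filled += n,
            Err(ref e) if e.kind() == ErrorKind::WouldBlock => return Ok((filled, true)),
            Err(e) => return Err(e),
        }
    }
    Ok((filled, false))
}
```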

Risks and downsides:

Obviously this ad-hoc extension to the Read protocol is not
particularly pretty.  At least, people who aren't doing SSE (or
similar) won't need it and can ignore it.

If for some reason the data source is actually nonblocking, this new
arrangement would spin, rather than treating the situation as a fatal
error.  This possibility seems fairly remote, in production settings
at least.  To mitigate this it might be possible for the loop in
Rocket::issue_response to bomb out if it notices it is sending lots of
consecutive empty chunks.
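Such a bail-out could be as simple as a counter over empty chunks (purely illustrative; nothing like this is in the MR, and the threshold is made up):

```rust
use std::io;

const MAX_CONSECUTIVE_EMPTY: u32 = 1000; // made-up threshold

// Called once per chunk sent: error out if the reader keeps requesting
// flushes without ever producing data, i.e. a non-blocking source spinning.
fn note_chunk(consecutive_empty: &mut u32, chunk_len: usize) -> io::Result<()> {
    if chunk_len > 0 {
        *consecutive_empty = 0;
    } else {
        *consecutive_empty += 1;
        if *consecutive_empty > MAX_CONSECUTIVE_EMPTY {
            return Err(io::Error::new(
                io::ErrorKind::TimedOut,
                "reader appears to be non-blocking; aborting response",
            ));
        }
    }
    Ok(())
}
```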

It is possible that async Rocket will want to take a different
approach entirely.  But it will definitely need to solve this problem
somehow, and naively it seems like the obvious transformation to e.g.
the Tokio read trait would have the same API limitation and admit the
same solution.  (Having a flush occur every time the body stream
future returns Pending would not be good for performance, I think.)

Background and references:

I found these issues already:

PS: Thanks for Rocket. This is my 2nd Rocket application so consider yourselves appreciated :-).

@igalic commented Jul 5, 2020

the amount of documentation and reasoning in this pr is absolutely 😻

@SergioBenitez (Member) commented Jul 23, 2020

This is very clever. Excellent proposal, @ijackson! My primary concern is backwards compatibility. Previously, an Err(WouldBlock) would result in an error, whereas now it does not. Thus, this is technically a breaking change.

However, the documentation for WouldBlock clearly states (emphasis mine):

> The operation needs to block to complete, but the blocking operation was requested to not occur.

That is, the error should only be returned if non-blocking I/O was requested. Thus, we should be in the clear.

Nevertheless, there is no guarantee, especially because we can't guarantee that a non-blocking reader from the user was passed in. Can we mitigate the chance of a true breakage? One idea is to check if the request returns WouldBlock twice and return the Error if so. Presumably, if a true WouldBlock is returned once, it will be returned twice. A Reader in the know would only return it once, of course.
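That mitigation might look something like the following wrapper (a sketch; the names and structure are invented): the first `WouldBlock` becomes a flush request, and a second consecutive one is surfaced as a real error.

```rust
use std::io::{self, ErrorKind, Read};

enum Outcome { Data(usize), Flush }

struct FlushAware<R: Read> {
    inner: R,
    just_flushed: bool,
}

impl<R: Read> FlushAware<R> {
    fn read_or_flush(&mut self, buf: &mut [u8]) -> io::Result<Outcome> {
        match self.inner.read(buf) {
            Ok(n) => {
                self.just_flushed = false;
                Ok(Outcome::Data(n))
            }
            Err(ref e) if e.kind() == ErrorKind::WouldBlock && !self.just_flushed => {
                // First WouldBlock since the last data: a flush request.
                self.just_flushed = true;
                Ok(Outcome::Flush)
            }
            // A second consecutive WouldBlock (or any other error) is treated
            // as genuine: the reader is presumably truly non-blocking.
            Err(e) => Err(e),
        }
    }
}
```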

@jebrosen I am in favor of this, pending a resolution to the above as well as your thoughts.

@jebrosen (Collaborator) commented
Assuming this actually works as intended (I haven't made any attempt at testing it!), I'm generally on board.

Instead of assigning special meaning to WouldBlock, could we require io::Error::new(_, rocket::response::RequestFlush), and check this against the Error's source() with <dyn Error>::is::<RequestFlush>? I admit that checking the source() is probably slightly more expensive than checking the kind(), but it feels slightly less hacky to me.
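Sketched out, that alternative might look like this (illustrative; `RequestFlush` is the proposed marker type, and the check here goes through `io::Error::get_ref`, which exposes the attached error value):

```rust
use std::error::Error;
use std::{fmt, io};

// Proposed marker type: attached as the payload of an io::Error to mean
// "please flush", with no overloading of an existing ErrorKind.
#[derive(Debug)]
pub struct RequestFlush;

impl fmt::Display for RequestFlush {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("flush requested")
    }
}

impl Error for RequestFlush {}

// Reader side: signal a flush.
pub fn flush_request() -> io::Error {
    io::Error::new(io::ErrorKind::Other, RequestFlush)
}

// Server side: recognise the signal by downcasting the payload.
pub fn is_flush_request(e: &io::Error) -> bool {
    e.get_ref().map_or(false, |inner| inner.is::<RequestFlush>())
}
```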

@ijackson (Contributor, Author) commented Jul 23, 2020 via email

@ijackson (Contributor, Author) commented Sep 6, 2020

I thought of a better way to mitigate the risk of breakage: put this behind a feature gate. ISTM that Rocket applications that want this will in any case want to opt in to it, so that will be fine. Perhaps some library in an SSE-using Rocket application will cause trouble by generating spurious WouldBlock errors, but having to opt into this feature will mean seeing the docs which mention this. And if we make it a non-default feature, definitely no existing users will be adversely affected.
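In cfg terms the gate might look like this (illustrative; "sse" is the feature name proposed here):

```rust
use std::io::ErrorKind;

// Applications would opt in via their Cargo.toml, e.g.:
//     rocket = { version = "0.4", features = ["sse"] }
// Without the feature, WouldBlock keeps its old meaning: a plain error.
fn is_flush_request(kind: ErrorKind) -> bool {
    cfg!(feature = "sse") && kind == ErrorKind::WouldBlock
}
```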

If you like this idea I will send an updated MR. I'm quite keen to drop my vendored copy of Rocket...

@SergioBenitez (Member) commented
@ijackson Yes, I think that would adequately resolve my concerns.

We are going to want to provide a more sophisticated entrypoint in a
moment.  This new function is going to get a new call site from a new
function in ReadExt.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
[Commit message: "Stream: Provide a way to flush chunks, to support SSE" — quoted in full in the description above.]

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
This eliminates the risk of breakage to existing applications.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
@ijackson (Contributor, Author) commented
Hi. I made that change to make the new behaviour depend on the "sse" feature flag. Please let me know if you'd like me to squash that into the actual implementation commit.

I see that Azure has tried to test this. I've looked at the logs of the failed tests and they seem to have been "cancelled" for no readily apparent reason. I don't think this is anything to do with what is in my branch. If this isn't a known problem, please let me know where to look to find the relevant log...

Thanks,
Ian.

@inzanez commented Oct 1, 2020

I'm really looking forward to this! :-)

@ijackson (Contributor, Author) commented Oct 1, 2020

Should I do a null rebase and re-force-push to retry the failing CI tests?

@SergioBenitez (Member) commented
> Should I do a null rebase and re-force-push to retry the failing CI tests?

Please do!

@ijackson (Contributor, Author) commented Oct 2, 2020

Please see #1443. That tested a tree identical to current upstream v0.4 and all its tests failed too. So I think these failures are nothing to do with my SSE changes.

Please let me know when you think this is fixed...

@jebrosen (Collaborator) commented Oct 2, 2020

This branch should be re-triable now; there was a bug in downloading nightly via rustup that was fixed just 10 minutes ago: rust-lang/rustup#2504 (comment)

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
@ijackson (Contributor, Author) commented Oct 2, 2020

Aha! That looks better. Thanks for the help.

You'll see I added a commit providing an example. I haven't provided a #[test] for it because I didn't have time (esp. time to think about buffering in http client libraries), but I have tested it locally with a browser.

This example was derived from an earlier test case of mine, where I
had set the chunk size to 1 to try to track down the buffering
problem.  But the low chunk size is not needed.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Don't use the explicit chunk size for the Stream.
Make std::io::BufReader a `use`, which improves readability.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
The docs say the default is 8KB but "may change".  Also, add a comment
explaining the relevance of the `BufReader` and its size.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
@ijackson (Contributor, Author) commented Oct 2, 2020

I think I have finished tidying this up now. Sorry for the noise and let me know if you would like me to squash any of it.

@ijackson (Contributor, Author) commented
@SergioBenitez is there anything else you need from me? Thanks.

@SergioBenitez (Member) commented
@ijackson A way to conjure time is always appreciated. ;)

The example didn't actually make use of the WouldBlock feature, so it didn't work. I've fixed the example as well as various style issues. Doing a final review now, then pushing.

@SergioBenitez (Member) commented
Merged in c24a963 with fixes in 3970783. Will prep a new release soon.

@SergioBenitez added the `pr: merged` label (This pull request was merged manually.) on Oct 30, 2020
@ijackson (Contributor, Author) commented
> @ijackson A way to conjure time is always appreciated. ;)

Haha :-).

> The example didn't actually make use of the WouldBlock feature, so it didn't work. I've fixed the example as well as various style issues. Doing a final review now, then pushing.

Sorry about that. I must have broken it after I tested it. Thanks for doing the fixup!
