Skip to content
This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

Allow access to raw output stream #3

Open
neoeinstein opened this issue Aug 9, 2017 · 9 comments
Open

Allow access to raw output stream #3

neoeinstein opened this issue Aug 9, 2017 · 9 comments

Comments

@neoeinstein
Copy link
Member

neoeinstein commented Aug 9, 2017

Currently Representation only allows for supplying a byte array, which is then streamed out over the wire. This means that the byte array must be pre-constructed. This is not always convenient or efficient, especially for larger replies.

Proposal: Alter Representation by changing the type of the Data field from byte array to ResponseData. ResponseData would be defined as:

type ResponseData =
  | Bytes of byte array
  | DelayedStream of Stream -> Async<unit> // Job<unit> for Hopac

This will allow for streaming responses, which may mean less memory pressure in the long run. (I see an opportunity for Chiron to be able to stream its data to the wire in this way.)

This would be a breaking change, requiring the addition of the word Bytes to existing code that explicitly creates Representation records.

@et1975
Copy link

et1975 commented Aug 9, 2017

Would love some way to stream large content and at the moment I can't find a way to do this correctly. I'm not sure what my impl would look like given the Stream -> Async<unit> signature though or how would it fit with the rest of freya computations... could use an example.

@panesofglass
Copy link
Contributor

In general, I'm a big fan of this change. I wonder, though, whether the DelayedStream has the right signature. I would also like to see an example. Might a Freya computation be a better abstraction to return?

@neoeinstein
Copy link
Member Author

neoeinstein commented Aug 9, 2017

The degenerate implementation of a delayed stream would take a simple byte array. When using ByteArray, the default output for the representation would need to set a Transfer-Encoding: chunked header, since the content length won't be known before sending the content.

let degenerateCase bytes (stream: System.IO.Stream) =
  async {
    do! stream.WriteAsync(bytes, 0, bytes.Length) |> Async.AwaitIAsyncResult |> Async.Ignore
  }

let represent bytes =
  { Data = DelayedStream (degenerateCase bytes)
    Description = {} }

Substitute degenerateCase with some function that takes your DTO and sends it to a stream, and you have a delayed write.

@neoeinstein
Copy link
Member Author

@panesofglass Not sure that it would. It could be something that also gets some of the Freya state, but that would complicate the signature when we could just rely on the caller injecting the necessary dependencies through partial application.

Once we begin writing to the stream, the status and headers get sent, so all attributes of the response are no longer editable. It doesn't make much sense to expose them here if the consumer can't really use them.

@et1975
Copy link

et1975 commented Aug 9, 2017

since the content length won't be known before sending the content.

I don't know if this would be universally true, in my current scenario it is known (but large).

rely on the caller injecting the necessary dependencies through partial application.

how would this work if params injected are freya computations themselves?

@neoeinstein
Copy link
Member Author

@et1975 On content length, we could have DelayedStream take an additional ContentLength = KnownLength of uint64 | UnknownLength parameter. If a KnownLength, it would give a Content-Length header. If not, it would emit a Transfer-Encoding: chunked header.

On injections from freya computations, you can provide them in the freya {…} computation where you create the representation.

let handleOk =
  freya {
    let! method = Freya.Optic.get Request.method_
    return
      { Data = DelayedStream (useMethodInOutputStream method)
        Description = {} }
  }

@neoeinstein
Copy link
Member Author

/cc @ajgajg1134 from my own team

@btrepp
Copy link

btrepp commented Aug 14, 2017

Is the motivation to get memory usage down (eg remove preallocation? eg byte[] seq) or to have the server send more push-like data? (Async<byte[]> seq or some sort of Rx/FRP/Event based system?.)

Usually in my mind chunked-encoding looks like
type ChunkedData = byte[] seq
Which would satisfy processing a chunk at a time, with a seq based serialize, plus its easy enough to fall back/have Freya handle the encoding automatically.

Also https://tools.ietf.org/html/rfc2616#section-3.6. Implies that headers MAY be written after the body

      Chunked-Body   = *chunk
                        last-chunk
                        trailer
                        CRLF

       chunk          = chunk-size [ chunk-extension ] CRLF
                        chunk-data CRLF
       chunk-size     = 1*HEX
       last-chunk     = 1*("0") [ chunk-extension ] CRLF

       chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
       chunk-ext-name = token
       chunk-ext-val  = token | quoted-string
       chunk-data     = chunk-size(OCTET)
       trailer        = *(entity-header CRLF)

This stack overflow has an example of it
https://stackoverflow.com/questions/5590791/http-chunked-encoding-need-an-example-of-trailer-mentioned-in-spec

I think if it's a more event-y (eg Server Sent Events thing) maybe it should be outside of Representation, and use the static overloads to allow handleX to use it.

Eg It now takes Freya or Representation. So maybe it can take Freya or the like. That way existing code doesn't break.

@et1975
Copy link

et1975 commented Aug 14, 2017

Since we are looking at breaking changes, it would be ideal if it could also resume correctly (support range headers). @neoeinstein would that still fit into Representation?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants