Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better stream error handling #184

Closed
juliangruber opened this issue Jan 13, 2014 · 36 comments
Closed

better stream error handling #184

juliangruber opened this issue Jan 13, 2014 · 36 comments

Comments

@juliangruber
Copy link
Contributor

Some ideas how to better handle streams, the biggest problem being error handling.

As in #180 (comment)

// what you have to do currently
// ugly as fuck
app.use(function*(){
  var a = streamA().on('error', this.onerror);
  var b = streamB().on('error', this.onerror);
  this.body = a.pipe(b).pipe(streamC());
});

With multipipe.

// with `multipipe`
app.use(function*(){
  this.body = pipe(streamA(), streamB(), streamC());
});
// for fs leaks, also accept functions as this.body
// and call them when no other handler is set
// leaky as fuck
app.use(function*(){
  this.body = fs.createReadStream.bind(null, 'file.html');
});
@jonathanong
Copy link
Member

you don't need to this.onerror.bind(this). we already do that for you! https://github.com/koajs/koa/blob/master/lib/application.js#L158

i don't think multipipe will really work because this.body = [] should always imply JSON. it would be confusing otherwise. i don't see any use-cases where this.body = multipipe(stream1, stream2) would actually be useful.

i don't want to add another body type.

i think right now, the easiest way is just to use domains (which i don't know enough about, but it should work), and attach streams to the domain when this.body = stream. everything else I though of has too edge cases:

set body(val) {
  this.domain.add(val)
}

then no more .onerror stuff for streams. we would still need .onerror for running stuff in separate "coroutines" or for general error handling in callbacks:

var body = this.body = new PassThrough()
co(function* () {
  yield setImmediate
  body.write('something')
  body.end()
}).call(this, this.onerror)

the biggest issue for me though is stuff like fd leaks, which is why i made really stupid utilities like https://github.com/stream-utils/dethroy and https://github.com/expressjs/on-socket-error (which handles close events as well. should probably rename the module).

@juliangruber
Copy link
Contributor Author

I updated this.onerror usage and removed the this.body = Array thing, yeah that was totally broken.

Multipipe does work as it just returns a stream, and doesn't have anything to do with the mentioned example. What I had there was an on how to integrate it into koa.

Adding streams to a domain on set body doesn't add other streams in that pipe to the domain, i.e. if you do

this.body = a.pipe(b)

only b will be added to the domain. Wrapping the streams in a pipe seems like the only option for that.

@jonathanong
Copy link
Member

gah domains are useless. i thought they handled that. i'd be +1 for a module that seeks through the pipe chain and adds error handlers for event emitters that don't have any.

@juliangruber
Copy link
Contributor Author

Domains would only work if you wrapped all app.use(fn) functions inside one, because streams created inside domain.run(fn) will be added to that domain. Unfortunately though, I heard domains impose quit an overhead.

@tj
Copy link
Member

tj commented Jan 13, 2014

I don't like the indirection they add either

@jonathanong
Copy link
Member

hmmm we could also "dump" any unused streams. i.e. in *respond

if (body.readable) {
  if (head) {
    body.resume()
    res.end()
  }
}

and we'll do the same when appropriate like overwriting body values with null. less than ideal, but it'll work!

@jeromew
Copy link
Contributor

jeromew commented Jan 16, 2014

Here is another option for stream error handling : https://npmjs.org/package/stream-try-catch

_try = require('stream-try-catch').try;
this.body = _try(s1).pipe(s2).pipe(s3).catch(this.onerror);

the try/catch block catches all errors emitted by s1, s2 or s3.

I'd be interested to hear your criticism on this (and happy to make modifications since I wrote this for my koa stream needs)

@tj
Copy link
Member

tj commented Jan 16, 2014

@jeromew I think the main problem is that errors don't propagate, if you don't have access to s1, the errors won't propagate to s2 or s3 so you're still f*#$ed

@tj
Copy link
Member

tj commented Jan 16, 2014

which I understand is somewhat by design since some streams would require specific cleanup etc, but more often than not I think they do/should clean up after themselves so I think conceptually it's a little broken. Also streams don't have any references to their "parents" right? so we're screwed as far as crawling upwards goes I think

@jeromew
Copy link
Contributor

jeromew commented Jan 16, 2014

@visionmedia yes if someone wants to do this.body = brokenStream where deep inside brokenStream something is not catched you're trapped I don't think there's much you can do about it.

only domains can do that but they have their own problems.

multipipe or try-catch can alleviate the problem.

what are you looking for exactly in koa regarding streams ?

@tj
Copy link
Member

tj commented Jan 16, 2014

just something that works nice out of the box would be ideal but it seems like streams are always going to be the wart

@jeromew
Copy link
Contributor

jeromew commented Jan 16, 2014

I just looked at the definition of pipe (https://github.com/joyent/node/blob/master/lib/_stream_readable.js#L461) the problem is that only readables have the static knowledge of the stream pipe graph in pipes = []

so indeed if you don't know s1, you're blocked.

do you only want to "silence" the emitted errors to try and catch any error before body is piped into res ?

I am sure there is some low impact monkey-patching that can be done to be informed of all the streams created but I don't know if you want to go this far to protect the users.

or node.js could patch the writable impl so that writables keep a list of all the readables that pipe into them. This way you could kind of rebuild the stream graph

do you want to publicize "cluster" in koa ? if the answer is yes, then a best practice could be that stream graph building should always take place outside of the middleware flow and the master would communicate with clusters via socket for example. in the worker, you can more easily use domain, keep a pool of workers and replace them when there is a massive leak.

that's not very easy to explain to users but that's scalable (socket to server on workers or remote servers is a config operation) + separates the koa middleware from the content delivery.

@tj
Copy link
Member

tj commented Jan 16, 2014

well the whole point of koa is to help mitigate node's brutal error handling, so anything domain/crash related doesn't help

@jonathanong
Copy link
Member

yeah at this point i give up. just have users handle the brutal stream handling however they see fit. it's not that difficult - we can just have really good docs (ugh) with all the edge cases we can think of.

@eivindfjeldstad
Copy link
Member

I took a stab at handling streams with domains, and got it to work reasonably well. A small patch to co was needed to retain the context of the error, but overall it seemed like a reasonable approach. Idk about performance though. The errors that are caught by domains have a domainThrown and a domainEmitter property, so it's possible to know if the error was thrown or emitted, and possibly prevent an undefined state. I think It's worth playing around with.

@eivindfjeldstad
Copy link
Member

Basically co needs to check process.domain and bind it to next. Then it's just a matter of creating a new domain in app.callback() and bind the fn returned by co to that domain. There are probably a few edge cases, but it seems to work okay.

@tj tj added this to the 1.0 milestone Mar 13, 2014
@jonathanong
Copy link
Member

how about adding a makeshift domain. i.e. an object with the same API, but not actually domains. something like:

var onerror = this.onerror
this.domain.add = function (emitter) {
  if (!~emitter.listeners.indexOf(onerror)) emitter.on('error', onerror)
}
this.domain.remove = function (emitter) {
  emitter.removeListener('error', onerror)
}

easier than doing .on('error', this.onerror) and making sure there's only one listener.

if people want to use domains for more robustness, this can just do this.domain = domain.create() as middleware.

not sure how to handle indirectly piped streams though.

@jonathanong
Copy link
Member

might be able to follow the pipe chain and add error handlers with nodejs/node-v0.x-archive#7243

@tj
Copy link
Member

tj commented Mar 27, 2014

There's these "zone" things Bert is working on for core, which would work but we wouldn't really be able to provide the same error handling semantics as the rest of koa related stuff :( kind of sucks but oh well for now I guess

@jonathanong
Copy link
Member

thoughts on #255? i think this is the best solution for now

@juliangruber
Copy link
Contributor Author

that takes care of a different case, def. doesn't fix the a.pipe(b).pipe(c) problem

@jeromew
Copy link
Contributor

jeromew commented Apr 10, 2014

agreed with @juliangruber . just to add as another option I found the https://www.npmjs.org/package/barrage module that replaces the .pipe() function by .syphon() propagating errors.
Nothing new compared to previously mentioned solutions but using the word "syphon" instead of "pipe" makes it easier to explain that "pipe" is not error-friendly

@jonathanong
Copy link
Member

i'm now of the opinion that this isn't a use-case koa should handle. either 1) this stuff should go into the lib layer, not koa's http layer, and should be handled there or 2) each intermediary stream should be middleware.

ex. you don't need a gzip stream (koa-compress) or an object stream -> json stream (koa-json), which is what most of these are.

@juliangruber
Copy link
Contributor Author

good point to wrap streams in koa middleware, something like this:?

app.use(route.get('/posts.sse', function*(next){
  this.body = db.posts.createReadStream();
  yield next;
}));
app.use(route.get('/posts.sse', stringify()));
app.use(route.get('/posts.sse', sse()));

@jonathanong
Copy link
Member

oh no. you'll want to make it "upstream" middleware.

app.use(function* (next) {
  // koa-json already does this
  yield* next

  var body = this.body
  if (!body || !body._readableState || !body._readableState.objectMode) return; 

  this.body = this.body.pipe(JSONStream.stringify())
})

app.get('/posts.see', function* (next) {
  // convert an object stream to an sse object stream, i'm guessing
  yield* next

  this.body = sse(this.body)
})

// the logic
app.get('/posts.sse', function* (next) {
  this.body = db.posts.createReadStream()
})

@juliangruber
Copy link
Contributor Author

dat order is very confusing, what's the advantage over my gist?

@jonathanong
Copy link
Member

what gist? that's how most middleware work. doing this.body=; yield next is weird imo because the route is already handled. you shouldn't need to execute additional middleware.

@juliangruber
Copy link
Contributor Author

app.use(route.get('/posts.sse', function*(next){
  this.body = db.posts.createReadStream();
  yield next;
}));
app.use(route.get('/posts.sse', stringify()));
app.use(route.get('/posts.sse', sse()));

vs

app.use(route.get('/posts.sse', stringify()));
app.use(route.get('/posts.sse', sse()));
app.use(route.get('/posts.sse', function*(next){
  this.body = db.posts.createReadStream();
}));

@jonathanong
Copy link
Member

ya i basically just wrote out the second

@juliangruber
Copy link
Contributor Author

is the order here just preference, or are there technical benefits to one?

@jonathanong
Copy link
Member

preference i guess. this.body= and yield next should never be used in the same clause and generator imo

@juliangruber
Copy link
Contributor Author

cool, then I'm all good with the status quo. Plus, with co-{read,write} and multipipe there often isn't a reason to use .pipe() at all.

@rtm
Copy link

rtm commented May 14, 2015

I am using the archiver library to stream zips. In express world, I did

var archive = new Archiver('zip');
archiver.pipe(res);
...add files
archiver.finalize();

What is the simplest way to do this in the koa world?

@yoshuawuyts
Copy link
Contributor

@rtm probably wrap it in a promise and yield it.

edit: whoops, this is silly. Correct answer below.

@juliangruber
Copy link
Contributor Author

@rtm

var archiver = new Archiver('zip');
this.body = archiver;
// ...
archiver.finalize();

@rtm
Copy link

rtm commented May 15, 2015

@juliangruber Thanks, how obvious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants