# back-propagation of _.nil #172
Do you mean:

```js
if (err !== null) {
  push(err);
  next();
}
else if (x === _.nil) {
  resource.close();
  push(null, x);
}
else {
  push(null, x);
  next();
}
```

---
Yes, sorry for the typo. Here is an example showing the "bug":

```js
var _ = require('./lib/index.js')

_([1, 2, 3, 4, 5])
  .consume(function (err, x, push, next) {
    if (err !== null) {
      push(err);
      next();
    }
    else if (x === _.nil) {
      console.log('nil')
      push(null, x);
    }
    else {
      push(null, x);
      next();
    }
  })
  .take(2)
  .resume()
```

I would have expected 'nil' to be logged, but was surprised that it is not. The same example with the take(2) placed before the consume does log 'nil'.

---
I see, but when:

```js
var _ = require('./lib/index');

var s = _([1, 2, 3, 4, 5])
  .consume(function (err, x, push, next) {
    if (err !== null) {
      push(err);
      next();
    }
    else if (x === _.nil) {
      console.log('nil')
      push(null, x);
    }
    else {
      push(null, x);
      next();
    }
  });

s.take(2).toArray(function (xs) {
  console.log(xs)
  s.take(3).toArray(function (xs) {
    console.log(xs);
    s.resume();
  });
});
```

---
Should we even allow streams to be pulled from again? This behavior results in situations like:

```js
var s = _([1, 2, 3, 4, 5]);

s.fork().take(1).toArray(_.log);
// => prints [1]

s.fork().take(2).toArray(_.log);
// => prints nothing
```

The shared back pressure causes the second fork to never get the second element, since the first fork could be consumed later on.

---
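The stall can be reproduced without highland at all. Below is a minimal, self-contained simulation in plain Node (all names are illustrative, not highland's API) of shared back pressure: the source emits the next value only once every fork has asked for one, so a fork that stops pulling stalls its siblings.

```js
function sharedSource(values) {
  const forks = [];
  function tryEmit() {
    // Emit only when *every* fork has an outstanding pull request.
    if (forks.length > 0 && forks.every(f => f.pending)) {
      const v = values.shift();
      forks.forEach(f => {
        const cb = f.pending;
        f.pending = null;
        cb(v);
      });
    }
  }
  return {
    fork() {
      const f = {
        pending: null,
        pull(cb) { f.pending = cb; tryEmit(); }
      };
      forks.push(f);
      return f;
    }
  };
}

const src = sharedSource([1, 2, 3, 4, 5]);
const a = src.fork();
const b = src.fork();

const seenA = [];
const seenB = [];
a.pull(v => seenA.push(v));         // fork A wants one value
b.pull(v => {                       // fork B wants two values...
  seenB.push(v);
  b.pull(v2 => seenB.push(v2));     // ...but stalls here forever,
});                                 // because A never pulls again

console.log(seenA, seenB); // → [ 1 ] [ 1 ]
```

Fork B never receives the second element, just like the second `s.fork().take(2)` above.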
I've been defining all of my forks before consuming them, e.g.:

```js
var s = _([1, 2, 3, 4, 5]);
var sFork2 = s.fork();
sFork2.take(2).toArray(_.log);
```

But I'd be interested in hearing best practices for this case.

---
@ariutta Can you give a better example of what you're trying to do? What you seem to be asking doesn't have anything to do with functional vs. not functional. If you're just asking about whether you should assign your forks to a variable, I think the answer is "no, unless you need to reference your fork later on." Otherwise, just take advantage of the chaining API instead of creating superfluous variables that you never really need. I don't think I've ever run into a situation where I needed a reference to my fork later on.

---

@vqvu, your second paragraph answered my question. Thanks!

---
@vqvu, the way I currently use highland is: …

Ideally, each "transform" would have its own setup/teardown plus an awareness of the global graph it is inserted into. In my example above, I thought there could be a sort of back-propagation of the fact that the transforms will never be pulled from again, but that is very fuzzy now that I understand @LewisJEllis's point (and yours) that streams can be re-used (I was under the wrong impression that they had to be discarded when downstream decided everything was over). I'll let you close this issue, as it seems that my question is not a real issue.

---
I've been looking at using this library in a project and ran into this exact problem: there is no safe time to clean up a resource. An alternative to forcing streams to be one-time use only might be to take the approach bacon.js does. That is, have the user provide a function that allocates the resource and gets values, and a function to clean up the resource. Make it explicit that their "subscribe" function may be called multiple times (i.e. if a stream is stopped and then re-used, the "subscribe" function would be called again). Disclaimer: I haven't used bacon.js quite yet (just read some docs / source), as I wanted to try highland first.

---
I also ran into a case sort of like this recently and thought about it a bit, esp. the question "Should we even allow streams to be pulled from again?" I haven't managed to make a strong use case for the current way of allowing it - the example I gave above was just to make a point, and I can pull off the same thing using forks anyway if we disallow multiple pulls; this piece from my example above:

```js
s.take(2).toArray(function (xs) {
  console.log(xs)
  s.take(3).toArray(function (xs) {
    console.log(xs);
    s.resume();
  });
});
```

could become:

```js
s.fork().take(2).toArray(_.log)
// now we know the first fork is all done
s.fork().drop(2).take(3).toArray(_.log)
```

(assuming …)

Does anyone have a good example of where multiple consumptions/thunks on the same stream are useful/necessary? /cc @vqvu, @caolan

@jeromew I like the possibility that a stream should effectively "be discarded when downstream decide(s) everything (is) over", where "downstream decides everything is over" = "a thunk happens". In other words - can we safely add the restriction that every stream (or fork) only ever has one thunk? Then, once that thunk happens, we can back-prop _.nil.

Barring that possibility, the setup/teardown handler idea is a good next option, but it feels weird to me for some reason that I haven't quite put my finger on. See also: the relevant discussion in #161 on how …

---
I'm not a fan of allowing multiple thunks for a single stream, for the exact reasons mentioned in this issue; I've never had to do it myself. So I would support changing that behavior to allow for back-prop of _.nil.

---
Question about semantics. Will there be a way to differentiate between "values are no longer needed" and "this is the end of the stream, but continue to push values"? In the first case you'd want to clean up resources immediately, but in the second you'd probably want to wait. Let me know if an example use case is needed.

---
I don't understand your latter case... how can you continue to push values after the end of the stream?

---
In particular, I want to re-implement unix-sort using highland. The unix-sort library spawns the sort command and passes all the stream items to it. When it gets …, I guess in this case it could clean up in the … In pseudo-code it would look like this: …

---
There is an open PR on sort: #169. What benefit do you see in using unix-sort, memory-wise? In both cases, it seems to me that you need to have the whole thing in memory, just in case the last item becomes the first after sorting. But you are right about the "cancel" pattern. Right now …

---
@ibash Ok, I get what you're saying. There will need to be a way to differentiate between getting a … Perhaps what we need is an …

---
@vqvu you mean for … ?

```js
var filter = function (f, source) {
  return source.consume({
    onDone: function (push, next) {
      push(null, _.nil);
    },
    handle: function (err, x, push, next) {
      if (err) {
        // pass errors along the stream and consume next value
        push(err);
        next();
      }
      else {
        // pass on the value only if the value passes the predicate
        if (f(x)) {
          push(null, x);
        }
        next();
      }
    }
  });
};
```

My first thought was to use an additional special value (like …). Another option might be to make the idea of a cancel live inside the idea of a nil - replace … I think, out of these three options, I like the handlers the best, since consumes could have the option of just not passing the handlers if they don't have special behavior (and they'd default to something like …).

---
I'm not sure on the exact API yet. I like handlers better than special values too; using a special value may be complicated in implementation. I was thinking more along the lines of:

```js
stream.consume(...).onCancel(...);

// Unlike done, onDone just registers a handler. It doesn't cause a thunk.
stream.consume(...).onDone(...);
```

I like this for a few reasons: …

We could also go with your handlers-to-consume option and assume that an undefined …

---
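To make the registration semantics concrete, here is a minimal self-contained sketch in plain Node (`onDone`, `onCancel`, `_end`, and `_cancel` are all hypothetical names, not highland's real API): the chained calls only record handlers, and the stream later fires exactly one of the two sets.

```js
function makeStream() {
  const handlers = { done: [], cancel: [] };
  return {
    // Registration only; neither call causes a thunk.
    onDone(fn) { handlers.done.push(fn); return this; },
    onCancel(fn) { handlers.cancel.push(fn); return this; },
    // Internal trigger: the source emitted nil normally...
    _end() { handlers.done.forEach(fn => fn()); },
    // ...or downstream gave up early (e.g. a take(n) finished).
    _cancel() { handlers.cancel.forEach(fn => fn()); }
  };
}

const events = [];
const s = makeStream()
  .onDone(() => events.push('done'))
  .onCancel(() => events.push('cancel'));

s._cancel(); // simulate a downstream take(n) terminating early
console.log(events); // → [ 'cancel' ]
```

The point of the chaining style is visible here: registering handlers is separate from (and never triggers) consumption.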
@vqvu I dig that syntax. Still on the fence about having a separate error handler, though you can do that with the current api using …

---

At the very least, how about …

---

Or we could have …

---
Writing … It also won't cover the important "generator holding a resource" case. Plus, some transforms are implemented with …

---
After working with highland a bit more, I think I prefer the syntax recommended here: #172 (comment). I'm also wondering if the cancel is actually needed -- yes, you could waste a bit of work, but I'm not sure the complication of having separate handlers outweighs the benefits. If there was an …

---
The chaining syntax …

Why do you prefer the other syntax? I'm not sure about having separate handlers. The "clean up resources" and "fork" use cases that we have would work with just a unified … How about having a unified …

---

Or just don't bother passing a reason at all until someone comes up with a good use case for doing so... The minimalist in me probably prefers this more.

---
In my comment above, I linked to the chaining syntax -- I prefer it as well. Sorry for the misunderstanding.

---

Oh, sorry! My browser must have been misbehaving.

---
The case I'm always hitting is needing to rebind destroy to the last transform in a chain, so I can destroy the resource as well. Something like: …

---

@vqvu You are correct that we would have to resume to get the … Given the …

Say we had a stream that could have forks that are added and destroyed dynamically. How would one decide the difference between "this fork is done" and "this stream is done"?

---
@jgrund what do you use forks for? I'm curious as to whether they could be replaced with an observe...?

---

@ibash …

---

FWIW I am using them to present data differently in different parts of my application.

---
@jgrund you're right that you wouldn't be able to add forks dynamically. The stream would be destroyed once all forks are destroyed. The way I see highland being used is that you set up all of the transforms that you need, then call something that causes a thunk. Are you doing anything that really necessitates dynamic creation of forks after data has started flowing?

---
Also, one of the new restrictions in 3.0 will be that once you add a transform to a stream (not a fork), you cannot add another. More specifically, … This is precisely to solve a similar problem we had with an early-terminating transform like …

```js
var s = _([1, 2, 3]);

// Take one
s.take(1).toArray(function (a) {
  // Take two more. This second take works now,
  // but won't in 3.0.
  s.take(2);
});
```

---
After some thought, I guess you would need dynamic forking if your views were dynamically created, and you may want the shared back pressure to synchronize your views. This is probably a good enough case to allow something like dynamic forking. Perhaps a …

The streams are synchronized via shared back pressure like a fork, but …

Implementation-wise, the current 3.0 code already uses a similar kind of "fork manager", so this should be easy to do.

---
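The core of such a "fork manager" can be sketched in a few lines of plain Node (hypothetical: `broadcast` and `new` mirror names used later in this thread, but the implementation is purely illustrative): forks can be created at any time, and destroying the last live fork destroys the source.

```js
function broadcast(onSourceDestroy) {
  let live = 0;
  return {
    new() {
      live += 1;
      let dead = false;
      return {
        destroy() {
          if (dead) return;    // destroying a fork twice is a no-op
          dead = true;
          live -= 1;
          if (live === 0) onSourceDestroy(); // last fork gone
        }
      };
    }
  };
}

let destroyed = false;
const b = broadcast(() => { destroyed = true; });
const f1 = b.new();
const f2 = b.new();

f1.destroy();
console.log(destroyed); // → false: one fork is still alive
f2.destroy();
console.log(destroyed); // → true: the source is destroyed
```

This answers the "fork is done" vs. "stream is done" question by refcounting: a fork's destroy only ends the stream when it is the last one standing.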
@vqvu Nice, I like that approach. One thing I would add is that it would be nice if …

---
I'm hesitant to add specific functionality like that, since other people may not want it. One way to get what you want is to just implement it yourself. Something like this:

```js
var s = makeSource();

var latest = _.nil;
s.observe().consume(function (err, x, push, next) {
  if (!err) {
    latest = x;
  }
  if (x !== _.nil) {
    next();
  }
}).resume();

var broadcast = s.broadcast();

function newViewer() {
  var view = broadcast.new();
  return _(function (push, next) {
    if (latest !== _.nil) {
      push(null, latest);
    }
    next(view);
  });
}
```

---
Also, if you don't need the shared back pressure, using observers will also work. Observers will never destroy their source. This fits with the current behavior, since they don't …

---
Is there any consensus on what this api should look like?

---
I like:

```js
s.onDestroy(function () { /* Do destroy stuff */ })
  .map(...)
  ...;
```

Called after … I think we should go this way unless there is a significant objection (no comment == implicit approval).

---
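A small self-contained sketch of the agreed semantics, in plain Node with illustrative names only (not highland's implementation): the destroy handler fires exactly once, whether the stream is consumed fully or cut short by an early-terminating transform like take(n).

```js
function sourceWithCleanup(values, onDestroy) {
  let destroyed = false;
  function destroy() {
    if (!destroyed) { destroyed = true; onDestroy(); }
  }
  return {
    // Early-terminating transform: triggers cleanup even
    // though not everything was consumed.
    take(n) {
      const out = values.slice(0, n);
      destroy();
      return out;
    },
    // Normal full consumption: same cleanup path.
    toArray() {
      const out = values.slice();
      destroy();
      return out;
    }
  };
}

let closed = 0;
const s = sourceWithCleanup([1, 2, 3, 4, 5], () => { closed += 1; });
console.log(s.take(2)); // → [ 1, 2 ]
console.log(closed);    // → 1: the resource was closed even though
                        //   only part of the stream was consumed
```

This is exactly the property the original report asks for: upstream learns it will never be pulled from again, without downstream holding a reference to the resource.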
Sounds good to me.

---

+1

---
Hello,

I have been trying the following scenario: I open a resource, consume from it, and wait for _.nil to close the resource. This works well if I consume all the tokens with …

Now if I do …

The resource is never closed, because take seems to send _.nil downstream but does not inform upstream that it will not be pulled from again. You might say that I could close the resource by catching the _.nil after take(10), but this is not what I want to do, because I do not want the downstream code to have a reference to the resource.

Would it make sense to have back-propagation of _.nil, or a mechanism to inform upstream streams that they will not be pulled from again? In node.js streams, when you do …, when s2 sends a 'close' event, s1 automatically unpipes s2, for example, so you have a way to detect that nothing will be pulled again.

https://github.com/joyent/node/blob/master/lib/_stream_readable.js#L568
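The failure mode described here can be shown with a tiny pull-based model in plain Node (no highland; all names are illustrative): the downstream take simply stops pulling, so the source never reaches its end-of-stream branch and never closes the resource.

```js
function makeSource(values, resource) {
  let i = 0;
  return function pull() {
    if (i >= values.length) {
      resource.close(); // only reached if someone pulls past the end
      return null;      // stands in for _.nil
    }
    return values[i++];
  };
}

function take(pull, n) {
  const out = [];
  for (let k = 0; k < n; k += 1) {
    const v = pull();
    if (v === null) break;
    out.push(v);
  }
  return out; // stops pulling here; upstream is never informed
}

const resource = { open: true, close() { this.open = false; } };
const pull = makeSource([1, 2, 3, 4, 5], resource);

console.log(take(pull, 2)); // → [ 1, 2 ]
console.log(resource.open); // → true: the resource leaks
```

Back-propagation of _.nil (or the onDestroy handler discussed above) is precisely a channel for `take` to tell `makeSource` "you will never be pulled from again" so it can run `resource.close()`.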