refactor merge to allow late discovery of sources #133

jeromew · 2014-10-01T20:38:35Z

@caolan here is a PR for a refactor of merge that passes all tests but avoids the toArray pre-buffering of sources. Tell me what you think.

this is related to issue #132

jeromew · 2014-10-06T09:02:17Z

@caolan just a ping on my lazy merge PR. I know you are very busy outside of highlandjs, but I would like your validation that there is a chance for it to be accepted upstream before relying on it. Thanks!

caolan · 2014-10-06T09:48:43Z

@jeromew yes, the code looks good to me. I'm going to run it locally for a bit then cut a new release. Good work :)

jeromew · 2014-10-06T10:09:21Z

@caolan just to make sure you got the correct version of the PR: I just made a modification to it (I did not think you would get to it this early) because I had a timing bug with a specific stream:

            else if (x === _.nil) {
                more_sources = false;
                _push(null, x);
                // added a next() here; otherwise there was a case where
                // the stream of streams ended after the last pull
                safeNext();
            }

apaleslimghost · 2014-11-22T08:58:51Z

This hit the 0.11 build failure, I've rebuilt it.

apaleslimghost · 2014-11-24T15:13:38Z

There appears to be a commented-out test for this behaviour on master, does it pass in this branch?

Is it worth adding the noValueOnError test?

jeromew · 2014-11-24T15:37:01Z

I'll check this

jeromew · 2014-11-25T14:35:17Z

@caolan I am reviewing my implementation of merge with late discovery of sources. All tests pass, but there is something in your original implementation that flowed into mine that I would like to make sure I understand before having others review it.

you created a safeNext method probably to call next "safely".

I understand that, the way it is implemented, it makes sure that all "synchronous" sources that may want to send a token are able to do so before next is called. In that sense, it could be called fairNext.

Initially I thought that "safeNext" was a way to guarantee that next is called only one time for each call to the consume callback, but it seems to me that next could be called many times if several async sources are delayed with a reference to safeNext. In that sense, it does not seem "next is only called one time safe"

I have a hard time telling whether it is safe or not to call "next" several times for one call to the consume callback so I can't figure out which "safe" we are talking about and if I can rename safeNext to fairNext

could you help me understand this ?
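
For contrast, a guard that really does enforce "next is called at most one time" could be sketched as below. This is a generic illustration only: `once` and `guardedNext` are hypothetical names, `next` here is a stand-in counter rather than Highland's real `next`, and nothing in this thread says safeNext is implemented this way.

```javascript
// Hypothetical helper, not part of Highland: wraps a function so that only
// the first call has any effect.
function once(fn) {
    var called = false;
    return function () {
        if (called) {
            return;
        }
        called = true;
        return fn.apply(this, arguments);
    };
}

// Stand-in for the `next` handed to a consume callback; it only counts calls.
var calls = 0;
function next() {
    calls += 1;
}

var guardedNext = once(next);
guardedNext();
guardedNext(); // ignored: the guard has already fired
```

With such a guard, several delayed sources holding a reference to `guardedNext` could not trigger more than one `next`, which is a different property from giving every synchronous source a turn before `next` runs.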

jeromew · 2014-12-05T16:33:22Z

@caolan there is another way to implement merge by using pull instead of the whole .on('...') thing.
The basic idea can be seen in this commit - vqvu@1d074f8#diff-6d186b954a58d5bb740f73d84fe39073L2817

(only look at merge). do you agree that it feels cleaner than the on('...') approach ? I have a working implementation of this way of doing things + late discovery of sources

jeromew · 2015-01-03T21:11:31Z

@vqvu I went through several versions of this PR. I ended up taking your algorithm from the 3.0.0 branch and adapting it to allow for late discovery of sources.

I would appreciate if you could take some time to review it ; we could merge this into the 2.x branch and remove it from the patchset of 3.0.0

vqvu · 2015-01-04T03:51:11Z

I don't think it's necessary to look at _incoming.length. We should be able to implement all transforms (besides the core consume) without having to look at private variables.

I also don't think it's correct, for two reasons:

1. consume will use the _outgoing queue in certain circumstances.
2. A stream constructed with a generator function can generate sources synchronously, but not until the generator is called, so looking in _incoming doesn't work.

An example of (2) is

_(function (push, next) {
    push(null, _([1, 2, 3]));
    push(null, _([3, 4, 5]));
    push(null, _([6, 7, 8]));
    push(null, _([9, 10, 11]));
    push(null, _([12, 13, 14]));
    push(null, _.nil);
}).merge().toArray(_.log);

// Should be
// => [ 1, 3, 6, 9, 12, 2, 4, 7, 10, 13, 3, 5, 8, 11, 14 ]
// is
// => [ 1, 3, 2, 6, 4, 3, 9, 7, 5, 12, 10, 8, 13, 11, 14 ]
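
The "should be" order is a plain round-robin: one value from each source per round, in the order the sources were pushed. A minimal sketch that reproduces it, with arrays standing in for the source streams (`roundRobin` is a hypothetical helper for illustration, not Highland's merge):

```javascript
// Hypothetical model: each array plays the role of one source stream that
// is already known when merging starts.
function roundRobin(sources) {
    // Copy the inputs and ignore sources that are empty from the start.
    var queues = sources
        .map(function (s) { return s.slice(); })
        .filter(function (q) { return q.length > 0; });
    var out = [];
    while (queues.length > 0) {
        // One round: take a single value from every live source.
        queues.forEach(function (q) { out.push(q.shift()); });
        // Drop sources that ran out during this round.
        queues = queues.filter(function (q) { return q.length > 0; });
    }
    return out;
}

var merged = roundRobin([
    [1, 2, 3],
    [3, 4, 5],
    [6, 7, 8],
    [9, 10, 11],
    [12, 13, 14]
]);
// merged is [ 1, 3, 6, 9, 12, 2, 4, 7, 10, 13, 3, 5, 8, 11, 14 ]
```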

I think the solution is to iteratively call pull until there is async behavior, then go into the main merge loop. See getSourcesSync in https://gist.github.com/vqvu/a7838e456783432a2e45.
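
The gist itself is not reproduced here. As a rough sketch of the general idea (keep pulling while the callback answers before `pull` returns, and stop at the first pull that answers later), assuming a toy stream object with a `pull(cb)` method: `NIL`, `makeSourceStream`, and `collectSyncSources` are all hypothetical names, and errors are simply skipped.

```javascript
var NIL = {}; // end-of-stream marker, playing the role of _.nil

// Hypothetical stand-in for a stream of sources. Items before index
// `asyncFrom` are delivered synchronously, later ones on a later tick.
function makeSourceStream(items, asyncFrom) {
    var i = 0;
    return {
        pull: function (cb) {
            var index = i++;
            var item = index < items.length ? items[index] : NIL;
            if (index >= asyncFrom) {
                setTimeout(function () { cb(null, item); }, 0);
            } else {
                cb(null, item);
            }
        }
    };
}

// Pull repeatedly while the callback fires synchronously. Anything that
// arrives after this function has returned is handed to `onLate`.
function collectSyncSources(stream, onLate) {
    var sources = [];
    var ended = false;
    var syncPhase = true;
    var answered;

    function handle(err, x) {
        if (!syncPhase) {
            onLate(err, x);
            return;
        }
        answered = true;
        if (err) {
            return; // error handling is out of scope for this sketch
        }
        if (x === NIL) {
            ended = true;
        } else {
            sources.push(x);
        }
    }

    do {
        answered = false;
        stream.pull(handle);
    } while (answered && !ended);

    syncPhase = false;
    return { sources: sources, ended: ended };
}

var late = [];
var stream = makeSourceStream(['a', 'b', 'c', 'd'], 3);
var result = collectSyncSources(stream, function (err, x) { late.push(x); });
// result.sources is ['a', 'b', 'c'] and result.ended is false at this point;
// 'd' reaches the late handler on a later tick.
```

A merge built on this would start its main loop with `result.sources` already known, so the first round can visit every synchronously available source instead of discovering one source per round.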

jeromew · 2015-01-05T10:09:04Z

@vqvu thanks for the review. I agree that using the internal variables was not a good idea (and I was missing the subtlety of _incoming/_outgoing in the case you mentioned).

I squashed the PR and added:

- the new test that you mentioned (generation of sync sources via a generator)
- your algo from the gist

I simply moved the call to 'next()' from the pullFromAllSources function to the main generator loop. It looked more readable to me.

vqvu · 2015-01-05T12:33:15Z

Awesome! Looks good to me.

vqvu · 2015-01-05T12:56:37Z

Actually, I forgot something. There is a test on 3.0.0 called "pass through errors (issue #141)" that you should copy over too.

jeromew · 2015-01-05T13:33:06Z

@vqvu I backported the test from 3.0.0 and squashed the PR.

refactor merge to allow late discovery of sources

jeromew force-pushed the merge branch from c83be7b to 645d6e2 Compare October 6, 2014 10:03

jeromew mentioned this pull request Nov 29, 2014

Fix consume and redirect. #175

Closed

jeromew mentioned this pull request Jan 2, 2015

Highland v3.0.0 #179

Open

34 tasks

jeromew force-pushed the merge branch 3 times, most recently from ad7833b to 6a08e85 Compare January 3, 2015 21:06

jeromew force-pushed the merge branch from 6a08e85 to fa38b34 Compare January 5, 2015 09:53

refactor merge to allow late discovery of sources

4b6905c

jeromew force-pushed the merge branch from fa38b34 to 4b6905c Compare January 5, 2015 13:29

jeromew added a commit that referenced this pull request Jan 5, 2015

Merge pull request #133 from jeromew/merge

6f9bb4c

refactor merge to allow late discovery of sources

jeromew merged commit 6f9bb4c into caolan:master Jan 5, 2015

vqvu mentioned this pull request Jan 5, 2015

Merge doesn't seem to pass errors #141

Closed

LewisJEllis mentioned this pull request Jan 8, 2015

Problem with the new merge strategy #132

Closed
