Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor merge to allow late discovery of sources #133

Merged
merged 1 commit into from
Jan 5, 2015

Conversation

jeromew
Copy link
Collaborator

@jeromew jeromew commented Oct 1, 2014

@caolan here is a PR for a refactor of merge that passes all tests but avoids the toArray pre-buffering of sources. Tell me what you think.

this is related to issue #132

@jeromew
Copy link
Collaborator Author

jeromew commented Oct 6, 2014

@caolan just a ping on my lazy merge PR I know you are very busy outside of highlandjs but I would like your validation that there is a chance for it to be accepted upstream before relying on it. Thanks !

@caolan
Copy link
Owner

caolan commented Oct 6, 2014

@jeromew yes, the code looks good to me. I'm going to run it locally for a bit then cut a new release. Good work :)

@jeromew
Copy link
Collaborator Author

jeromew commented Oct 6, 2014

@caolan just to make sure you got the correct version of the PR I just made a modification to it (did not think you would get to it this early) because I had a timing bug with a specific stream :

else if (x === _.nil) {
                more_sources = false;
                _push(null, x);
                safeNext(); <= added a next() here otherwise there was a case when the stream of stream ended after the last pull
            }

@apaleslimghost
Copy link
Collaborator

This hit the 0.11 build failure, I've rebuilt it.

@apaleslimghost
Copy link
Collaborator

There appears to be a commented-out test for this behaviour on master, does it pass in this branch?

Is it worth adding the noValueOnError test?

@jeromew
Copy link
Collaborator Author

jeromew commented Nov 24, 2014

I'll check this

@jeromew
Copy link
Collaborator Author

jeromew commented Nov 25, 2014

@caolan I am reviewing my implementation of merge with late discovery of sources. All tests pass but their is something in your original implementation that flowed into mine that I would like to make sure I understand before having others review it.

you created a safeNext method probably to call next "safely".

I understand that the way it is implemented, it is a way to make sure that all "synchronous" sources that may want to send a token should be able to do so before next is called. In that sense, it is could be called fairNext

Initially I thought that "safeNext" was a way to guarantee that next is called only one time for each call to the consume callback, but it seems to me that next could be called many times if several async sources are delayed with a reference to safeNext. In that sense, it does not seem "next is only called one time safe"

I have a hard time telling whether it is safe or not to call "next" several times for one call to the consume callback so I can't figure out which "safe" we are talking about and if I can rename safeNext to fairNext

could you help me understand this ?

@jeromew
Copy link
Collaborator Author

jeromew commented Dec 5, 2014

@caolan there is another way to implement merge by using pull instead of the whole .on('...') thing.
The basic idea can be seen in this commit - vqvu@1d074f8#diff-6d186b954a58d5bb740f73d84fe39073L2817

(only look at merge). do you agree that it feels cleaner than the on('...') approach ? I have a working implementation of this way of doing things + late discovery of sources

@jeromew jeromew mentioned this pull request Jan 2, 2015
34 tasks
@jeromew jeromew force-pushed the merge branch 3 times, most recently from ad7833b to 6a08e85 Compare January 3, 2015 21:06
@jeromew
Copy link
Collaborator Author

jeromew commented Jan 3, 2015

@vqvu I went through several version of PR. I ended up taking your algorithm from the 3.0.0 branch and adapting it to allow for late discovery of sources.

I would appreciate if you could take some time to review it ; we could merge this into the 2.x branch and remove it from the patchset of 3.0.0

@vqvu
Copy link
Collaborator

vqvu commented Jan 4, 2015

I don't think it's necessary to look at _incoming.length. We should be able to implement all transforms (besides the core consume) without having to look at private variables.

I also don't think it's correct for two reasons:

  1. consume will use the _outgoing queue in certain circumstances.
  2. A stream constructed with a generator function can generate sources synchronously, but not until the generator is called, so looking in _incoming doesn't work.

An example of (2) is

_(function (push, next) {
    push(null, _([1, 2, 3]));
    push(null, _([3, 4, 5]));
    push(null, _([6, 7, 8]));
    push(null, _([9, 10, 11]));
    push(null, _([12, 13, 14]));
    push(null, _.nil);
}).merge().toArray(_.log);

// Should be
// => [ 1, 3, 6, 9, 12, 2, 4, 7, 10, 13, 3, 5, 8, 11, 14 ]
// is
// => [ 1, 3, 2, 6, 4, 3, 9, 7, 5, 12, 10, 8, 13, 11, 14 ]

I think the solution is to iteratively call pull until there is async behavior, then go into the main merge loop. See getSourcesSync in https://gist.github.com/vqvu/a7838e456783432a2e45.

@jeromew
Copy link
Collaborator Author

jeromew commented Jan 5, 2015

@vqvu thanks for the review. I agree that using the internal variables was not a good idea (and I was missing the subtlety of _incoming/_outgoing in the case you mentionned).

I squashed the PR and added :

  • the new test that you mentionned (generation of sync sources via a generator)
  • your algo from the gist

I simply moved the called to 'next()' from the pullFromAllSources function to the main generator loop. It looked more readable to me.

@vqvu
Copy link
Collaborator

vqvu commented Jan 5, 2015

Awesome! Looks good to me.

@vqvu
Copy link
Collaborator

vqvu commented Jan 5, 2015

Actually, I forgot something. There is a test on 3.0.0 called "pass through errors (issue #141)" that you should copy over too.

@jeromew
Copy link
Collaborator Author

jeromew commented Jan 5, 2015

@vqvu I backported the test from 3.0.0 and squashed the PR.

jeromew added a commit that referenced this pull request Jan 5, 2015
refactor merge to allow late discovery of sources
@jeromew jeromew merged commit 6f9bb4c into caolan:master Jan 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants