Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sink the stream, this avoids highWaterMark being hit - add test #126

Merged
merged 1 commit into from
Dec 16, 2015

Conversation

phated
Copy link
Member

@phated phated commented Nov 5, 2015

@contra I think I solved #120 with this. Can you take a look? I made the test and it was failing and then I made it pass. @davidchase could you also take a look at this?

Closes #120


function complete(err, file) {
cb(err, file);
if (!err) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems very similar to my proposition in #120 but can you shed some light for me how the error propagates or not in this situation?

is dealing with the stream being done or rather flushed as I proposed not ideal ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the return value from push isn't a done flag (or that's what I think I found in all my digging). I tried implementing your suggestion in #120 but it didn't seem to work. The cb is being called on the line above with the err and file objects. If an error occurs and is passed to that, the error event should be emitted on the stream and the pipeline torn down. I check to see if there wasn't an error before reading because I don't know if reading from a torn down stream blows up or not.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm based on my understanding, on http://bit.ly/1kyt00f it says that readable will return a boolean which will say if more data should be pushed into the buffer which is controlled by the highWaterMark http://bit.ly/1kytDGX then as it hits false in the case of theres no piping happening we can then stream.read to pull chunks again from the buffer http://bit.ly/1kytTpp. When piping from a destination stream.read never gets called because the .pipe handles the draining of the readable part http://bit.ly/1kyvnjr

I've tried my way with a pipe from the dest and not piping essentially finishing the write part and it works..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidchase you are correct that the flag means more data can be pushed; however, that has no bearing on if we can read more. As soon as we are process the file object and pass it on (calling the callback with err & file), we are able to read another and process that. As far as I am aware, backpressure is handled by the reading stream and because we weren't reading or resuming, the beginning of our stream was clogging the pipeline. As you stated, the piping of dest to somewhere else will drain it, so the read will essentially nothing in that case (I think).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

; however, that has no bearing on if we can read more

Ahh i guess I misunderstood the following:

_read should start pushing that data into the read queue by calling this.push(dataChunk). _read should continue reading from the resource and pushing data until push returns false, at which point it should stop reading from the resource...

https://nodejs.org/api/stream.html#stream_readable_read_size_1

so the read will essentially nothing in that case (I think).

yeah looks like it returns null when pipe from dest

thanks for clarifying

edit
based on the above and since stream.push(chunk) + cb() === cb(null, chunk) would you say this is fair?

   function complete(err, file) {
       var keepReading = stream.push(file);
       if (keepReading) {
           stream.read();
       }
       cb(err);
   }

since keepingReading will be true we can continue reading from the resource

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidchase After consulting the stream docs that seems like it should work. We should make sure our stream implementation is in line with the latest docs

@yocontra
Copy link
Member

yocontra commented Nov 8, 2015

@phated ping me when you need a look

@phated
Copy link
Member Author

phated commented Nov 9, 2015

@contra can you review the comments between @davidchase and I and help us figure out the correct solution here? My stream-fu seems to be weak.

@yocontra
Copy link
Member

yocontra commented Nov 9, 2015

Just commented. Also: can we avoid the cruft of having a ton of fixture files and make those dummy files in code? This will also make it easier to test different #s later on down the road

@davidchase
Copy link

any luck with this one?

@phated
Copy link
Member Author

phated commented Nov 24, 2015

I reached out to @chrisdickinson and @nrn on IRC but I don't think either was available to give any guidance.

@nrn
Copy link

nrn commented Nov 24, 2015

Sorry, checked it out but I haven't dealt with the subtleties around this enough to have any insight. Good luck!

@phated phated force-pushed the read-next-file branch 2 times, most recently from b87d645 to 8435bbe Compare December 10, 2015 23:00
@phated
Copy link
Member Author

phated commented Dec 10, 2015

@davidchase it seems your suggestion passes the test I added so I switched to that. Also updated as per @contra's suggestions and rebased.

Let me know what you think

@phated
Copy link
Member Author

phated commented Dec 10, 2015

@davidchase can you think of a test for when stream.push() will return false? I'd like to add that to the suite if we can come up with one.

@davidchase
Copy link

@phated i think the only time stream.push would return false is in the case when we would be using the internal buffer which maxes out at the highWaterMark (16 files/objects) then it cant push anymore data and it returns false.

But if we are using stream.read() we are effectively bypassing the internal buffer altogether therefore i dont think stream.push will ever return false...

Im also wondering if bypassing the buffering is ideal??

@phated
Copy link
Member Author

phated commented Dec 14, 2015

@davidchase I thought this.read() would pull from the previous streams' buffer or queue up a read if there is nothing in it, not empty our buffer. And I thought this.push() pushed things onto our buffer (that the next stream pulls from) and returned false if we were at the max (however, I couldn't figure out how to write a test that showed that part).

@phated
Copy link
Member Author

phated commented Dec 14, 2015

k, went through this with chris and he explained what is happening in the current implementation and the best way to fix this. I will update this PR and try to add another test soon.

@davidchase
Copy link

Sounds good 👍

Im curious to see how it looks after you update the PR

@phated phated force-pushed the read-next-file branch 2 times, most recently from 9ab7372 to 6ea0b79 Compare December 16, 2015 00:07
@phated
Copy link
Member Author

phated commented Dec 16, 2015

@contra @davidchase I've updated with @chrisdickinson's suggestion and added another test. Can you take a look again.

@davidchase
Copy link

@phated interesting can you explain a little bit of whats happening... are you taking the through stream and just piping it a sink that just calls the cb ?

this seems very similar to the stream-exhaust approach, is that a correct assumption ?

@yocontra
Copy link
Member

Review, LGTM - thanks @phated 🙏

@phated
Copy link
Member Author

phated commented Dec 16, 2015

@davidchase I believe a pipeline must end in a writeable stream to function correctly, so this was the suggestion from @chrisdickinson. I asked how it was different than just calling .resume() to which he replied

it should be a bit more durable than .resume in that it will count as one of the downstream pipes, so even if another dest stream is unpiped the saveStream will continue flowing

and then further on in the discussion, he said the following, which I believe sums up the change nicely:

to be on the safe side I'd probably just pipe saveStream to an infinitely fast writable sink — that way it will only go as fast as the slowest stream it is piped to, but if it is unpiped it will continue to function

You are correct that it is similar to the stream-exhaust approach (I even asked chris about that) and I considered using it as a dependency, but since we are in control of the streams, we shouldn't need the fallback to the .resume call. The stream-exhaust module is still used deep inside undertaker incase people return a stream that ends with a transform stream that isn't gulp.dest but this change will make vinyl-fs stream correctly when not inside a gulp task.

@phated
Copy link
Member Author

phated commented Dec 16, 2015

:shipit:

phated added a commit that referenced this pull request Dec 16, 2015
read from the stream once we finish processing a file, this avoids highWaterMark being hit - add test
@phated phated merged commit c0dfd14 into master Dec 16, 2015
@phated phated deleted the read-next-file branch December 16, 2015 19:18
@asgoth
Copy link

asgoth commented Jan 7, 2016

Any idea when this fix will be in a new version of vinyl-fs?

@phated
Copy link
Member Author

phated commented Jan 7, 2016

Soon. I'm dealing with some personal stuff right now but hoping to dive back in soon.

@phated phated changed the title read from the stream once we finish processing a file, this avoids highWaterMark being hit - add test sink the stream, this avoids highWaterMark being hit - add test Jan 27, 2016
phated added a commit that referenced this pull request Dec 5, 2017
read from the stream once we finish processing a file, this avoids highWaterMark being hit - add test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants