Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: suppress data event if req aborted #13260

Closed

Conversation

yhwang
Copy link
Member

@yhwang yhwang commented May 28, 2017

Re-enable test-http-abort-stream-end and put it into parallel
category. Use system random port when calling server.listen()
and fix eslint errors.

Since request.abort() turns on the '_dumped' flag in the
IncomingMessage, Readable.prototype.read() needs to check
the '_dumped' flag before calling the emit('data').

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

test, streams

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label May 28, 2017
@mscdex mscdex added the http Issues or PRs related to the http subsystem. label May 28, 2017
@Trott
Copy link
Member

Trott commented May 28, 2017

@Trott
Copy link
Member

Trott commented May 28, 2017

Failed on Linux-One: https://ci.nodejs.org/job/node-test-commit-linuxone/6210/nodes=rhel72-s390x/console

not ok 461 parallel/test-http-abort-stream-end
  ---
  duration_ms: 0.108
  severity: fail
  stack: |-
    assert.js:60
      throw new errors.AssertionError({
      ^
    
    AssertionError [ERR_ASSERTION]: got data after abort
        at IncomingMessage.<anonymous> (/data/iojs/build/workspace/node-test-commit-linuxone/nodes/rhel72-s390x/test/parallel/test-http-abort-stream-end.js:46:7)
        at emitOne (events.js:115:13)
        at IncomingMessage.emit (events.js:210:7)
        at IncomingMessage.Readable.read (_stream_readable.js:462:10)
        at flow (_stream_readable.js:833:34)
        at resume_ (_stream_readable.js:815:3)
        at _combinedTickCallback (internal/process/next_tick.js:102:11)
        at process._tickCallback (internal/process/next_tick.js:161:9)

It may be challenging to get this to work on all platforms, but who knows, let's see how the rest of the CI results are when it's done running....

@Trott
Copy link
Member

Trott commented May 28, 2017

Lots of failures on Linux. The pattern seems to be that it passes on older variants (CentOS 5, Fedora 22, Ubuntu 12-04) but fails on newer variants (CentOS 6 and 7; Fedora 23, 24, and 25; Ubuntu 14-04 and 16-04).

@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

@Trott Thanks for the investigation. My next step is to use Ubuntu 14-04 or Ubuntu 16-4 to recreate the failure. Or do you have any suggestion?

@Trott
Copy link
Member

Trott commented May 28, 2017

My next step is to use Ubuntu 14-04 or Ubuntu 16-4 to recreate the failure. Or do you have any suggestion?

If you're able to do that, then that seems like a good way to go.

/cc @nodejs/testing in case anyone else has additional suggestions or ideas about what the issue is.

@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

@Trott I can only use docker version at this moment. I don't know if it could recreate the failure. However, that's the only thing I can try now (it's holiday and the lab is shutdown for power maintenance
)

@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

I am able to recreate the failure in ubuntu 16.04 docker env. However, need time to trace the IncomingMessage/Stream/Readable. Currently, in ubuntu16.04, the ReadableState for IncomingMessage is:

ReadableState {
  objectMode: false,
  highWaterMark: 16384,
  buffer: 
   BufferList {
     head: { data: <Buffer 78 32 38 35>, next: [Object] },
     tail: { data: <Buffer 78 35 31 31>, next: null },
     length: 227 },
  length: 908,
  pipes: null,
  pipesCount: 0,
  flowing: true,
  ended: false,
  endEmitted: false,
  reading: true,
  sync: false,
  needReadable: true,
  emittedReadable: false,
  readableListening: false,
  resumeScheduled: false,
  destroyed: false,
  defaultEncoding: 'utf8',
  awaitDrain: 0,
  readingMore: false,
  decoder: null,
  encoding: null }

In OSX, the ReadableState is:

ReadableState {
  objectMode: false,
  highWaterMark: 16384,
  buffer: BufferList { head: null, tail: null, length: 0 },
  length: 0,
  pipes: null,
  pipesCount: 0,
  flowing: true,
  ended: false,
  endEmitted: false,
  reading: false,
  sync: false,
  needReadable: true,
  emittedReadable: false,
  readableListening: false,
  resumeScheduled: false,
  destroyed: false,
  defaultEncoding: 'utf8',
  awaitDrain: 0,
  readingMore: true,
  decoder: null,
  encoding: null }

Need time to figure out why they are different.

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from dc81531 to 24fd5ed Compare May 28, 2017 10:15
@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

@Trott the problem is in some platforms, the data are received after calling request.abort(). Therefore, I added extra checking in Readable.prototype.read to suppress the 'data' event. And all test cases passed in my ubuntu16.04 docker env now.

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from 24fd5ed to 0897a46 Compare May 28, 2017 10:44
@gibfahn
Copy link
Member

gibfahn commented May 28, 2017

CI 2: https://ci.nodejs.org/job/node-test-commit/10206/

EDIT: CI is green

@@ -458,6 +458,9 @@ Readable.prototype.read = function(n) {
endReadable(this);
}

//drop data if _dumped flag exists
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

-  //drop data if _dumped flag exists
+  // Drop data if _dumped flag exists.

@gibfahn
Copy link
Member

gibfahn commented May 28, 2017

So at the moment your Git author name and email address are set to:

yhwang <yh.wang@ibm.com>

People usually choose to use their full names for commits. To set your name globally you can do:

git config --global user.name "Yihong Wang"

To change the author for a single commit you can do:

git commit --amend --author="Yihong Wang <yh.wang@ibm.com>"
git push --force-with-lease

If you don't want to have your name against this commit, that's fine too.

@gibfahn
Copy link
Member

gibfahn commented May 28, 2017

cc/ @nodejs/streams

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from 0897a46 to 377817d Compare May 28, 2017 14:55
@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

Thanks @gibfahn I updated the commit and also my name.

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from 377817d to 3714927 Compare May 28, 2017 14:58
@Trott
Copy link
Member

Trott commented May 28, 2017

Nice! Since this has evolved from a test fix to an actual fix for http/streams on some platforms, the commit title (and PR title) should probably be updated to something like: http: suppress data event if req aborted

@yhwang yhwang changed the title test: enable test-http-abort-stream-end http: suppress data event if req aborted May 28, 2017
@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from 3714927 to 5e0bdb7 Compare May 28, 2017 15:10
@@ -458,6 +458,9 @@ Readable.prototype.read = function(n) {
endReadable(this);
}

// Drop data if _dumped flag exists
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the _dumped flag only exists in http, maybe mention that in the comment too so that it's clear that this bit of code is explicitly for http requests and what _dumped means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified the comment and saying:

  // Drop data if '_dumped' flag exists. This is for IncomingMessage.
  // When ClientRequest.abort() is called, the '_dumped' flag of IncomingMessage
  // is set to 'true'.  Then ClientRequest shouldn't receive any
  // 'data' event.

@Trott
Copy link
Member

Trott commented May 28, 2017

/cc @nodejs/streams @nodejs/http

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from 5e0bdb7 to 3077166 Compare May 28, 2017 15:18
@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

@Trott thanks for the comments. I updated the commit/pr title and revised the comments in code.

@@ -458,6 +458,12 @@ Readable.prototype.read = function(n) {
endReadable(this);
}

// Drop data if '_dumped' flag exists. This is for IncomingMessage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be nit pick here, but maybe http.IncomingMessage just to make it clear that it's http-specific?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! and I updated the comments. :-)

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch 2 times, most recently from 5a348e1 to af1ecc5 Compare May 28, 2017 15:29
// When http.ClientRequest.abort() is called, the '_dumped' flag of
// http.IncomingMessage is set to 'true'. Then http.ClientRequest
// shouldn't receive any 'data' event.
if (this._dumped === true) ret = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really -1 with this change, as it introduces more coupling between http and streams. Can you move this in http.IncomingMessage? If it is not possible, we should think about adding some public facing API to support this usecase. Where is this._dumped written?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_dumped is written from here: lib/_http_incoming.js and it's also been used here: lib/_http_common.js. I am also curious about when this flag: _dumped is introduced. Could this flag be replaced by any of the flag in ReadableState? or move this flag into the ReadableState?

@mscdex
Copy link
Contributor

mscdex commented May 28, 2017

If we really need this, couldn't abort() just monkey-patch .emit() to filter 'data' events? We'd need to see what performance impact the dynamic monkey patching would have though (perhaps compared to overriding emit() on the prototype and filtering in there depending on a flag being set)...

Both of those solutions would avoid the need to make changes to stream.Readable.

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch 2 times, most recently from 04a6791 to b62e37a Compare May 28, 2017 21:20
@yhwang
Copy link
Member Author

yhwang commented May 28, 2017

@mcollina Based on your suggestion, I modified my change as following:

  • add an internal API in ReadableState: ·flush(). It flushes the buffered data and set the data length to 0
  • inside http.IncomingMessage._dump(), use the new internal API and avoid the unnecessary 'data' event

@mscdex This also reduce the efforts to handle the buffered data. No need to monkey patching the .emit() in http.IncomingMessage.
Any comments/suggestions?

@@ -315,6 +315,8 @@ IncomingMessage.prototype._dump = function _dump() {
if (!this._dumped) {
this._dumped = true;
this.resume();
// Flush the existing buffered data
this._readableState._flush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this will go over well either since there is a recent movement to avoid reaching directly into _*State stream properties from outside streams.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. However, inside the http.IncomingMessage, it still access _*State in a few spots. Maybe the movement could do them all together later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok if you move the code you put in _flush() here, if it is needed help solve a bug.

At the same time, it's better to open a feature request for a new method in the stream API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the general idea is to at least avoid incorporating new direct uses of _*State. Kicking the can down the road doesn't seem like a good idea IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex I do not want to rush adding a new method to an already complex API to fix a bug either. I'm ok in adding this, just not as a bug fix, as it will need to be reasoned upon for a bit. If this can wait, I'm ok following that path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcollina and @mscdex Thanks for the comments. If the access of _*State would be removed, I guess, at least there should be new APIs in Readable for those _*State access. @mcollina Should I move the _flush() to Readable. The flush like API may be needed in some other cases. I think this is related to request.abort(). Currently, it doesn't work in some platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just do not remove the 'data' handlers in _dump():

+    this.removeAllListeners('data');
     this.resume();

Copy link
Member Author

@yhwang yhwang May 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcollina yes, it works. However, it won't work for the code that call req.abort() and followed by a res.on('data', cb);. Without flush the buffered data or explicitly suppress the 'data' event , there is the chance that someone who registers the 'data' event later gets notifications. Personally, I prefer the flush approach. It not only solve the issue but also reduce the efforts to handle the buffered data if we only suppress 'data' event. I turned on the debug trace and found it spent many event loops in handling the buffered data if we only suppress 'data' event. Flush the buffered has better performance.

For removeAllListeners('data'), I am fine with the change if it impacts less.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the probability of someone adding a 'data' handler after calling req.abort() is pretty low.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mscdex and @mcollina. I updated the commit with removeAllListeners('data').

@yhwang
Copy link
Member Author

yhwang commented May 29, 2017

@Trott @gibfahn @mscdex @mcollina any suggestion for the next step? Should we enable the test case and fix the bug or add document to mention that request.abort() doesn't work in some platforms for now? Thanks!

@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from b62e37a to d0bbdfb Compare May 30, 2017 05:19
@gibfahn
Copy link
Member

gibfahn commented May 30, 2017

@mscdex @mcollina does this LGTY now?

@gibfahn
Copy link
Member

gibfahn commented May 30, 2017

@@ -314,6 +314,9 @@ function _addHeaderLine(field, value, dest) {
IncomingMessage.prototype._dump = function _dump() {
if (!this._dumped) {
this._dumped = true;
// If there is buffered data , it may trigger 'data' event.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits:
s/data ,/data,/
s/trigger 'data' event/trigger 'data' events/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex updated. Thanks..

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we can merge this semver-minor, but with no backport (we might want to revise that). We also need a CITGM run:
https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/846/

Re-enable test-http-abort-stream-end and put it into parallel
category. Use system random port when calling server.listen()
and fix eslint errors.

After calling request.abort(), in order to avoid the buffered
data to trigger the 'data' event, explicitly remove 'data' event
listeners.
@yhwang yhwang force-pushed the enable-test-http-abort-stream-end branch from d0bbdfb to a3c64e2 Compare May 30, 2017 16:07
@refack
Copy link
Contributor

refack commented May 30, 2017

@yhwang , @mcollina I killed https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/nodes=osx1010/846/ since it was stuck (and also uninformative because of #13305)

@refack
Copy link
Contributor

refack commented May 30, 2017

I've taken the liberty to rerun CITGM rebased on the npm fix: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/848/

@refack
Copy link
Contributor

refack commented May 30, 2017

I've taken the liberty to rerun CITGM rebased on the npm fix: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/848/

CITGM looks clean.

@mcollina
Copy link
Member

@mscdex @Trott @refack may I have another LGTM on this? Are we ok in landing as semver-patch as a bugfix?

@mscdex
Copy link
Contributor

mscdex commented May 31, 2017

LGTM

@mcollina
Copy link
Member

Landed as 716e9e0

@mcollina mcollina closed this May 31, 2017
mcollina pushed a commit that referenced this pull request May 31, 2017
Re-enable test-http-abort-stream-end and put it into parallel
category. Use system random port when calling server.listen()
and fix eslint errors.

After calling request.abort(), in order to avoid the buffered
data to trigger the 'data' event, explicitly remove 'data' event
listeners.

PR-URL: #13260
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@gibfahn
Copy link
Member

gibfahn commented May 31, 2017

@mcollina should this be backported to Node 6?

@mcollina
Copy link
Member

Yes, after the usual period in current, maybe even a bit more.

@gibfahn gibfahn added baking-for-lts PRs that need to wait before landing in a LTS release. lts-watch-v6.x labels May 31, 2017
@Trott
Copy link
Member

Trott commented May 31, 2017

Thanks for the contribution, @yhwang! 🎉

@yhwang
Copy link
Member Author

yhwang commented May 31, 2017

Thank you all and I will open a new feature request in the stream to flush the buffered data. My pleasure to work with you all!

jasnell pushed a commit that referenced this pull request Jun 5, 2017
Re-enable test-http-abort-stream-end and put it into parallel
category. Use system random port when calling server.listen()
and fix eslint errors.

After calling request.abort(), in order to avoid the buffered
data to trigger the 'data' event, explicitly remove 'data' event
listeners.

PR-URL: #13260
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@jasnell jasnell mentioned this pull request Jun 5, 2017
@MylesBorins MylesBorins removed the baking-for-lts PRs that need to wait before landing in a LTS release. label Aug 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants