Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kill all pending accepts when TCP listener is closed #1517

Merged
merged 3 commits into from
Jan 16, 2019

Conversation

kevinkassimo
Copy link
Contributor

@kevinkassimo kevinkassimo commented Jan 14, 2019

Closes #1516 .

Keep track of the pending accept task and notify it when the corresponding TcpListener from ResourceTable is dropped.

Before: Hangs forever
After:

Other: Listener has been closed
    at DenoError (deno/js/errors.ts:19:5)
    at maybeError (deno/js/errors.ts:38:12)
    at handleAsyncMsgFromRust (deno/js/dispatch.ts:27:17)

and dies. This error could be caught.

This is a first-pass implementation (so might not be the best solution). I'm slightly worried about the performance implication of adding 2 ResourceTable locks.
(Benchmark from Travis seems no change though)

r.track_task(self.task_id);
return Ok(futures::prelude::Async::NotReady);
}
Err(e) => return Err(From::from(e)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should untrack here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

err = e;
}
assert(!!err);
assertEqual(err.message, "Listener has been closed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool - good test case

src/resources.rs Outdated
// we need to track these task so that when the listener is closed,
// pending tasks could be notified and die.
// The tasks are indexed, so they could be removed when accept completed.
TcpListener(tokio::net::TcpListener, HashMap<usize, futures::task::Task>),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this is something TcpListener should take care of? Is this working around a bug in tokio? Maybe we ought to report and link here?

Copy link
Contributor Author

@kevinkassimo kevinkassimo Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not so sure about this... we could probably ask them...
(Also I think our usage here is different from its most publicly available API, so there might be a miss...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind opening an issue and just describing what's we're experiencing https://github.com/tokio-rs/tokio
Ideally link that issue in the source code here.
It's also unclear to me what the expected behavior is.

Copy link
Contributor Author

@kevinkassimo kevinkassimo Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitted: tokio-rs/tokio#846
I'll add a comment linking to this once there are some basic feedback from the Tokio team (that this is indeed a problem)

Copy link
Contributor Author

@kevinkassimo kevinkassimo Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe based on the discussion there we might just save the task and notify ourselves...

There seems to be some problems with sccache on AppVeyor... On Travis it seems that it is rejecting more network listening requests (tried locally on my Mac machine, and seems it is the firewall that rejects the listen after reaching a threshold and instead shows a permission prompt that requires manual confirmation...) Nvm, found the issue

src/resources.rs Outdated
// If TcpListener, we must kill all pending accepts!
if let Repr::TcpListener(l, m) = r.unwrap() {
// Drop first
std::mem::drop(l);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think this is necessary - it will be dropped with the above remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed... Forgot to remove this (was part of initial experiment...)

use std::sync::atomic::Ordering;
lazy_static! {
// Keep unique such that no collisions in TcpListener accept task map
static ref NEXT_ACCEPT_ID: AtomicUsize = AtomicUsize::new(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a HashSet instead and avoid this machinery?

Copy link
Contributor Author

@kevinkassimo kevinkassimo Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that futures::task::Task does not implement a few traits necessary for comparison in HashSet...

@kevinkassimo
Copy link
Contributor Author

kevinkassimo commented Jan 15, 2019

Per tokio-rs/tokio#846 (comment)
I believe we should throw an error when multiple concurrent accept on the same listener is called. With such enforcement, we could simply keep an Option instead of a HashMap for the current underlying task and only need to notify this single task on listener drop.

I'll go and implement this if this sounds reasonable

@ry
Copy link
Member

ry commented Jan 15, 2019

@kevinkassimo Thanks for investigating this. Yes, one accept at a time seems like a reasonable constraint for a single thread of JS. I can imagine that we might want to share servers among isolates in the future and accept in multiple workers... but let's do that later. You should mention in a comment something about how this may need to be improved for the multi-worker accept situation.

@kevinkassimo
Copy link
Contributor Author

Should be ready for one more round of review. (Added a test for multiple concurrent accepts)

js/net_test.ts Outdated Show resolved Hide resolved
src/resources.rs Outdated Show resolved Hide resolved
js/net_test.ts Outdated Show resolved Hide resolved
js/net_test.ts Show resolved Hide resolved
src/resources.rs Show resolved Hide resolved
src/tokio_util.rs Outdated Show resolved Hide resolved
src/tokio_util.rs Outdated Show resolved Hide resolved
Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - very technical fix - thanks for digging through it.

@ry ry merged commit 431e455 into denoland:master Jan 16, 2019
@ry
Copy link
Member

ry commented Jan 17, 2019

@kevinkassimo FYI I've now seen one failure of this test on Windows. Not sure what's happening - it's potentially flaky.

test netConcurrentAccept_permW0N1E0R0 assertEqual failed. actual = Another accept task is ongoing expected = Listener has been closedC:/deno/js/deps/https/deno.land/x/std/testing/mod.ts:23:14
        throw new Error(msg);
Error: actual: Another accept task is ongoing expected: Listener has been closed
    at assertEqual (file:///C:/deno/js/deps/https/deno.land/x/std/testing/mod.ts:28:11)
    at listener.accept.catch.e (file:///C:/deno/js/net_test.ts:33:5)
127.0.0.1 - - [17/Jan/2019 22:07:43] "GET /tests/subdir/mt_video_mp2t.t3.ts HTTP/1.1" 200 -
127.0.0.1 - - [17/Jan/2019 22:07:43] "GET /tests/subdir/no_ext HTTP/1.1" 200 -
``

@kevinkassimo
Copy link
Contributor Author

@ry It's a race condition I believe: basically the second accept somehow polls earlier than the first one by the scheduler.

ry added a commit to ry/deno that referenced this pull request Apr 28, 2019
…ener is closed (denoland#2224)"

Crashes while running wrk against
js/deps/https/deno.land/std/http/http_bench.ts

This reverts commit 972ac03.
ry added a commit that referenced this pull request Apr 28, 2019
…closed (#2224)" (#2239)

Crashes while running wrk against
js/deps/https/deno.land/std/http/http_bench.ts

This reverts commit 972ac03.
keroxp added a commit to keroxp/deno that referenced this pull request May 2, 2019
ry pushed a commit to keroxp/deno that referenced this pull request May 2, 2019
ry pushed a commit to keroxp/deno that referenced this pull request May 3, 2019
@kevinkassimo kevinkassimo deleted the tcp/close_accept branch December 27, 2019 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Closing TCP listener while accepting causes causes hanging
4 participants