Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eagerly process input EOF/connection resets (fixes #277) #286

Merged
merged 1 commit into from
Apr 16, 2018

Conversation

weissi
Copy link
Member

@weissi weissi commented Apr 5, 2018

Motivation:

We had a number of problems:

  1. We wanted to lazily process input EOFs and connection resets only
    when the user actually calls read(). On Linux however you cannot
    unsubscribe from EPOLLHUP so that's not possible.
  2. Lazily processing input EOFs/connection resets wastes kernel
    resources and that could potentially lead to a DOS
  3. The very low-level Selector interpreted the eventing mechanism's
    events quite a lot so the EventLoop/Channel only ever saw
    readable or writable without further information what exactly
    happened.
  4. We completely ignored EPOLLHUP until now which on Unix Domain
    Socket close leads to a 100% CPU spin (issue NIO mishandles EPOLLHUP #277)

Modifications:

  • made the Selector interface richer, it now sends the following
    events: readable, writable, readEOF (input EOF), reset
    (connection reset or some error)
  • process input EOFs and connection resets/errors eagerly
  • change all tests which relied on using unconnected and unbound sockets
    to user connected/bound ones as epoll_wait otherwise would keep
    sending us a stream of EPOLLHUPs which would now lead to an eager
    close

Result:

  • most importantly: fixes NIO mishandles EPOLLHUP #277
  • waste less kernel resources (by dealing with input EOFs/connection
    resets eagerly)
  • bring kqueue/epoll more in line

@weissi weissi changed the title [DO NOT MERGE] [WIP] very early preview of fix for #277 eagerly process input EOF/connection resets (fixes #277) Apr 9, 2018
@weissi weissi force-pushed the jw-277 branch 3 times, most recently from 46217d5 to 7f96b84 Compare April 9, 2018 16:33
@weissi weissi mentioned this pull request Apr 9, 2018
@weissi weissi requested review from Lukasa and normanmaurer April 9, 2018 16:37
Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, so these are mostly requests to clean up code that was messy before, but now that we're in the area we should make it nicer.

kill -0 $server_pid
echo -e 'GET /dynamic/write-delay/10000 HTTP/1.1\r\n\r\n' | nc -w1 -U "$socket"
sleep 0.2
stop_server "$token"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test actually "fail" in any particular way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa yes, in case it leaks fds or if it crashes. The old code wouldn't have passed this. I acknowledge this is not very obvious...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi maybe also add a comment explain this ;) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi please address this one ;)

@@ -755,9 +751,36 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore {
}
}

final func readEOF() {
if self.lifecycleManager.isRegistered {
self.safeReregister(interested: self.interestedEvent.subtracting(.readEOF))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment to say why we subtract readEOF (namely, that we want this to be one-shot).


while self.lifecycleManager.isActive {
var didReceiveEOF: Bool = false
self.readable0(didReceiveEOF: &didReceiveEOF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this use an inout instead of a return value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa/@normanmaurer because of the 'blind boolean' problem. We don't have named return values and I didn't really wanted to introduce a new enum just for that. Happy to change to whatever you both think is best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding an enum would be ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm perfectly fine with adding an enum for this. 😄

if self.lifecycleManager.isRegistered {
self.safeReregister(interested: self.interestedEvent.subtracting(.readEOF))

while self.lifecycleManager.isActive {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we ever drop out of the loop due to a change in isActive without hitting didReceiveEOF?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa yes, if the users closes in channelRead. Then we currently just discard the remaining ones. Adding a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@normanmaurer / @Lukasa is that what we want?

}
}

// this _needs_ to synchronously cause the fd to be unregistered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate in this comment to say why?

default:
// We only use EVFILT_USER, EVFILT_READ and EVFILT_WRITE.
fatalError("unexpected filter \(ev.filter)")
}
if selectorEvent != ._none {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nicer to express this as a guard: guard selectorEvent != .none, let registration = registrations[Int(ev.ident)] else { continue }

@@ -577,3 +618,20 @@ public enum IOEvent {
/// Not interested in any event.
case none
}

struct SelectorEventSet: OptionSet, Equatable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment please. 😄

@@ -122,7 +122,7 @@ final class SocketChannel: BaseSocketChannel<Socket> {
var result = ReadResult.none
for i in 1...maxMessagesPerRead {
guard self.isOpen && !self.inputShutdown else {
return result
throw ChannelError.eof
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a surprising behavioural change: why was it made?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa because I think it was wrong before. We would read EOF but we short-circuited it with something that isn't actually EOF.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. :)

@@ -424,7 +432,12 @@ private final class HTTPHandler: ChannelInboundHandler {
}

// First argument is the program path
let arguments = CommandLine.arguments
var arguments = CommandLine.arguments.dropFirst(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this is an attempt to coerce the type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind writing that in a comment just so others can see why that was done?

public func channelRead(ctx: ChannelHandlerContext, data: NIOAny) {
XCTFail("Should not be called as autoRead is false and we did not call read(), but received \(self.unwrapInboundIn(data))")
if !self.seenEOF {
XCTFail("Should not be called before seeing the EOF as autoRead is false and we did not call read(), but received \(self.unwrapInboundIn(data))")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? Surely we expect to see the EOF after the read data, not before it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right :)

@normanmaurer
Copy link
Member

@weissi please ping when we should review again

@BasThomas
Copy link
Contributor

FYI @weissi, if you change the fix issue #277 in the description to fixes #277, GitHub will link the pull request to the issue and automatically close it once this gets merged. :)

@weissi
Copy link
Member Author

weissi commented Apr 10, 2018

@BasThomas oh awesome, thanks. TIL

@weissi weissi force-pushed the jw-277 branch 2 times, most recently from f2513bd to a229eb3 Compare April 10, 2018 15:32
@BasThomas
Copy link
Contributor

@weissi weissi force-pushed the jw-277 branch 2 times, most recently from 3459801 to 9bdfbfb Compare April 11, 2018 17:18
@Lukasa Lukasa added the 🔨 semver/patch No public API change. label Apr 12, 2018
@weissi weissi force-pushed the jw-277 branch 2 times, most recently from 14212d3 to c1bae21 Compare April 13, 2018 10:31
}

var index: Int = 0
for (event, filter) in [(KQueueEventFilterSet.read, EVFILT_READ), (.write, EVFILT_WRITE), (.except, EVFILT_EXCEPT)] {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to @Lukasa & @normanmaurer . Just in case you were as worried as I was if Swift will optimise this array to be on the stack: It does. I wrote a test program:

import Darwin

enum Bla: Int {
    case read = 0xcafe01
    case write = 0xcafe02
    case except = 0xcafe03
}

@inline(never)
func foo() -> (kevent, kevent, kevent) {
    var es: (kevent, kevent, kevent) = (kevent(), kevent(), kevent())
    for (i, e) in [(Bla.read, EVFILT_READ), (.write, EVFILT_WRITE), (.except, EVFILT_EXCEPT)] {
        var ex: kevent = kevent()
        ex.filter = Int16(e)
        ex.flags = i == .read ? 0 : 1
        ex.fflags = 0xdeadbe
        ex.data = i.rawValue
        if i == .read {
            es.0 = ex
        }
        if i == .write {
            es.1 = ex
        }
        if i == .except {
            es.2 = ex
        }
    }
    return es
}

print(foo())

which gets really nice assembly:

screen shot 2018-04-13 at 11 40 38 am

so the loop gets fully unrolled etc, awesome job Swift compiler!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same seems to happen in the real code. Couldn't spot any calls (except to initStackObject) or any loops

@weissi
Copy link
Member Author

weissi commented Apr 13, 2018

btw:

master is

info: 1000_reqs_1_conn: allocations not freed: 0
info: 1000_reqs_1_conn: total number of mallocs: 285416
info: 1_reqs_1000_conn: allocations not freed: 0
info: 1_reqs_1000_conn: total number of mallocs: 1703361

jw-277 is

info: 1000_reqs_1_conn: allocations not freed: 0
info: 1000_reqs_1_conn: total number of mallocs: 285399
info: 1_reqs_1000_conn: allocations not freed: 0
info: 1_reqs_1000_conn: total number of mallocs: 1687359

which suggests that we lost 6 allocations per connection (pair), ie. three allocs per connection. I cannot claim that I know where they came from but I guess it's good we didn't regress :)

Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, really like this so far! Some notes inline.

// we could both be registered & active (if our channel support half-closure) or unregistered & inactive (if it doesn't).
break loop
case .error:
// we should be unregistered and inactive now (as `readable0` would've called close.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: unbalanced parenthesis.

@@ -774,9 +820,11 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore {
do {
try readFromSocket()
} catch let err {
let isEOF: Bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to just set this to a return code?

}
}

if ev.contains(.read) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra space in here.

}

if ev.contains(.readEOF) {
channel.readEOF()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do read or readEOF? Seems like we shouldn't need to do both.

@@ -41,6 +41,192 @@ private extension Optional {
}
}

/// Represents IO events NIO might be interested in. `SelectorEventSet` is used for two purposed: To express interest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: purposes.

}
if ev.events & Epoll.EPOLLHUP.rawValue != 0 || ev.events & Epoll.EPOLLERR.rawValue != 0 {
selectorEvent.formUnion(.reset)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though actually I might think it should be an extension containing an initializer for SelectorEventSet, actually.

if Int32(ev.flags) & EV_EOF != 0 {
selectorEvent.formUnion(.readEOF)
}
if ev.fflags != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bit over broad?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think so, AFAIK matches EPOLLERR

try body((SelectorEvent(io: .write, registration: registration)))
selectorEvent.formUnion(.write)
if ev.fflags != 0 {
selectorEvent.formUnion(.reset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, is this over broad?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

guard selectorEvent != .none, let registration = registrations[Int(ev.ident)] else {
continue
}
if !registration.interested.contains(.readEOF) && selectorEvent.contains(.readEOF) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here explaining why we do this would be nice.

@@ -564,6 +697,7 @@ enum SelectorStrategy {
}

/// The IO for which we want to be notified.
@available(*, deprecated, message: "IOEvent was made public by accident, is no longer removed internally and will be removed with SwiftNIO 2.0.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "is no longer used internally"

switch self.readable0() {
case .eof:
// on EOF we stop the loop and we're done with our processing for `readEOF`.
// we could both be registered & active (if our channel support half-closure) or unregistered & inactive (if it doesn't).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/support/supports/ ?

break loop
case .normal:
// normal, note that there is no guarantee we're still active (as the user might have closed in callout)
continue loop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you could just make this break

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@normanmaurer but then the code would be really hard to read, no? continue loop is really clear, so is break loop. But break meaning 'break the switch & continue the loop' is incredibly hard to spot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly I dont think its harder to read but I dont care to much about if you change it or not... Thats why its a nit ;)

@@ -385,13 +391,20 @@ public final class ClientBootstrap {
let channel = try SocketChannel(eventLoop: eventLoop as! SelectableEventLoop, protocolFamily: protocolFamily)

channelInitializer(channel).then {
channelOptions.applyAll(channel: channel)
return channelOptions.applyAll(channel: channel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need the return ?

}.map {
channel
return channel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the return is needed ?

@@ -41,6 +41,192 @@ private extension Optional {
}
}

/// Represents IO events NIO might be interested in. `SelectorEventSet` is used for two purposed: To express interest
/// in a given event set and for notifications about an IO event set that has occured.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider making the two different purposes bullet-pointers.


typealias RawValue = UInt8

let rawValue: UInt8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be let rawValue: RawValue ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the same but changing it

private struct KQueueEventFilterSet: OptionSet, Equatable {
typealias RawValue = UInt8

let rawValue: UInt8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be let rawValue: RawValue ?

private struct EpollFilterSet: OptionSet, Equatable {
typealias RawValue = UInt8

let rawValue: UInt8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be let rawValue: RawValue ?

/// - `output` corresponds to `EPOLLOUT`
/// - `error` corresponds to `EPOLLERR`
private struct EpollFilterSet: OptionSet, Equatable {
typealias RawValue = UInt8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to move the typealias out of the struct as you define the same multiple times.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I can't, OptionSet requires it. I could remove that line entirely and rely on the inference but there'll be changes in that area so let's be explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi got it...

do {
try clientChannel.writeAndFlush(buffer).wait()
} catch {
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment why its expected that we may see an error ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it shouldn't thanks

@weissi weissi force-pushed the jw-277 branch 4 times, most recently from dac57f7 to db2594c Compare April 13, 2018 17:59
Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm happy here, well done!

@Lukasa
Copy link
Contributor

Lukasa commented Apr 16, 2018

(Except the tests are broken obviously)

@weissi
Copy link
Member Author

weissi commented Apr 16, 2018

thanks @Lukasa hoping for #314 :)

return channel.register().then {
assert(channel.eventLoop.inEventLoop)
return body(channel)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi consider making a utility function for the register / body execution and also use it for register / bind above. Better to keep all the "special" code in one place

var filter: UInt32 = 0
let epollFilters = EpollFilterSet(selectorEventSet: self)
if epollFilters.contains(.error) {
filter |= Epoll.EPOLLERR.rawValue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi just a note here... You always are registered to EPOLLERR so no need to do this imho:

http://man7.org/linux/man-pages/man2/epoll_ctl.2.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@normanmaurer yes I know we're always registered but I thought it would be nice to still do it explicitly. But I'll replace it by a comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh hang on, sorry @normanmaurer . Wait, you're always registered for EPOLLERR ? I thought it's only EPOLLHUP.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, I'll remove it

Copy link
Member

@normanmaurer normanmaurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits... looks great otherwise. Good work

for _ in 0..<20 {
XCTAssertNoThrow(try clientChannel.writeAndFlush(buffer).wait())
}
try clientChannel.close().wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XCTAssertNoThrow ?


// Wait for 100 ms.
usleep(100 * 1000);
try serverChannel.close().wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XCTAssertNoThrow

try clientChannel.writeAndFlush(buf).wait()
try clientChannel.writeAndFlush(buf).wait()
try clientChannel.close().wait()
try allDone.futureResult.wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to use XCTAssertNoThrow here ?

@weissi weissi force-pushed the jw-277 branch 2 times, most recently from d06a5c1 to 20d7b83 Compare April 16, 2018 13:59
private extension Channel {
func registerAndDoSynchronously(_ body: @escaping (Channel) -> EventLoopFuture<Void>) -> EventLoopFuture<Void> {
// this is pretty delicate at the moment:
// `body` (which will `connect`) must be _synchronously_ follow `register`, otherwise in our current
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weissi nit: the comment is not really correct anymore as it is not only for connect.

// `body` (which will `connect`) must be _synchronously_ follow `register`, otherwise in our current
// implementation, `epoll` will send us `EPOLLHUP`. To have it run synchronously, we need to invoke the
// `then` on the eventloop that the
// `register` will succeed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: merge with previous line

Motivation:

We had a number of problems:

1. We wanted to lazily process input EOFs and connection resets only
   when the user actually calls `read()`. On Linux however you cannot
   unsubscribe from `EPOLLHUP` so that's not possible.
2. Lazily processing input EOFs/connection resets wastes kernel
   resources and that could potentially lead to a DOS
3. The very low-level `Selector` interpreted the eventing mechanism's
   events quite a lot so the `EventLoop`/`Channel` only ever saw
   `readable` or `writable` without further information what exactly
   happened.
4. We completely ignored `EPOLLHUP` until now which on Unix Domain
   Socket close leads to a 100% CPU spin (issue apple#277)

Modifications:

- made the `Selector` interface richer, it now sends the following
  events: `readable`, `writable`, `readEOF` (input EOF), `reset`
  (connection reset or some error)
- process input EOFs and connection resets/errors eagerly
- change all tests which relied on using unconnected and unbound sockets
  to user connected/bound ones as `epoll_wait` otherwise would keep
  sending us a stream of `EPOLLHUP`s which would now lead to an eager
  close

Result:

- most importantly: fix issue apple#277
- waste less kernel resources (by dealing with input EOFs/connection
  resets eagerly)
- bring kqueue/epoll more in line
@normanmaurer normanmaurer added this to the 1.5.0 milestone Apr 16, 2018
@normanmaurer normanmaurer merged commit b109389 into apple:master Apr 16, 2018
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 17, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 17, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 17, 2018
Motivation:

PR apple#286 started processing EPOLLHUPs. Sadly epoll also immediately
throws EPOLLHUP at you if you register a socket that isn't either
connected or listening. This fixes two more instances of this problem.

Modifications:

made sure sockets that are registered are either listening or
connected (and before we let the Selector wait for events)

Result:

tests should be more reliable
Lukasa pushed a commit that referenced this pull request Apr 18, 2018
Motivation:

PR #286 started processing EPOLLHUPs. Sadly epoll also immediately
throws EPOLLHUP at you if you register a socket that isn't either
connected or listening. This fixes two more instances of this problem.

Modifications:

made sure sockets that are registered are either listening or
connected (and before we let the Selector wait for events)

Result:

tests should be more reliable
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit to weissi/swift-nio that referenced this pull request Apr 18, 2018
Motivation:

Since apple#286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
weissi added a commit that referenced this pull request Apr 18, 2018
Motivation:

Since #286, we do eagerly detect connection resets which is good. To
resolve the situation we first try to read everything off the socket. If
something fails there, everything is good and the client gets the
correct error. If that however doesn't lead to any errors, we would
close the connection with `error: ChannelError.eof` which the client
would then see for example on a connect that happened to get
`ECONNRESET` or `ECONNREFUSED`. That's not right so these situations are
fixed here.

Modifications:

- on reset, try to get the socket error if any
- if no socket error was detected, we send a `ECONNRESET` because we
  know we have been reset but not why and didn't encounter any errors
- write tests for these situations

Result:

connection resets that don't hit any other errors will now see the
correct error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NIO mishandles EPOLLHUP
4 participants