Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace trilogy_sock_discard and shutdown with close #111

Merged
merged 4 commits into from
Aug 15, 2023

Conversation

jhawthorn
Copy link
Member

This replaces the implementation of _cb_raw_shutdown with one that calls close (instead of shutdown) and replaces the various callbacks with stubs which perform no action.

This should be functionally nearly identical, but is nice because we fewer system calls to consider (we call close once and that's it), we don't have to rely on the assumption that after calling shutdown any further reads/writes will cause an EPIPE (an assumption which should hold true, but it's much simpler to just do what we want). Another advantage is that we'll immediately close the file descriptor.

This also replaces trilogy_sock_discard with just a call to the shutdown callback. The purpose of "discard" is to prevent further communication through the socket (and then close it), the same purpose of shutdown. I think this would have worked previously with the shutdown-based shutdown callback, but that may have caused errors. This is slightly cleaner as we avoid open and dup2 syscalls which could fail (and how the caller should clean up in that case is unclear).

I'd like it if we found a way to combine the close callback with this, but currently the close callback is more of a free, so I'll leave that for a future refactor.

sock_discard and shutdown have the same purpose: preventing further
communication through the socket.
@jhawthorn jhawthorn changed the title Repalce trilogy_sock_discard and shutdown with close Replace trilogy_sock_discard and shutdown with close Aug 4, 2023
Copy link
Collaborator

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make sense to me, but I'll let @casperisfine weigh in too since he authored the original PR around discard!, and IIRC he had concerns about using a raw close().

I think the concern before was accidentally writing to the closed fd though, and with the I/O operations now stubbed, seems like it should be fine?

sock->base.fd_cb = _cb_raw_fd;

if (sock->fd != -1)
close(sock->fd);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check the rc of close() here and raise a TRILOGY_SYSERR if the close was unsuccessful?

Copy link
Member Author

@jhawthorn jhawthorn Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I don't think we should (but could see an argument). The specification for close is slightly a mess, but in practice pretty straightforward 😅. From the man page

A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.

Note, however, that a failure return should be used only for diagnostic purposes (i.e., a warning to the application that there may still be I/O pending or there may have been failed I/O) or remedial purposes (e.g., writing the file once more or creating a backup).

Reporting a delayed failure from a previous write I don't think is helpful to us. In the cases we want to use shutdown (after an error or after fork) we just want it to close.

(also should note that I believe this is referring to pending I/O inside the kernel, this will not be writing any data that could be duplicated between processes)

It continues:

Retrying the close() after a failure return is the wrong thing to do, since this may cause a reused file descriptor from another thread to be closed. This can occur because the Linux kernel always releases the file descriptor early in the close operation, freeing it for reuse; the steps that may return an error, such as flushing data to the filesystem or device, occur only later in the close operation.

So on Linux, close always closes the file descriptor (it sounds like there are some quirks where this may not happen on HP-UX, but that's sort of unsolvable and the most portable choice is to assume close always succeeds at closing). That's a property I think is really nice for us to have as we can assume close always succeeds and so shutdown and discard always succeed (which is great because there's nothing we could do to deal with it failing).

We could return the SYSERR, it looks like that's what we do from _cb_raw_close, but we also never use the result from _cb_raw_close and I don't think we should. There's no helpful action to be taken because we have succeeded in closing the socket.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense -- thanks for elaborating! ❤️

@@ -623,28 +663,3 @@ int trilogy_sock_upgrade_ssl(trilogy_sock_t *_sock)
sock->ssl = NULL;
return TRILOGY_OPENSSL_ERR;
}

int trilogy_sock_discard(trilogy_sock_t *_sock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear the dup2 dance may still be necessary.

If the socket wasn't fully flushed, calling close() on it may send a RST packet to the server.

So in a forking situation, if that happens in the child (because you discard your inherited connections), then the parent connection will be reset.

Of course all this is a bit handwavy, and if the connection was properly flushed it shouldn't happen, but...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's the case. If the socket is still open in the parent process, the close in the child should just be decrementing a counter in the kernel. The kernel will only send a RST or FIN when the last copy is closed.

As far as I can see dup2 is equivalent to calling close on the second argument.

macOS says:

if descriptor fildes2 is already in use, it is first deallocated as if a close(2) call had been done first.

Linux man-pages 6.04 says:

If the file descriptor newfd was previously open, it is closed before being reused; the close is performed silently (i.e., any errors during the close are not re‐ported by dup2()).

The steps of closing and reusing the file descriptor newfd are performed atomically. This is important, because trying to implement equivalent functionality using close(2) and dup() would be subject to race conditions, whereby newfd might be reused between the two steps.

I think calling close has identical behaviour (from the remote's perspective) to what we did previously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I think you are right. 👍

@jhawthorn jhawthorn merged commit ca48f58 into trilogy-libraries:main Aug 15, 2023
@jhawthorn jhawthorn deleted the close_to_shutdown branch August 15, 2023 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants