Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add native Windows back-end #166

Closed
wants to merge 109 commits into from
Closed

[WIP] Add native Windows back-end #166

wants to merge 109 commits into from

Conversation

antrik
Copy link
Contributor

@antrik antrik commented Jun 2, 2017

This picks up from #108

For now, it's just commit refactoring (squashing/splitting) of the original PR -- though the intention is to add a series of follow-up commits next, doing some improvements to the actual code, according to my latest review of the original PR.

Since the implementation is essentially working at this point though, it could also be merged as is, if that turns out necessary.

@highfive
Copy link
Collaborator

highfive commented Jun 2, 2017

warning Warning warning

  • These commits modify unsafe code. Please review it carefully!

@antrik
Copy link
Contributor Author

antrik commented Jun 2, 2017

Note: the Linux test failure is a known intermittent in the test condition (happening under memory pressure), not in the actual ipc-channel implementation -- it is on my list to fix when I get around to it...

@antrik
Copy link
Contributor Author

antrik commented Jun 2, 2017

Dropped truncate() calls from the newly introduced tests -- missed that during the rebase...

@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #167) made this pull request unmergeable. Please resolve the merge conflicts.

@antrik antrik force-pushed the windows branch 2 times, most recently from 29a368b to 60919ae Compare August 25, 2017 20:59
@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #168) made this pull request unmergeable. Please resolve the merge conflicts.

@antrik antrik force-pushed the windows branch 4 times, most recently from e62d6d8 to 3134354 Compare September 23, 2017 22:52
@nox
Copy link
Contributor

nox commented Sep 24, 2017

What's the status on this?

@antrik
Copy link
Contributor Author

antrik commented Sep 24, 2017

@nox every time I get to work on the issues I already know, I find other bits to improve as well... So it's hard to give an estimate. If I don't run into any serious problems, it could be done in a few days -- but I wouldn't count on that...

@antrik antrik force-pushed the windows branch 4 times, most recently from cbe49f7 to 4cda562 Compare September 24, 2017 13:52
@nox
Copy link
Contributor

nox commented Sep 24, 2017

No problem @antrik, was just doing a bit of triaging.

@antrik antrik force-pushed the windows branch 6 times, most recently from 82db31d to 99f2c1a Compare September 24, 2017 18:49
antrik added 14 commits June 8, 2018 21:54
This field is only meaningful while an async read is in progress with
the kernel; and we never reuse it between reads. Making this more
explicit and robust by only giving it a value while it's in use.
Since the `ov` field is now only set when we have a read in progress,
and thus effectively serves as an indicator of that state, the explicit
flag became redundant: we can just check the presence of `ov` instead.

(And since a check is performed implicitly when unwrapping the `ov`
value, some of the explicit asserts on `read_in_progress` can be dropped
entirely.)
Introduce an `AliasedCell` wrapper type for the fields that get aliased
by the kernel during async read operations, making sure that these
values can only be accessed through special unsafe methods, i.e. only
inside `unsafe` blocks, even if the wrappers get passed around through
safe code.

This makes invoking `MessageReader.start_read()` safe; and consequently
removes all unsafety from `OsIpcReceiver`.

(`OsIpcReceiverSet.select()` retains some unsafe code though, since it
does raw I/O itself... This can only be fixed by factoring out that
code.)
Put both of the fields aliased by the kernel during an async read
operation together in a common `AliasedCell<>`. This increases
robustness, and further tightens unsafe boundaries, by making sure the
two fields stay consistent with each other.

When they are wrapped in `AliasedCell<>` separately, safe code is
prevented from giving any of them invalid values individually -- but
they could still get out of sync with each other, if some code moves
just one of them and not the other. Consistency between these fields
however is crucial for correctness as well as soundness.
While this might have been a problem in the past, the current code
properly passes errors from `notify_completion()` up through all layers;
so there is nothing really preventing us from orderly returning any kind
of error reported by the `GetOverlappedResult()` or
`GetQueuedCompletionStatus()` system calls.

(The only problem would be if `GetOverlappedResult()` has failure modes
that actually leave the async operation in progress, and thus the async
data in use: in that case, unpacking the `AliasedCell<>` in async and
returning to the caller would be wrong... But if that's the case, the
previous behaviour of panicking after unpacking the `AliasedCell<>` was
just as wrong.)

Since returning arbitrary errors requires us to invoke
`WinError::from_system()` (either directly, or indirectly through
`WinError::last()`), and this one wants to know the origin of the error
(so it can put it in debug traces), we need to do these conversions near
the call sites, and pass an already converted error to
`notify_completion()`. (Which seems cleaner anyway.)

This has the side effect of also logging "broken pipe" (sender closed)
errors in the debug trace, which might be slightly redundant with any
other tracing done by the "closed" handling... I don't think that's
really a problem, though.
Split out the system call and associated handling for getting completion
notifications on an IOCP (set) from the rest of the `select()` method,
similar to how `fetch_async_result()` encapsulates the event handling
for regular readers.

This will be necessary for proper `Drop` handling for
`OsIpcReceiverSet`.

Incidentally, this exactly covers tha unsafe code section of the
`select()` implementation; and as such, it's a good step towards better
layering of the IOCP handling in general.

I guess it could be argued that a method call is also more readable than
assigning from a large anonymous block...
According to the documentation of `CancelIoEx()`, successful completion
of the `CancelIoEx()` call only indicates that the cancel request has
been successfully *queued* -- but it does *not* mean we can safely free
the aliased buffers yet! Rather, we have to wait for a notification
signalling the completion of the async operation itself.

We thus split out the actual `CancelIoEx()` call into a new
`issue_async_cancel()` method, and turn `cancel_io()` into a wrapper
that waits for the actualy async read to conclude (using
`fetch_async_result()`) after issuing the cancel request.

Since that doesn't work on readers in a receiver set, we need to add an
explicit `Drop` implementation for `OsIpcReceiverSet`, which issues
cancel requests for all outstanding read operations, and then uses
`fetch_iocp_result()` to wait for all of them to conclude.
Increase chance of catching intermittent failures while working on other
stuff...
Make sure all code related to the `win32-trace` feature is compiled only
when the feature is enabled.

This avoids unnecessary bloat; as well as potential compile errors when
other code conditional on this feature is added.
Add a (conditional) helper method for obtaining the raw handle of the
reader -- which is often needed for the `win32_trace` invocations -- to
abstract the internal structure of this type, thus facilitating further
refactoring.

An alternate approach would be overriding the `Debug` trait on
`MessageReader`, to just print the raw handle value. That would provide
better encapsulation; however, it would also preclude the possibility of
easily printing all the constituents of the structure during
debugging...

(Or we could leave `Debug` alone, and instead implement it as `Display`
-- but that feels like an abuse of the `Display` facility... Not sure
what to think about that.)
For the duration of an async read operation, move the pipe handle into
the `AliasedCell` along with the other fields used for the async
operation.

This prevents anything else from messing with the pipe while the async
read is in progress; and makes sure the handle and the other fields can
never get mismatched. While I'm not sure whether there is any scenario
in which such a mismatch could result in undefined behaviour, it's good
for general robustness in any case.
@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #183) made this pull request unmergeable. Please resolve the merge conflicts.

@ebkalderon
Copy link

Anyone know the current status of this PR? Is no one working on it anymore?

@SamuelMarks
Copy link

How's this progressing?

@ofek
Copy link

ofek commented Dec 24, 2019

Any update?

@jdm
Copy link
Member

jdm commented Dec 24, 2019

#233 has the latest updates.

@jdm
Copy link
Member

jdm commented Mar 12, 2021

Fixed in #272

@jdm jdm closed this Mar 12, 2021
@SamuelMarks
Copy link

Great to see @jdm

PS: Might be worth updating the top-level README.md for this repo also

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.