
m(windows): Reimplement Wepoll in Rust #88

Merged: 8 commits, Mar 6, 2023
Conversation

notgull
Member

@notgull notgull commented Feb 16, 2023

Thus far, wepoll has served us well in terms of polling on Windows. However, at this stage, it's time to migrate to our own home-rolled implementation. There is at least one critical soundness bug that stems from our usage of wepoll (#85), and it would be significantly easier to compile polling (as well as all of its dependents) if we purged our only C dependency. In addition, there are some issues that we can only solve if we move away from wepoll (smol-rs/async-io#17, smol-rs/async-io#25, #82).

The new strategy is similar to what wepoll currently does, except (hopefully) more robust. An I/O Completion Port sits at the core of the system, and the undocumented Auxiliary Function Driver (AFD) is used to deliver completion packets whenever a socket is considered "ready". There is a state object associated with each socket that contains the events currently being polled, the in-flight events associated with the socket, and whether the socket state is queued for deletion.

The differences between this implementation and wepoll include:

  • Rather than using a red-black tree to store socket state, we use a HashMap.
  • A bounded ConcurrentQueue is used to hold pending socket updates, rather than a naive queue implementation. This queue has a capacity of 1024 elements, and every pending update is run when the queue is full. In contrast, wepoll's queue is unbounded.
  • Packets have Pin guarantees, in order to prevent state captured by the I/O completion port from being moved.
  • Instead of going through the epoll_event intermediary, we directly convert completion packets to Events.
  • We directly support notifying the Poller through a designated wakeup packet.
  • We only expect one user to be polling at a time, so we use an AtomicBool instead of an atomic number.

Closes #22 and closes #85. Supersedes #76.

@notgull notgull marked this pull request as ready for review February 17, 2023 00:03
@notgull
Member Author

notgull commented Feb 17, 2023

I think this is stable enough for review now; the tests aren't catching any bugs that I can see (although the precision tests are taking a while, which might just be part of the test).

@notgull notgull requested a review from taiki-e February 17, 2023 14:11
Commits:

  • Implement polling for Windows
  • Reimplement wepoll in Rust
  • Fix immediately obvious bugs
  • Fix AFD event translation
  • Fix MSRV build errors
  • Make sure to unlock packets after retrieval
  • Lock sources list with rwlock instead of mutex
Collaborator

@taiki-e taiki-e left a comment


Thanks! I have not reviewed the details regarding the Windows API, but at first glance, this looks great.

Also, we may want to test some downstream crates before merging/releasing this patch, because such a rewrite may cause subtle differences in behavior (similar to the way our CI tests async-io).

@notgull
Member Author

notgull commented Feb 23, 2023

Also, we may want to test some downstream crates before merging/releasing this patch because such a rewrite may cause subtle differences in behavior.

I just ran a small "crater test" on a few crates that have polling in their dependency tree. I ran cargo test for:

  • async-io
  • async-net
  • async-std
  • isahc
  • libp2p (had to figure out how to set up protoc on Windows for this 😮‍💨)
  • quinn (tests mostly use tokio but the ones I modified to use async-std seemed to turn out OK)
  • zbus

All of the tests seem to pass (or at least the ones that weren't already failing on Windows). I think we're fine in terms of potentially undefined behavior.

Edit: I included this excerpt in their Cargo.tomls:

[patch.crates-io]
polling = { git = 'https://github.com/smol-rs/polling', branch = 'notgull/reimplement-wepoll' }

@notgull notgull requested a review from taiki-e February 24, 2023 22:24
@taiki-e taiki-e requested a review from fogti March 5, 2023 10:20
In src/iocp/port.rs:
unsafe impl<T: Completion> CompletionHandle for Pin<Arc<T>> {
    type Completion = T;

    fn get(&self) -> Pin<&Self::Completion> {
Member:

why is the pointer that this function returns pinned?

Member Author:

The pinning ensures that the overlapped object isn't moved while it's involved in an operation.

Member:

how could it be moved when it is behind an & reference?

Member Author:

While it's being used in the overlapped operation, there is no reference (or at least, none that the borrow checker is aware of). As a safeguard against undefined behavior, this ensures that it isn't invalidated during the operation.

Member:

hm, seems a bit brittle, but I'll go with it for now.

@fogti
Member

fogti commented Mar 6, 2023

Thanks for the work btw. Looks impressive.

@notgull
Member Author

notgull commented Mar 6, 2023

Thanks for the work btw. Looks impressive.

Thanks for the review! I know this is a monster PR, so I appreciate you taking the time to sit down and look through it all. I'll merge once the CI passes.

@fogti
Member

fogti commented Mar 6, 2023

It might be a good idea to do another downstream test, otherwise this looks ok. I think it is appropriate to squash-merge this when done. The possible brittleness of <Pin<Arc<T>> as CompletionHandle>::get() (or maybe the whole CompletionHandle + OVERLAPPED API) should imo be mentioned/appended to the intro post in the resulting merge commit.
"run run" in the intro post looks like a mistake btw.

We only expect one user to be polling at a time, so we use an AtomicBool instead of an atomic number.

Can/do we enforce that?

@notgull
Member Author

notgull commented Mar 6, 2023

We only expect one user to be polling at a time, so we use an AtomicBool instead of an atomic number.

Can/do we enforce that?

A Mutex in the top-level Poller is locked before any polling is done.

polling/src/lib.rs

Lines 491 to 505 in e85331c

if let Ok(mut lock) = self.events.try_lock() {
    // Wait for I/O events.
    self.poller.wait(&mut lock, timeout)?;

    // Clear the notification, if any.
    self.notified.swap(false, Ordering::SeqCst);

    // Collect events.
    let len = events.len();
    events.extend(lock.iter().filter(|ev| ev.key != usize::MAX));
    Ok(events.len() - len)
} else {
    log::trace!("wait: skipping because another thread is already waiting on I/O");
    Ok(0)
}

@notgull notgull merged commit 24900fb into master Mar 6, 2023
@notgull notgull deleted the notgull/reimplement-wepoll branch March 6, 2023 00:25
notgull added a commit that referenced this pull request Mar 6, 2023
Reimplements the C-based wepoll backend in Rust, using some handwritten code. This PR also implements bindings to the I/O Completion Ports and \Device\Afd APIs. For more information on the latter, see my blog post on the subject: https://notgull.github.io/device-afd/

Note that the IOCP API is wrapped using a `Pin`-oriented "CompletionHandle" system that is relatively brittle. This should be replaced with a better model when one becomes available.
@notgull notgull mentioned this pull request Mar 6, 2023
notgull added a commit that referenced this pull request Mar 8, 2023
Making sure that this works on Windows after #88
@notgull notgull mentioned this pull request Jun 11, 2023
Successfully merging this pull request may close these issues:

  • Null pointer dereference in patched wepoll code
  • Removing the wepoll dependency?

3 participants