Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EventLoop 2.0 swallowing events #865

Closed
chrisduerr opened this issue May 13, 2019 · 12 comments · Fixed by #982
Closed

EventLoop 2.0 swallowing events #865

chrisduerr opened this issue May 13, 2019 · 12 comments · Fixed by #982
Labels
B - bug Dang, that shouldn't have happened C - needs investigation Issue must be confirmed and researched DS - x11 P - high Vital to have
Milestone

Comments

@chrisduerr
Copy link
Contributor

So I've just ported Alacritty to the EventLoop 2.0 rework and in general it has been surprisingly painless. However it does seem like there was a regression which is (hopefully) not caused by errors during the porting process.

It seems like when using run_return to check for events, it might under some circumstances swallow certain events. This can be reproduced by printing continuous output (like yes) and trying to cancel it by pressing Ctrl+C (which is handled as a ReceivedChar event). This event is never received. It's possible to print all glutin events with the --print-events flag. The only received events are EventsCleared and LoopDestroyed.

These events are not always swallowed, but it appears only to happen when the event loop is called repeatedly without any events. Under normal circumstances when Alacritty is idle and the control flow is Wait, it seems like events are received properly (so the event loop is actively running, waiting for events).

The EventLoop 2.0 branch for Alacritty can be found here:
alacritty/alacritty#2438

@elinorbgr
Copy link
Contributor

On which platform(s) does it occur?

@chrisduerr
Copy link
Contributor Author

This was tested only on Linux/X11. I can try macOS tomorrow to see if it's only one specific platform or a more general issue.

@goddessfreya goddessfreya added DS - x11 C - needs investigation Issue must be confirmed and researched B - bug Dang, that shouldn't have happened labels May 14, 2019
@Osspial Osspial added this to the EventLoop 2.0 milestone May 14, 2019
@chrisduerr
Copy link
Contributor Author

I could not reproduce this issue on Windows. MacOS does not run successfully since the event loop is apparently not ported yet.

So this was only reproduced on Linux/X11 (have not tested Wayland).

@elinorbgr
Copy link
Contributor

MacOS does not run successfully since the event loop is apparently not ported yet.

MacOS port of the eventloop was merged, so it should work, maybe glutin needs to update its git reference though.

Anyway, this looks like its an x11 problem.

@aloucks
Copy link
Contributor

aloucks commented Jun 1, 2019

I'm seeing this on Linux/X11 as well (have not tested wayland).

I can reproduce the issue with the following code. I don't get any events printed while resizing the window. If I comment out the thread::sleep line, I'll start seeing resized events.

extern crate winit;
use std::time::{Instant, Duration};

use winit::window::WindowBuilder;
use winit::event::{Event, WindowEvent};
use winit::event_loop::{EventLoop, ControlFlow};

fn main() {
    let event_loop = EventLoop::new();

    let window = WindowBuilder::new()
        .with_title("A fantastic window!")
        .build(&event_loop)
        .unwrap();

    event_loop.run(move |event, _, control_flow| {
        match event {
            Event::WindowEvent {
                event: WindowEvent::CloseRequested,
                ..
            } => *control_flow = ControlFlow::Exit,
            Event::EventsCleared => {
                window.request_redraw();
            },
            Event::WindowEvent { event: WindowEvent::Resized(_), .. } => {
                println!("resized: {:?}", Instant::now());
            },
            Event::WindowEvent { event: WindowEvent::RedrawRequested, .. } => {
                // simulate drawing
                std::thread::sleep(Duration::from_millis(16));
            }
            _ => ()
        }
    });
}

@chrisduerr
Copy link
Contributor Author

I'd also like to mention that I can not reproduce this anymore after changing the event loop to run permanently, instead of running synchronously with other stuff in the application.

So the issue definitely seems to be with events coming in while the event loop is no active, so it only affects X11 with the run_return method for me. This is specifically about the Ctrl+C issue, I'm not sure if @aloucks problem is related or not.

@aloucks
Copy link
Contributor

aloucks commented Jun 3, 2019

I've also noticed some weird behavior where events seem to get backed up and then delivered all once. I suspect that the bug lies somewhere in calloop or or how mio is being used. I tried tweaking the PollOpt and Ready options, but haven't found the magic combo yet.

My understanding of how all this works is pretty limited but I did find this:

https://docs.rs/mio/0.6.19/mio/struct.Poll.html#edge-triggered-and-level-triggered

If when the socket was registered with Poll, edge triggered events were requested, then the call to Poll::poll done in step 5 will (probably) hang despite there being another 1kb still present in the socket read buffer. The reason for this is that edge-triggered mode delivers events only when changes occur on the monitored Evented. So, in step 5 the caller might end up waiting for some data that is already present inside the socket buffer.

With edge-triggered events, operations must be performed on the Evented type until WouldBlock is returned. In other words, after receiving an event indicating readiness for a certain operation, one should assume that Poll::poll may never return another event for the same token and readiness until the operation returns WouldBlock.

@elinorbgr
Copy link
Contributor

elinorbgr commented Jun 3, 2019

More likely I guess I was too optimistic when switching to driving x11's event loop to calloop. I guess libx11's is not fully asynchronous like libwayland's, as a result it's possible some other function calls to libx11 are interfering with calloop's dispatching of events.

This is far beyond my x11 knowledge though, so if someone more familiar with x11 API can clear things up, that'd be quite welcome.

@Osspial
Copy link
Contributor

Osspial commented Jun 20, 2019

I'm pinning this issue to increase its visibility, as it must be fixed before we can do a full EventLoop 2.0 release.

@murarth
Copy link
Contributor

murarth commented Jun 23, 2019

I did some digging into this issue and I found that the XFlush call here is reading data from the X11 socket. This appears to be why the event loop does not detect readable data on the socket and run the source callback to process events.

If the XFlush call is removed, events are processed immediately. However, this might interfere with the normal operation of a typical application. Perhaps some testing can be done or someone more familiar with X11 can offer a solution?

@Osspial
Copy link
Contributor

Osspial commented Jun 25, 2019

Looking at the docs, removing XFlush shouldn't cause any harm. @vberger do you know any reason that call was there in the first place?

murarth added a commit to murarth/winit that referenced this issue Jun 25, 2019
Internally, `XFlush` calls `_XSend` to write data. It then calls
`XEventsQueued(display, QueuedAfterReading)`, which reads data from the
X server connection. This prevents the event loop source callback from
being run, as there is no longer data waiting on the socket.

Ideally, we would want to call `_XSend` directly to ensure that no
output is buffered by Xlib. However, this function is not exported as
part of Xlib's public API.

Testing with the `XFlush` call removed does not appear to adversely
affect the performance of an application. If any bugs should eventually
arise from this change, perhaps another function may be used in place of
`XFlush`, such as `XPending`, which writes buffered output but does not
so aggressively read from the X server connection.

Closes rust-windowing#865
@elinorbgr
Copy link
Contributor

There was already a call to XFlush in the previous code (see here), so I tried to port it at the place where it made most sense. 🤷‍♂️

I did it with my instinct from working on Wayland though, where the reasoning is that in general you should flush the outgoing socket before waiting for events to make sure the server has received all your requests and will process them.

goddessfreya pushed a commit that referenced this issue Jun 25, 2019
Internally, `XFlush` calls `_XSend` to write data. It then calls
`XEventsQueued(display, QueuedAfterReading)`, which reads data from the
X server connection. This prevents the event loop source callback from
being run, as there is no longer data waiting on the socket.

Ideally, we would want to call `_XSend` directly to ensure that no
output is buffered by Xlib. However, this function is not exported as
part of Xlib's public API.

Testing with the `XFlush` call removed does not appear to adversely
affect the performance of an application. If any bugs should eventually
arise from this change, perhaps another function may be used in place of
`XFlush`, such as `XPending`, which writes buffered output but does not
so aggressively read from the X server connection.

Closes #865
@Osspial Osspial unpinned this issue Jun 26, 2019
kosyak pushed a commit to kosyak/winit that referenced this issue Jul 10, 2019
Internally, `XFlush` calls `_XSend` to write data. It then calls
`XEventsQueued(display, QueuedAfterReading)`, which reads data from the
X server connection. This prevents the event loop source callback from
being run, as there is no longer data waiting on the socket.

Ideally, we would want to call `_XSend` directly to ensure that no
output is buffered by Xlib. However, this function is not exported as
part of Xlib's public API.

Testing with the `XFlush` call removed does not appear to adversely
affect the performance of an application. If any bugs should eventually
arise from this change, perhaps another function may be used in place of
`XFlush`, such as `XPending`, which writes buffered output but does not
so aggressively read from the X server connection.

Closes rust-windowing#865
@Osspial Osspial removed the I - BLOCKING RELEASE Prevents a new release label Jul 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B - bug Dang, that shouldn't have happened C - needs investigation Issue must be confirmed and researched DS - x11 P - high Vital to have
Development

Successfully merging a pull request may close this issue.

6 participants