Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gtk3 applications crash when trying to open a GtkMenu #210

Closed
jplatte opened this issue Nov 27, 2016 · 43 comments
Closed

gtk3 applications crash when trying to open a GtkMenu #210

jplatte opened this issue Nov 27, 2016 · 43 comments

Comments

@jplatte
Copy link

jplatte commented Nov 27, 2016

So after first submitting this issue on sway, I have tried and succeeded in reproducing it with orbment:
When I open a menu in a gtk3 application (e.g. right click in a text view, pressing the hamburger button in Epiphany), the application terminates. This only happens with GtkMenus, not with the more recently added GtkPopovers. When using the gtk3 package from the official arch repositories, these crashes look like this:

(gedit:10785): Gdk-ERROR **: Error flushing display: Broken pipe
[1]    10785 trace trap (core dumped)  LC_ALL=C gedit

Weirdly, when using gtk3 with the typeahead patch (gtk3-typeahead in AUR) I don't get a core dump, and the error message looks different:

# LANG=en_US.utf8
Gdk-WARNING **: Lost connection to Wayland compositor.

# LANG=de_DE.utf8
Gdk-WARNING **: Error 71 (Protokollfehler) dispatching to Wayland display.

I'm about to rebuild the official gtk3 package with debug symbols to hopefully obtain a stack trace.

@jplatte
Copy link
Author

jplatte commented Dec 4, 2016

So I've now tried the same thing with a self-built version of gtk3 without patches, with debugging symbols enabled, and have managed to get a stack trace from one of the Broken pipe error cases (I got the other one as well a few times). This is a stack trace I got from gedit:

#0  0x00007ffff4767ff1 in  () at /usr/lib/libglib-2.0.so.0
#1  0x00007ffff476a731 in g_log_writer_default () at /usr/lib/libglib-2.0.so.0
#2  0x00007ffff4768b8c in g_log_structured_array () at /usr/lib/libglib-2.0.so.0
#3  0x00007ffff4768e89 in g_log_structured () at /usr/lib/libglib-2.0.so.0
#4  0x00007ffff58cd7a7 in gdk_event_source_prepare (base=0x67db50, timeout=<optimized out>) at gdkeventsource.c:66
#5  0x00007ffff4761c89 in g_main_context_prepare () at /usr/lib/libglib-2.0.so.0
#6  0x00007ffff47626ab in  () at /usr/lib/libglib-2.0.so.0
#7  0x00007ffff476289c in g_main_context_iteration () at /usr/lib/libglib-2.0.so.0
#8  0x00007ffff780654d in g_application_run () at /usr/lib/libgio-2.0.so.0
#9  0x0000000000400c2a in main ()

@jplatte
Copy link
Author

jplatte commented Dec 4, 2016

Here's the gdk_event_soure_prepare function from my local version of the source, with the line of the error message marked:

  static gboolean
  gdk_event_source_prepare (GSource *base,
                            gint    *timeout)
  {
    GdkWaylandEventSource *source = (GdkWaylandEventSource *) base;
    GdkWaylandDisplay *display = (GdkWaylandDisplay *) source->display;

    *timeout = -1;

    if (source->display->event_pause_count > 0)
      return _gdk_event_queue_find_first (source->display) != NULL;

    /* We have to add/remove the GPollFD if we want to update our
     * poll event mask dynamically.  Instead, let's just flush all
     * write on idle instead, which is what this amounts to.
     */

    if (_gdk_event_queue_find_first (source->display) != NULL)
      return TRUE;

    /* wl_display_prepare_read() needs to be balanced with either
     * wl_display_read_events() or wl_display_cancel_read()
     * (in gdk_event_source_check() */
    if (source->reading)
      return FALSE;

    /* if prepare_read() returns non-zero, there are events to be dispatched */
    if (wl_display_prepare_read (display->wl_display) != 0)
      return TRUE;
    source->reading = TRUE;

    if (wl_display_flush (display->wl_display) < 0)
*     g_error ("Error flushing display: %s", g_strerror (errno));

    return FALSE;
  }

@jplatte
Copy link
Author

jplatte commented Dec 9, 2016

@Cloudef I really don't know much about window managers and such, but I guess I could try debugging this.. Any pointers on where to start? (I made sure this doesn't happen with gnome-shell, so it seems pretty clear that the next step is actually debugging a window manager build with wlc)

@Cloudef
Copy link
Owner

Cloudef commented Dec 9, 2016

The problem is most likely that xdg is not fully implemented in wlc. For example it's lacking the positioner interfaces. wlc should log about unimplemented functions.

@jplatte
Copy link
Author

jplatte commented Dec 9, 2016

Ah yeah. Ran orbment with logging, and got

[17:16:09.192] (WARN) wlc: xdg_cb_create_positioner @ line 140 is not implemented

in the log file. I'll look through the code and see if I can't implement this. Gonna be quite a bit of spec-reading I imagine..

@Drakulix
Copy link
Contributor

Drakulix commented Jan 3, 2017

@Cloudef
Because this issue is really annoying me in my daily use, I would like to try to tackle it.

I browsed through the source code and got some questions:

  • How would you like to see the positioner getting exposed by the api? It seems like it should almost be a new handle type. Alternatively don't expose it and just respect it's settings in the set_geometry call or stub it so it does not crash anymore.
  • Would the positioner be a new resource in src/resources/types? How would I allocate it in the create_positioner call? Looking at the rest of the code, I just use wlc_resource_create and wlc_resource_implement, right? Implementing it then seems pretty straight-forward.
  • Anything I am missing?

@Cloudef
Copy link
Owner

Cloudef commented Jan 4, 2017

@Drakulix Will get on this bit later today. I need to remind myself on the positioner details before I can answer well.

@Cloudef
Copy link
Owner

Cloudef commented Jan 7, 2017

@Drakulix Sorry for late reply:

How would you like to see the positioner getting exposed by the api? It seems like it should almost be a new handle type. Alternatively don't expose it and just respect it's settings in the set_geometry call or stub it so it does not crash anymore.

If we can keep it out of api that would be great.

Would the positioner be a new resource in src/resources/types? How would I allocate it in the create_positioner call?

Yes, you'll put the implementation of positioner methods there.

Looking at the rest of the code, I just use wlc_resource_create and wlc_resource_implement, right? Implementing it then seems pretty straight-forward.

xdg-shell.c will be the parent that calls the wlc_resource_create and holds the positioner object. You need to add new wlc_source for positioners in xdg_shell's struct, and initialize it in wlc_xdg_shell function in xdg-shell.c, the positioner resources will be then tracked and constructor / destructors (if any) called automatically when resources are created / destroyed from that source.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Hi there.

Funny thing about this positioner thing. I implemented it, just with implementation that creates positioner that stores nothing so far, and I found that GTK apps are crashing anyway:

zxdg_popup_v6@29: error 1: xdg_cb_popup_grab @ line 27 is not implemented

Is there any xdg_cb_popup_grab implementation around?

kozec added a commit to kozec/wlc that referenced this issue Jan 9, 2017
@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

Nope, no popups grabs, more things that needs to be implemented. Anything that creates a wayland resource can't be simply stubbed (they at least need dummy resource) or the client will indeed abort.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Actually, just commenting out stub in xdg_cb_popup_grab "fixes" weston-terminal. Gnome-terminal menu is automatically dismissed, but at least it doesn't crash.

Looks like I need to find what's popup grab supposed to do.

//edit:

Anything that creates a wayland resource can't be simply stubbed (they at least need dummy resource)

Implementation above creates dummy resource.

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

Actually, just commenting out stub in xdg_cb_popup_grab "fixes" weston-terminal. Gnome-terminal menu is automatically dismissed, but at least it doesn't crash.

Yeah "stub" is something that throws wayland protocol error. There is another macro "stubl" which just logs warning but doesn't throw protocol error.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Funny thing is, that entire xdg_cb_popup_grab thing looks like reimplementation of X11 bug where application can basically disable user's keyboard, including A-C-Backspace and A-C-Del -_-

It also doesn't looks like compositor can ignore that request. Only other option is to dismiss popup automatically.

What are you doing in similar cases usually? :D

kozec added a commit to kozec/wlc that referenced this issue Jan 9, 2017
@Drakulix
Copy link
Contributor

Drakulix commented Jan 9, 2017

It also doesn't looks like compositor can ignore that request. Only other option is to dismiss popup automatically.

What happens if you do neither in the implementation?

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

What happens if you do neither in the implementation?

GtkMenu and GtkPopup flashes on screen for a moment and then disappears forever. I can control them blidnly using keyboard, but they are not drawn anywhere.

Of course, there is no error from GTK anywhere...

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

It also doesn't looks like compositor can ignore that request. Only other option is to dismiss popup automatically.

Think that is just weston specific behaviour. It won't allow multible grabs. (Probably consistend with X11 perhaps? Remember the bug where screen lock / screensaver won't trigger if popup is open)

GtkMenu and GtkPopup flashes on screen for a moment and then disappears forever. I can control them blidnly using keyboard, but they are not drawn anywhere.

May also be bug in wlc somewhere.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Think that is just weston specific behaviour. It won't allow multible grabs. (Probably consistend with X11 perhaps? Remember the bug where screen lock / screensaver won't trigger if popup is open)

That behavior is actually defined in xdg-shell-unstable-v6.xml. "If the compositor denies the grab, the popup will be immediately dismissed."

This may be pretty big problem, because while allowing multiple grabs may not be necessary, GTK-based stuff clearly depends on possibility to have at least one.

// edit: I caputred some log from gnome-text-editor exposing same behavior.

http://pastebin.com/BgweN4rc

Maybe problem is really caused only by not having positioner (and setting size) implemented?

// edit: Yep,

LogType.WARN xdg_cb_popup_grab @ line 27 is not implemented
zxdg_popup_v6_send_configure 0,0 0x0
zxdg_popup_v6_send_configure 0,0 183x243
on_view_request_geometry 4 -> (0, 0) :: (171L, 231L)
zxdg_popup_v6_send_configure 100,100 171x231
zxdg_popup_v6_send_configure 100,100 171x231
on_view_request_geometry 4 -> (100, 100) :: (0L, 0L)
zxdg_popup_v6_send_configure 100,100 0x0
zxdg_popup_v6_send_configure 100,100 0x0

Now only if I knew where that 0x0 size comes from...

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

That behavior is actually defined in xdg-shell-unstable-v6.xml. "If the compositor denies the grab, the popup will be immediately dismissed."

It's still up to compositor to decide what denying a grab means though?

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

It's still up to compositor to decide what denying a grab means though?

May be, but it will block user from using menus in any case :D

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Ok, I got menus to behave simply by restricting minimal view size, so I believe that respecting data in positioner should fix the problem.

But I'm wondering about keeping it from API discussed above. Shouldn't positioner data be something available to request_geometry_cb? Otherwise it will be next to impossible to keep menus and popups displayed on place where user expects them.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Sooo... http://i.imgur.com/iJVmx8S.png

I added two APIs, wlc_view_get_requested_size and wlc_view_get_requested_anchor_rect. Positioner now remembers everything, but rest is not accessible so far.

There is just one problem left. zxdg_popup_v6_send_configure is called before request_geometry_cb, what breaks GTK (again). It means that menu gets zxdg_popup_v6_send_configure with zero size first and configured, correct sizes later, but GTK apparently uses only first recieved values.

I added one ugly hack to prevent that, but someone with better idea about how are callbacks called may be able to fix problem entirely by calling request_geometry_cb first.

@Drakulix
Copy link
Contributor

Drakulix commented Jan 9, 2017

I added two APIs, wlc_view_get_requested_size and wlc_view_get_requested_anchor_rect. Positioner now remembers everything, but rest is not accessible so far.

What exactly is requested size? Maybe we should expose min and max_size instead. If those are identical the requested size is quite obvious for the compositor.

I would call wlc_view_get_requested_anchor_rect just wlc_view_get_anchor_rect. The protocol makes it a hard constraint, its just up to the compositor to ignore that.

The rest should also be accessible to make correct positioning possible.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

What exactly is requested size? Maybe we should expose min and max_size instead. If those are identical the requested size is quite obvious for the compositor.

requested size is what positioner.set_size defines in specs. I don't think there is any min and max size for positioner. Basically, it's size that client expect to get, menu window size in case I'm working with.

I would call wlc_view_get_requested_anchor_rect just wlc_view_get_anchor_rect. The protocol makes it a hard constraint, its just up to the compositor to ignore that.

No problem with that, but positioner contains "requests" for position, size, offset, gravity, anchor and constraint_adjustments, so I thought having common prefix for all of that would look less messy. I'm sure that at least view_get_size and view_get_position already exists.

By the way, how are you exposing wayland enums in API? What should return wlc_view_get_anchor, internally represented as zxdg_positioner_v6_anchor ?

@Drakulix
Copy link
Contributor

Drakulix commented Jan 9, 2017

No problem with that, but positioner contains "requests" for position, size, offset, gravity, anchor and constraint_adjustments

Good point. The name just clutters with the wlc_view_requests_geometry_cb and the other callbacks a like.
So I think those should be named differently

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Ok, how about wlc_view_get_positioner_anchor_rect, wlc_view_get_positioner_size etc?

There is also possibility to create just wlc_view_get_positioner_data and return this entire thing.

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

That is a contradiction to calling my idea to be good, because that would mean the compositor is allowed to place the popup anyway, including outside of its parent.

You are right, I misunderstood your original post as the anchor rect getter being more like request. Should have read the first words more properly.

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

By the way, how are you exposing wayland enums in API? What should return wlc_view_get_anchor, internally represented as zxdg_positioner_v6_anchor ?

The enums are wrapped in wlc. Only time developer has to know anything about the wayland headers (or the extensions) is when he/she is about to use the wlc-wayland.h and implement wayland extension.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

The enums are wrapped in wlc. Only time developer has to know anything about the wayland headers (or the extensions) is when he/she is about to use the wlc-wayland.h and implement wayland extension.

Heh. This is going to be fun.

Okay, and is there any problem with having one getter for entire positioner structure?

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

Okay, and is there any problem with having one getter for entire positioner structure?

Well, if you want one, then that would be ABI stability. If the struct contents change often it's going to cause version incompatiblity.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

If the struct contents change often

What's something that depends on changes in Wayland. Got it :(

@Drakulix
Copy link
Contributor

Drakulix commented Jan 9, 2017

Well, isn't the API expected to be adjusted to breaking protocol changes anyway? Keeping backwards compatibility at any price for unstable protocols is not helpful either.

Also additions to the struct are not breaking changes. Like the xdg-shell v6 changes, that mostly introduced new features.

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

Just saying, wlc isn't marked as stable either. Whether you want to keep the api opaque or not is up to you.

@ddevault
Copy link
Collaborator

ddevault commented Jan 9, 2017

For the sake of usability it makes sense to support both at the same time imo. And I don't imagine that codewise that would be too terribly difficult tbh.

@ddevault
Copy link
Collaborator

ddevault commented Jan 9, 2017

Only because not everyone has switched to v6, by the way. I would wager about half are still on 5.

@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

To make that decision easier, I already have all wlc_view_positioner_get_* implemented and now I'm solving fact that for compositor to be able to compute position, wlc needs to store popup parent.

kozec added a commit to kozec/wlc that referenced this issue Jan 9, 2017
kozec added a commit to kozec/wlc that referenced this issue Jan 9, 2017
@kozec
Copy link
Contributor

kozec commented Jan 9, 2017

Here. With above, GTK menus are shown without crashing and compositor can display them correctly positioned.

@Cloudef
Copy link
Owner

Cloudef commented Jan 9, 2017

Views already have parent / child relationship, do the popups really need own?

@kozec
Copy link
Contributor

kozec commented Jan 10, 2017

Views already have parent / child relationship, do the popups really need own?

Nope, I was not aware of that.

//edit: Anyway, that only means that last commit is not needed, just using view_get_parent works. I can tidy rest up and make PR.

@Fale
Copy link

Fale commented Jan 15, 2017

A Fedora user is experiencing the same bug as for https://bugzilla.redhat.com/show_bug.cgi?id=1410900. (I'm reporting this to keep all the info about this linked)

@kamiyaa
Copy link

kamiyaa commented Jan 15, 2017

I'm having the same issue with gtk+ 3.22.x. Downgrading to gtk+ 3.20.x seems to fix the problem, though gtk+ 3.20 has problems of its own as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants