Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on Linux Wayland #367

Closed
Tim-Paik opened this issue Dec 17, 2022 · 13 comments · Fixed by #369
Closed

Panic on Linux Wayland #367

Tim-Paik opened this issue Dec 17, 2022 · 13 comments · Fixed by #369

Comments

@Tim-Paik
Copy link

command:

RUST_LOG=warn RUST_BACKTRACE=1 cargo run --release --example counter

I extracted some logs that may be useful:

[2022-12-17T09:09:17Z WARN  wgpu_hal::vulkan::instance] Disabling presentation on 'Intel(R) UHD Graphics 630 (CFL GT2)' (id 0x564f61c4f9d0) because of NV Optimus (on Linux)
[2022-12-17T09:09:17Z WARN  wgpu_hal::gles::adapter] Detected skylake derivative running on mesa i915. Clears to srgb textures will use manual shader clears.
[2022-12-17T09:09:17Z WARN  naga::front::spv] Unknown decoration Block
[2022-12-17T09:09:17Z WARN  naga::front::spv] Unknown decoration RelaxedPrecision
[2022-12-17T09:09:17Z WARN  naga::front::spv] Unused item decorations: **Too many, omitted here**
[2022-12-17T09:09:17Z WARN  wgpu_hal::gles::egl] Re-initializing Gles context due to Wayland window
thread 'main' panicked at 'Error in Surface::configure: surface does not support the adapter's queue family', /home/********/.cargo/registry/src/mirrors.****.com/wgpu-0.14.2/src/backend/direct.rs:274:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: wgpu::backend::direct::Context::handle_error_fatal
   3: <wgpu::backend::direct::Context as wgpu::Context>::surface_configure
   4: kas_wgpu::window::Window<C,T>::new
   5: kas_wgpu::Toolkit<C,T>::add_boxed
   6: counter::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

System: Archlinux
DE: GNOME + Wayland

If anything else is needed, I'm happy to provide it

@dhardy
Copy link
Collaborator

dhardy commented Dec 17, 2022

Looks like wgpu failed. It requires some form of accelerated graphics support (see its readme for details; you may also try wgpu examples).

KAS doesn't currently have any alternative (see #287), so unless you can confirm that WGPU examples work on the same setup there's nothing to do here, sorry.

@dhardy dhardy closed this as completed Dec 17, 2022
@Tim-Paik
Copy link
Author

Tim-Paik commented Dec 17, 2022

wgpu

I'm well aware, so I just raised an issue in KAS after I tried the wgpu's example. I also tried to check the code of KAS, but I couldn't find any difference from the demo I wrote myself. If I have to say something different, it may be this:

let adapter = instance
            .request_adapter(&wgpu::RequestAdapterOptions {
                power_preference: wgpu::PowerPreference::default(),
                force_fallback_adapter: false,
                compatible_surface: Some(&surface),
            })
            .await
            .unwrap();

I set compatible_surface to Some(&surface) instead of None, because that would cause surface.get_supported_formats(&adapter) to be of length 0

@dhardy

@dhardy dhardy reopened this Dec 17, 2022
@dhardy
Copy link
Collaborator

dhardy commented Dec 17, 2022

Doing the same in KAS is awkward because the adapter is shared and created before any surface or window. It's not immediately clear how to solve this — delay construction of the adapter, or perhaps changes to wgpu are required?

@Tim-Paik
Copy link
Author

Because it panics at surface.configure, I’m actually not sure if it’s an error caused by compatible_surface (even though it does panic in my demo)
I tried a few things but couldn't solve this problem.
Perhaps, these things should not be shared? For example, the scale_factor of each screen may be different, the window should adapt to the scale_factor of the current screen?
I think having an Instance per window might be the right thing to do

@Tim-Paik
Copy link
Author

Tim-Paik commented Dec 17, 2022

Now that I've identified the culprit as compatible_surface: None, I reproduced it with a small demo:

fn main() {
    env_logger::init();
    pollster::block_on(run());
}

pub async fn run() {
    let event_loop = winit::event_loop::EventLoop::new();
    let window = winit::window::WindowBuilder::new()
        .build(&event_loop)
        .unwrap();
    let instance = wgpu::Instance::new(wgpu::Backends::all());
    let surface = unsafe { instance.create_surface(&window) };
    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions {
            compatible_surface: Some(&surface), /* Change this to None to cause Panic */
            ..Default::default()
        })
        .await
        .unwrap();
    let (device, _queue) = adapter
        .request_device(&wgpu::DeviceDescriptor::default(), None)
        .await
        .unwrap();

    let config = wgpu::SurfaceConfiguration {
        usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
        format: wgpu::TextureFormat::Rgba8UnormSrgb,
        width: 800,
        height: 600,
        present_mode: wgpu::PresentMode::Fifo,
        alpha_mode: wgpu::CompositeAlphaMode::Auto,
    };
    surface.configure(&device, &config);
}

@dhardy
Copy link
Collaborator

dhardy commented Dec 17, 2022

For example, the scale_factor of each screen may be different

It's not "shared", it's simply a hint of the last known value. The window's target size must be specified when constructing it, which can only be calculated with a known scale factor (unless you use Wayland's method of scaling up from an "unscaled size" which requires re-sizing and isn't quite accurate). Both options are rough hacks.

So it looks like your issue may be solvable by delaying request_adapter until the first window is needed — another hack.

@dhardy
Copy link
Collaborator

dhardy commented Dec 17, 2022

I guess there are two lessons out of this:

  1. Multiple adapters are often available (e.g. on my system, Vulkan/Radeon, Vulkan/LLVMpipe, GL/Radeon). instance.request_adapter is a helper to select an appropriate adapter (e.g. using power preferences), and as a convenience it can also filter by compatibility with a given surface. Ideally we'd either delay adapter choice until we have a surface (not easy in KAS) or re-write this logic with additional constraints, though this may be unnecessary (see second point).
  2. When constructing a surface, we should check adapter capabilities and potentially alter the surface (e.g. format, present mode). The present mode should in any case be selectable.

@Tim-Paik
Copy link
Author

Since I'm new to rust, if you solve this problem, please also tell me how, I'd like to learn!

@dhardy
Copy link
Collaborator

dhardy commented Dec 19, 2022

@Tim-Paik from what I understand:

  • you are using NVidia Optimus over an Intel(R) UHD Graphics 630
  • without a compatible_surface constraint, WGPU selects the Intel GPU (presumably because of the "low power" preference)
  • for some reason WGPU selects the GLES backend instead of Vulkan (possibly Vulkan is not well supported by this iGPU, or possibly it's just arbitrary?)
  • WGPU complains: "surface does not support the adapter's queue family"

I don't exactly understand what this message means; it could be due to the TextureUsages.

Could you please set compatible_surface: None then tweak the SurfaceConfiguration to find what is/isn't supported? Thanks.

@Tim-Paik
Copy link
Author

I do use a notebook with dual graphics cards, but I haven't installed software like optimus-manager.

I tried using other backends in my demo above, and it turns out that the reason for using GL is simple: everything else will panic

When I set KAS_BACKENDS to GL, things start to change:

timpaikx@timpaikx ~/t/kas (master) [SIGSEGV]> RUST_BACKTRACE=1 KAS_BACKENDS=GL cargo run --example hello
    Finished dev [unoptimized + debuginfo] target(s) in 0.18s
     Running `target/debug/examples/hello`
thread 'main' panicked at 'Error in Surface::configure: requested present mode Mailbox is not in the list of supported present modes: [Fifo]', /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/backend/direct.rs:274:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: wgpu::backend::direct::Context::handle_error_fatal
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/backend/direct.rs:274:9
   3: <wgpu::backend::direct::Context as wgpu::Context>::surface_configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/backend/direct.rs:1017:13
   4: wgpu::Surface::configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/lib.rs:3715:9
   5: kas_wgpu::window::Window<C,T>::new
             at ./crates/kas-wgpu/src/window.rs:144:9
   6: kas_wgpu::Toolkit<C,T>::add_boxed
             at ./crates/kas-wgpu/src/lib.rs:228:19
   7: kas_wgpu::Toolkit<C,T>::with
             at ./crates/kas-wgpu/src/lib.rs:221:9
   8: hello::main
             at ./examples/hello.rs:14:5
   9: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fish: Job 1, 'RUST_BACKTRACE=1 KAS_BACKENDS=G…' terminated by signal SIGSEGV (Address boundary error)

I tried to get it in my demo, in wgpu::PresentMode, AutoNoVsync/AutoVsync/Fifo can run normally (GL backend)

I tried changing this to Fifo:

present_mode: wgpu::PresentMode::Mailbox,

then

timpaikx@timpaikx ~/t/kas (master) [SIGSEGV]> RUST_BACKTRACE=1 KAS_BACKENDS=GL cargo run --example hello
    Finished dev [unoptimized + debuginfo] target(s) in 0.18s
     Running `target/debug/examples/hello`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: BadDisplay', /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:294:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::result::unwrap_failed
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1113:23
   4: wgpu_hal::gles::egl::EglContext::make_current
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:292:9
   5: wgpu_hal::gles::egl::AdapterContext::lock::{{closure}}
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:406:13
   6: core::option::Option<T>::map
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/option.rs:925:29
   7: wgpu_hal::gles::egl::AdapterContext::lock
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:405:19
   8: wgpu_hal::gles::egl::Surface::unconfigure_impl
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:1019:19
   9: <wgpu_hal::gles::egl::Surface as wgpu_hal::Surface<wgpu_hal::gles::Api>>::configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-hal-0.14.1/src/gles/egl.rs:1046:42
  10: wgpu_core::device::<impl wgpu_core::hub::Global<G>>::surface_configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-core-0.14.2/src/device/mod.rs:5220:17
  11: <wgpu::backend::direct::Context as wgpu::Context>::surface_configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/backend/direct.rs:1015:13
  12: wgpu::Surface::configure
             at /home/timpaikx/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.14.2/src/lib.rs:3715:9
  13: kas_wgpu::window::Window<C,T>::new
             at ./crates/kas-wgpu/src/window.rs:144:9
  14: kas_wgpu::Toolkit<C,T>::add_boxed
             at ./crates/kas-wgpu/src/lib.rs:228:19
  15: kas_wgpu::Toolkit<C,T>::with
             at ./crates/kas-wgpu/src/lib.rs:221:9
  16: hello::main
             at ./examples/hello.rs:14:5
  17: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fish: Job 1, 'RUST_BACKTRACE=1 KAS_BACKENDS=G…' terminated by signal SIGSEGV (Address boundary error)

@dhardy
Copy link
Collaborator

dhardy commented Dec 20, 2022

requested present mode Mailbox is not in the list of supported present modes: [Fifo]

I used Mailbox to work around a different bug, but probably Fifo is the better supported mode.

Result::unwrap()on anErr` value: BadDisplay

No idea what that is...


But, did you see my previous question about tweaking SurfaceConfiguration from your example above? I'd still like to know what does/doesn't work.

@Tim-Paik
Copy link
Author

Among other parameters, alpha_mode seems to only support [Opaque] (Auto) , format supports [Rgba8UnormSrgb, Bgra8UnormSrgb, Rgba16Float], usage only supports RENDER_ATTACHMENT.

If you want to use Mailbox, can AutoNoVsync solve that problem?

@dhardy
Copy link
Collaborator

dhardy commented Dec 21, 2022

Among other parameters, alpha_mode seems to only support [Opaque] (Auto) , format supports [Rgba8UnormSrgb, Bgra8UnormSrgb, Rgba16Float], usage only supports RENDER_ATTACHMENT.

These are what KAS uses anyway.

If you want to use Mailbox

This was selected to work around a bug; Fifo should work fine (aside from hang on exit after multiple windows are created on X11). So looks like this is the same issue as #368.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants