Unloading Rust-made SOs can lead to segfaults in host program #52138

Closed
SoniEx2 opened this issue Jul 7, 2018 · 12 comments

@SoniEx2
Contributor

SoniEx2 commented Jul 7, 2018

Steps to reproduce:

  1. Have two versions of libplugin_written_in_rust.so (rust always adds that "lib" prefix for some reason)
  2. Load one of them
  3. Unload it
  4. Load the other
  5. ???
  6. Segfault

I don't seem to get the crashes if I have two different plugins under different names.

This is an issue for the hexchat-plugin crate, and (probably) anything that lets you write plugins in Rust. (rlua? I haven't entirely tested this)

(No, this isn't an issue with hexchat's module system. Hexchat's module system is basically just dlopen and dlclose, at least on Linux.)
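
The load, unload, load cycle from the steps above can be sketched as a tiny host. This is only an illustration, not HexChat's code: `load_then_unload` and `reload_cycle` are hypothetical names, the paths are whatever two builds of the plugin you have, and the `RTLD_NOW` value is the Linux/glibc one.

```rust
use std::ffi::{c_char, c_int, c_void, CStr, CString};

// Declarations for the dynamic-loading API from <dlfcn.h>.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlclose(handle: *mut c_void) -> c_int;
    fn dlerror() -> *mut c_char;
}

// Value of RTLD_NOW on Linux/glibc (assumption: check <dlfcn.h> elsewhere).
const RTLD_NOW: c_int = 2;

/// Fetch the loader's most recent error message, if any.
fn last_dl_error() -> String {
    unsafe {
        let msg = dlerror();
        if msg.is_null() {
            "unknown dynamic loader error".to_string()
        } else {
            CStr::from_ptr(msg).to_string_lossy().into_owned()
        }
    }
}

/// Load the shared object at `path`, then unload it again.
pub fn load_then_unload(path: &str) -> Result<(), String> {
    let c_path = CString::new(path).map_err(|e| e.to_string())?;
    unsafe {
        let handle = dlopen(c_path.as_ptr(), RTLD_NOW);
        if handle.is_null() {
            return Err(last_dl_error());
        }
        if dlclose(handle) != 0 {
            return Err(last_dl_error());
        }
    }
    Ok(())
}

/// Steps 2 to 4: load one build, unload it, then load the other build
/// (which is also unloaded again at the end).
pub fn reload_cycle(first: &str, second: &str) -> Result<(), String> {
    load_then_unload(first)?;
    load_then_unload(second)
}
```

Calling `reload_cycle` with the paths of two different builds of the same plugin mirrors the steps; a missing file just yields an `Err` with the loader's message.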

(No, this shouldn't be dismissed as "unsafe and dangerous"/"unsupported", because if you have heap-allocated globals in your rust code, all you're gonna get is some memory/resource leaks, which aren't considered unsafe. This really shouldn't be segfaulting.)
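
A minimal sketch of what "heap-allocated globals" could look like in a plugin, assuming the default system allocator. `HISTORY` and `remember` are made-up names. Rust never runs `Drop` for statics, so if the library is unloaded the `Vec`'s heap buffer is never freed, which is a leak rather than a memory-safety problem.

```rust
use std::sync::{Mutex, OnceLock};

// Hypothetical plugin-wide state. The Vec's buffer lives on the process heap,
// not inside the shared object's own data segment.
static HISTORY: OnceLock<Mutex<Vec<String>>> = OnceLock::new();

/// Append a line to the global history and return the new length.
pub fn remember(line: &str) -> usize {
    let history = HISTORY.get_or_init(|| Mutex::new(Vec::new()));
    let mut guard = history.lock().unwrap();
    guard.push(line.to_owned());
    guard.len()
}
```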

$ rustc --version --verbose
rustc 1.27.0
binary: rustc
commit-hash: unknown
commit-date: unknown
host: x86_64-unknown-linux-gnu
release: 1.27.0
LLVM version: 6.0

(Note: this has been a thing for a while, it's not new in 1.27.0.)

Other (important?) things to note:

  • Only happens if code is changed. Changing strings doesn't seem to trigger the issue (or I was just very (un)lucky), though optimizations might still have an effect, for example dead code elimination based on a constant.
  • Happens in both debug and release builds.

@Mark-Simulacrum
Member

Could we get a backtrace from the segfault? It would also be helpful to have a minimal repository or steps to follow so that we can reproduce this ourselves.

Also, the rustc you're using looks somewhat odd -- I would expect something like "rustc 1.27.0 (3eda71b 2018-06-19)" for the first line, is that from a distro or locally built? (Not to say this is the cause, it probably isn't)

@SoniEx2
Contributor Author

SoniEx2 commented Jul 7, 2018

I am on Arch Linux x86_64, using the Rust and HexChat packages from the distro.

@ishitatsuyuki
Contributor

Are you sure that you have no pointers to code inside the first DLL when unloading? Vtables and function tables tend to hold dangling references that are hard to notice.
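
To illustrate the kind of bookkeeping this implies, here is a rough sketch of a host-side callback table that forgets everything a plugin registered before the plugin is unloaded. `HookTable` and its methods are hypothetical and are not taken from HexChat or any crate; plain `fn` pointers stand in for pointers into the plugin's code.

```rust
/// Hypothetical host-side table of callbacks registered by plugins.
pub struct HookTable {
    // (owning plugin id, pointer to code that lives inside that plugin)
    hooks: Vec<(u32, fn() -> i32)>,
}

impl HookTable {
    pub fn new() -> Self {
        HookTable { hooks: Vec::new() }
    }

    /// Record a callback on behalf of `plugin_id`.
    pub fn register(&mut self, plugin_id: u32, callback: fn() -> i32) {
        self.hooks.push((plugin_id, callback));
    }

    /// Forget every callback owned by `plugin_id` and return how many were
    /// removed. This has to happen before the plugin's code is unmapped,
    /// otherwise the remaining pointers dangle.
    pub fn remove_plugin(&mut self, plugin_id: u32) -> usize {
        let before = self.hooks.len();
        self.hooks.retain(|(owner, _)| *owner != plugin_id);
        before - self.hooks.len()
    }

    /// Invoke every callback that is still registered, in order.
    pub fn run_all(&self) -> Vec<i32> {
        self.hooks.iter().map(|(_, callback)| callback()).collect()
    }
}
```

The same concern applies to trait objects handed to the host, since their vtables also live inside the plugin.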

@nagisa
Member

nagisa commented Jul 8, 2018

This is a commonly reported issue against libloading. The issue is usually related to thread locals.
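
For context, a minimal sketch of the kind of thread local meant here, with made-up names (`SCRATCH`, `record`). Because `Vec` needs `Drop`, the standard library arranges for a per-thread destructor to run when the thread exits; how that interacts with unloading is platform-specific, and nothing in this thread establishes that the plugin in question does this.

```rust
use std::cell::RefCell;

thread_local! {
    // A thread local that owns a heap allocation. Its destructor is code
    // that lives inside the shared object defining it.
    static SCRATCH: RefCell<Vec<u8>> = RefCell::new(Vec::new());
}

/// Push a byte into this thread's scratch buffer and return its new length.
/// The first call on a thread initializes the thread local.
pub fn record(byte: u8) -> usize {
    SCRATCH.with(|buf| {
        let mut buf = buf.borrow_mut();
        buf.push(byte);
        buf.len()
    })
}
```

Note that thread locals can be involved even in a single-threaded host, since the main thread has its own copy too.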

@SoniEx2
Contributor Author

SoniEx2 commented Jul 8, 2018

I'm not using libloading anywhere. The host is using #include <dlfcn.h>.

@nagisa
Member

nagisa commented Jul 8, 2018

See #28794 for the OS X equivalent and nagisa/rust_libloading#41 for a report of such an issue occurring on a linux system.

> I'm not using libloading anywhere.

I did not say this is a libloading issue. I said it is commonly reported against libloading. And that this is related to thread locals.

@SoniEx2
Contributor Author

SoniEx2 commented Jul 8, 2018

Oh. Hexchat is single-threaded. This also doesn't happen with a test plugin I made in C.

@SoniEx2
Contributor Author

SoniEx2 commented Jul 8, 2018

Also note that reloading the same SO over and over never segfaults. I've tested it upwards of 20 times in a row and it didn't segfault.

@nagisa
Member

nagisa commented Jul 8, 2018

A backtrace would quickly tell what exactly is or is not a problem. Would be great if you obtained one.

@SoniEx2
Contributor Author

SoniEx2 commented Sep 24, 2018

Please keep your thread unsafety issues out of this issue.

@SoniEx2
Contributor Author

SoniEx2 commented Oct 15, 2018

I'm not sure if I've linked this stack trace anywhere, but:

Oct 15 16:31:39 soniex-pc systemd-coredump[17515]: Process 16744 (hexchat) of user 1000 dumped core.
                                                   
                                                   Stack trace of thread 16744:
                                                   #0  0x00007f812b97783a check_match (ld-linux-x86-64.so.2)
                                                   #1  0x00007f812b977d74 do_lookup_x (ld-linux-x86-64.so.2)
                                                   #2  0x00007f812b97860f _dl_lookup_symbol_x (ld-linux-x86-64.so.2)
                                                   #3  0x00007f812b88e85d do_sym (libc.so.6)
                                                   #4  0x00007f81294c22c9 n/a (libdl.so.2)
                                                   #5  0x00007f812b88ee77 _dl_catch_exception (libc.so.6)
                                                   #6  0x00007f812b88ef13 _dl_catch_error (libc.so.6)
                                                   #7  0x00007f81294c28bf n/a (libdl.so.2)
                                                   #8  0x00007f81294c2333 dlsym (libdl.so.2)
                                                   #9  0x00007f812b438304 g_module_symbol (libgmodule-2.0.so.0)
                                                   #10 0x000056472801843c plugin_load (hexchat)
                                                   #11 0x000056472801eb4d n/a (hexchat)
                                                   #12 0x0000564728018b91 handle_command (hexchat)
                                                   #13 0x0000564728019a4b handle_multiline (hexchat)
                                                   #14 0x00005647280584d4 mg_inputbox_cb (hexchat)
                                                   #15 0x00007f812b58f3d5 g_closure_invoke (libgobject-2.0.so.0)
                                                   #16 0x00007f812b57c195 n/a (libgobject-2.0.so.0)
                                                   #17 0x00007f812b5826e0 g_signal_emitv (libgobject-2.0.so.0)
                                                   #18 0x00007f812ae7287b n/a (libgtk-x11-2.0.so.0)
                                                   #19 0x00007f812ae72d62 n/a (libgtk-x11-2.0.so.0)
                                                   #20 0x00007f812ae73270 n/a (libgtk-x11-2.0.so.0)
                                                   #21 0x00007f812ae74148 gtk_bindings_activate_event (libgtk-x11-2.0.so.0)
                                                   #22 0x00007f812aeba421 n/a (libgtk-x11-2.0.so.0)
                                                   #23 0x00007f812af267cc n/a (libgtk-x11-2.0.so.0)
                                                   #24 0x00007f812b58f2d2 g_closure_invoke (libgobject-2.0.so.0)
                                                   #25 0x00007f812b57b99f n/a (libgobject-2.0.so.0)
                                                   #26 0x00007f812b57f5ed g_signal_emit_valist (libgobject-2.0.so.0)
                                                   #27 0x00007f812b580a80 g_signal_emit (libgobject-2.0.so.0)
                                                   #28 0x00007f812b041235 n/a (libgtk-x11-2.0.so.0)
                                                   #29 0x00007f812b0550d7 gtk_window_propagate_key_event (libgtk-x11-2.0.so.0)
                                                   #30 0x00007f812b057cbb n/a (libgtk-x11-2.0.so.0)
                                                   #31 0x00007f812af267cc n/a (libgtk-x11-2.0.so.0)
                                                   #32 0x00007f812b58f3d5 g_closure_invoke (libgobject-2.0.so.0)
                                                   #33 0x00007f812b57b99f n/a (libgobject-2.0.so.0)
                                                   #34 0x00007f812b57f5ed g_signal_emit_valist (libgobject-2.0.so.0)
                                                   #35 0x00007f812b580a80 g_signal_emit (libgobject-2.0.so.0)
                                                   #36 0x00007f812b041235 n/a (libgtk-x11-2.0.so.0)
                                                   #37 0x00007f812af24adf gtk_propagate_event (libgtk-x11-2.0.so.0)
                                                   #38 0x00007f812af24e43 gtk_main_do_event (libgtk-x11-2.0.so.0)
                                                   #39 0x00007f812ab9dd5e n/a (libgdk-x11-2.0.so.0)
                                                   #40 0x00007f812b4a83cf g_main_context_dispatch (libglib-2.0.so.0)
                                                   #41 0x00007f812b4a9f89 n/a (libglib-2.0.so.0)
                                                   #42 0x00007f812b4aaf62 g_main_loop_run (libglib-2.0.so.0)
                                                   #43 0x00007f812af23df3 gtk_main (libgtk-x11-2.0.so.0)
                                                   #44 0x000056472806992a fe_main (hexchat)
                                                   #45 0x000056472800281a main (hexchat)
                                                   #46 0x00007f812b77b223 __libc_start_main (libc.so.6)
                                                   #47 0x000056472800298e _start (hexchat)
                                                   
                                                   Stack trace of thread 16745:
                                                   #0  0x00007f812b847bb1 __poll (libc.so.6)
                                                   #1  0x00007f812b4a9ee0 n/a (libglib-2.0.so.0)
                                                   #2  0x00007f812b4a9fce g_main_context_iteration (libglib-2.0.so.0)
                                                   #3  0x00007f812b4aa022 n/a (libglib-2.0.so.0)
                                                   #4  0x00007f812b4733eb n/a (libglib-2.0.so.0)
                                                   #5  0x00007f812b922a9d start_thread (libpthread.so.0)
                                                   #6  0x00007f812b852a43 __clone (libc.so.6)
                                                   
                                                   Stack trace of thread 16746:
                                                   #0  0x00007f812b847bb1 __poll (libc.so.6)
                                                   #1  0x00007f812b4a9ee0 n/a (libglib-2.0.so.0)
                                                   #2  0x00007f812b4aaf62 g_main_loop_run (libglib-2.0.so.0)
                                                   #3  0x00007f812b60fc28 n/a (libgio-2.0.so.0)
                                                   #4  0x00007f812b4733eb n/a (libglib-2.0.so.0)
                                                   #5  0x00007f812b922a9d start_thread (libpthread.so.0)
                                                   #6  0x00007f812b852a43 __clone (libc.so.6)
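
The crashing thread is inside `dlsym` (reached through `g_module_symbol` from `plugin_load`), so the fault happens while looking up a symbol in a freshly loaded module. A rough stand-alone equivalent of that lookup, for experimenting outside HexChat, might look like this. `has_symbol` is a hypothetical helper, `RTLD_NOW` uses the Linux/glibc value, and the symbol name is whatever entry point the host expects.

```rust
use std::ffi::{c_char, c_int, c_void, CStr, CString};

// Declarations for the dynamic-loading API from <dlfcn.h>.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    fn dlclose(handle: *mut c_void) -> c_int;
    fn dlerror() -> *mut c_char;
}

// Value of RTLD_NOW on Linux/glibc (assumption: check <dlfcn.h> elsewhere).
const RTLD_NOW: c_int = 2;

/// Open `path`, look up `symbol`, close the library again, and report
/// whether the lookup returned a non-null address.
pub fn has_symbol(path: &str, symbol: &str) -> Result<bool, String> {
    let c_path = CString::new(path).map_err(|e| e.to_string())?;
    let c_symbol = CString::new(symbol).map_err(|e| e.to_string())?;
    unsafe {
        let handle = dlopen(c_path.as_ptr(), RTLD_NOW);
        if handle.is_null() {
            let msg = dlerror();
            return Err(if msg.is_null() {
                "dlopen failed".to_string()
            } else {
                CStr::from_ptr(msg).to_string_lossy().into_owned()
            });
        }
        // A null result is treated as "not found" here; strictly speaking a
        // symbol's value can be null, which dlerror() would disambiguate.
        let found = !dlsym(handle, c_symbol.as_ptr()).is_null();
        dlclose(handle);
        Ok(found)
    }
}
```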

System information:

$ rustc --version --verbose
rustc 1.29.0 (aa3ca1994 2018-09-11)
binary: rustc
commit-hash: aa3ca1994904f2e056679fce1f185db8c7ed2703
commit-date: 2018-09-11
host: x86_64-unknown-linux-gnu
release: 1.29.0
LLVM version: 7.0
$ hexchat -v
hexchat 2.14.2

@SoniEx2
Contributor Author

SoniEx2 commented Oct 28, 2018

it's definitely the linker

SoniEx2 closed this as completed Oct 28, 2018