Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librvvm: Dynamic unloading doesn't deinit #97

Closed
LekKit opened this issue May 2, 2023 · 9 comments · Fixed by #102
Closed

librvvm: Dynamic unloading doesn't deinit #97

LekKit opened this issue May 2, 2023 · 9 comments · Fixed by #102
Assignees
Labels
bug Something isn't working documentation Improvements or additions to documentation

Comments

@LekKit
Copy link
Owner

LekKit commented May 2, 2023

The issue

  • When using shared librvvm, unloading it leaves some global state initialized
  • This may cause segfaults from thread workers kept running, leaks, etc

Steps to reproduce

  • Run Retroarch with libretro port of RVVM (See Add libretro core #96), I.e. load the lib
  • Do some IO workloads, causing thread workers to run
  • Click close content, which unloads the shared lib
  • Segfault

Investigation

  • Global data is deinitialized via use of atexit(), which is fine for standalone VM but not for a lib

Workarounds

  • Omitting unloading the lib
  • Waiting for a while after VM destruction so that thread workers exit

Suggested fix

  • Implement rvvm_at_deinit() callbacks instead of atexit()
  • Expose rvvm_deinit() API that is called before unloading the lib
  • Perhaps mark it with GNU attributes for automatic deinit
@LekKit LekKit added bug Something isn't working documentation Improvements or additions to documentation labels May 2, 2023
@LekKit LekKit self-assigned this May 2, 2023
@LekKit
Copy link
Owner Author

LekKit commented Jun 19, 2023

Implemented call_at_deinit() and full_deinit(), and moved all the infrastructure to use it.
On GNU-compatible compilers (Clang, GCC, other LLVM-based) it should deinit automatically on library unload.

@iyzsong can you please verify it doesn't segfault anymore?

@ghost
Copy link

ghost commented Jun 21, 2023

well, it still segfault. full_deinit and thread_workers_terminate got called, but the worker threads are still running when it returned. if i understand correctly, in current code a worker thread doesn't terminate until THREAD_MAX_WORKER_IDLE timeout, and thread_workers_terminate doesn't wait it now (should be done by pthread_join or atomic flag check)...

@LekKit
Copy link
Owner Author

LekKit commented Jun 21, 2023

if i understand correctly, in current code a worker thread doesn't terminate

Ah sorry, makes sense. My local threadpool implementation is different, so I should upstream it soon and that's gonna be it

@LekKit
Copy link
Owner Author

LekKit commented Jun 21, 2023

Should be fixed now, the threadpool is properly joined on deinit.

@ghost
Copy link

ghost commented Jun 22, 2023

Well, first "Close Content" does deinit correctly, but then load a rvvm file and close again will segfault, i think that's a RetroArch issue..

@LekKit
Copy link
Owner Author

LekKit commented Jun 22, 2023

I wrote a simple program that explicitly loads librvvm, runs a VM an unloads it (all done in a loop).
It will segfault if the machine is still running (I.e. rvvm_pause_machine() or rvvm_free_machine() wasn't called). Otherwise it is fine.

UPD: introduced a deinit hook that reaps every running machine and prints a warn. I don't know any other way that could cause a segv for now.

UPD: need to rewrite eventloop a bit to be dlclose-safe

@LekKit
Copy link
Owner Author

LekKit commented Jun 22, 2023

Implemented proper eventloop termination in 7ef1adc. There are no places where thread_detach() is used anymore.
There are some serious race issues if you keep running machines around till the actual deinit, but it is undefined from the API standpoint (And raises a warning if that ever happens).

Is it working OK now?

@ghost
Copy link

ghost commented Jun 23, 2023

Yes, it's working, before and after commit 7ef1adc.

My previous issue was due to the second load by "History" will load a different rvvm_libretro.so than the latest built one... Its shutdown sequence rvvm_run_eventloop -> rvvm_free_machine -> full_deinit (auto called by DEINIT_ATTR) was correct (with a seperated vm_thread, which can be discarded now).

@LekKit
Copy link
Owner Author

LekKit commented Jun 23, 2023

Yes, it's working, before and after commit 7ef1adc

That commit fixes a race condition where the eventloop might run for a few spare moments after all machines have been shut down. I.e. without that commit there might be still a rare segv occurence, but now all is fixed.

Thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working documentation Improvements or additions to documentation
Projects
Status: Testing
Development

Successfully merging a pull request may close this issue.

1 participant