Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model version is corrupted on Windows #1303

Closed
ykomatsu opened this issue Nov 3, 2022 · 10 comments · Fixed by #1304 or #1308
Closed

Model version is corrupted on Windows #1303

ykomatsu opened this issue Nov 3, 2022 · 10 comments · Fixed by #1304 or #1308
Labels
type: bug Something isn't working

Comments

@ykomatsu
Copy link

ykomatsu commented Nov 3, 2022

version_pkg() panics on Windows.

On Windows 11 with Visual Studio 2022:

PS> fj-app
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Utf8Error { valid_up_to: 0, error_len: Some(1) }', crates\fj\src\version.rs:44:36
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\core\src\panicking.rs:142
   2: core::result::unwrap_failed
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\core\src\result.rs:1814
   3: fj::version::RawVersion::as_str
   4: fj_host::model::Model::evaluate
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'main' panicked at 'Expected channel to never disconnect', crates\fj-window\src\run.rs:61:29
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\core\src\panicking.rs:142
   2: winit::platform_impl::platform::event_loop::EventLoop<T>::run_return::{{closure}}
   3: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
   4: winit::platform_impl::platform::event_loop::runner::EventLoopRunner<T>::register_window
   5: winit::platform_impl::platform::event_loop::runner::EventLoopRunner<T>::move_state_to
   6: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
   7: winit::platform_impl::platform::event_loop::runner::EventLoopRunner<T>::catch_unwind
   8: winit::platform_impl::platform::event_loop::EventLoopThreadExecutor::execute_in_thread
   9: DispatchMessageW
  10: DispatchMessageW
  11: GetClassLongW
  12: KiUserCallbackDispatcher
  13: NtUserDispatchMessage
  14: DispatchMessageW
  15: winit::platform_impl::platform::event_loop::EventLoop<T>::run
  16: core::ptr::drop_in_place<core::option::Option<std::sync::mpsc::stream::Message<()>>>
  17: fj_window::run::run
  18: smallvec::SmallVec<A>::insert
  19: std::rt::lang_start::{{closure}}
  20: core::ops::function::impls::impl$2::call_once
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\core\src\ops\function.rs:280
  21: std::panicking::try::do_call
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:492
  22: std::panicking::try
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:456
  23: std::panic::catch_unwind
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panic.rs:137
  24: std::rt::lang_start_internal::closure$2
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\rt.rs:128
  25: std::panicking::try::do_call
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:492
  26: std::panicking::try
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panicking.rs:456
  27: std::panic::catch_unwind
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\panic.rs:137
  28: std::rt::lang_start_internal
             at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library\std\src\rt.rs:128
  29: main
  30: invoke_main
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  31: __scrt_common_main_seh
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  32: BaseThreadInitThunk
  33: RtlUserThreadStart
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
@hannobraun
Copy link
Owner

Thank you for opening this issue, @ykomatsu! I will look into it.

@hannobraun
Copy link
Owner

Hey @ykomatsu, I've merged a pull request that removes the source of this panic (#1304). I'd appreciate, if you could give this another try. I'm fairly certain that the panic is fixed, but I don't fully understand why it happened in the first place. There might be something else broken now.

@ykomatsu
Copy link
Author

ykomatsu commented Nov 3, 2022

@hannobraun Thanks!
The panic is fixed.
But version_pkg() fails to get the correct version string:

fornjot-version

ptr in RawVersion seems to be point a wrong address.
Is this issue Windows-specific?
fj-app works fine on macOS Ventura.

@hannobraun
Copy link
Owner

Thanks for the update, @ykomatsu!

The panic is fixed.
But version_pkg() fails to get the correct version string:

Okay, I feared something like that. The cause of the panic was that the version was not valid UTF-8. Now it no longer fails in that case, but replaces any of the non-UTF-8 with placeholder characters, which is what we're seeing in that error.

ptr in RawVersion seems to be point a wrong address.

Maybe! But then I wonder, why does this seem to only happen on Windows?

Another possibility is that something is going wrong with the environment variables that are supposed to transport the version during compilation.

Is this issue Windows-specific?
fj-app works fine on macOS Ventura.

Yes, this seems to be Windows-specific. It works fine on Linux too.


Could you try to compile the code that handles the version on Windows, with extra debug output enabled? If you clone the Fornjot repository, you can run this from the repository root:

cargo build -vv -p fj

On my system, this includes the following lines of output from the build script of the fj crate:

[fj 0.22.0] cargo:rerun-if-changed=/home/hanno/Projects/main/fornjot/.git/HEAD
[fj 0.22.0] cargo:rerun-if-changed=/home/hanno/Projects/main/fornjot/.git/refs/heads/version
[fj 0.22.0] cargo:rerun-if-env-changed=FJ_OFFICIAL_RELEASE
[fj 0.22.0] cargo:rustc-env=FJ_VERSION_PKG=0.22.0
[fj 0.22.0] cargo:rustc-env=FJ_VERSION_FULL=0.22.0 (eae922bdd, unreleased)
[fj 0.22.0] cargo:rerun-if-changed=Cargo.toml

I wonder what that output is on Windows.

@hannobraun hannobraun reopened this Nov 3, 2022
@hannobraun hannobraun changed the title version_pkg() panics on Windows Model version is corrupted on Windows Nov 3, 2022
@ykomatsu
Copy link
Author

ykomatsu commented Nov 3, 2022

[fj 0.22.0] cargo:rerun-if-changed=C:\Users\ykoma\Repositories\Fornjot\.git\HEAD
[fj 0.22.0] cargo:rerun-if-changed=C:\Users\ykoma\Repositories\Fornjot\.git\refs/heads/main
[fj 0.22.0] cargo:rerun-if-env-changed=FJ_OFFICIAL_RELEASE
[fj 0.22.0] cargo:rustc-env=FJ_VERSION_PKG=0.22.0
[fj 0.22.0] cargo:rustc-env=FJ_VERSION_FULL=0.22.0 (eae922bd, unreleased)
[fj 0.22.0] cargo:rerun-if-changed=Cargo.toml

FJ_VERSION_PKG seems to be set correctly.

@hannobraun
Copy link
Owner

Indeed, that looks correct. Thank you!

I'll think some more about this and will get back to you once I have something else to test. If you want to look into this in the meantime, maybe sprinkling some dbg!s in fj::version will lead to new insights.

@hannobraun
Copy link
Owner

I wonder if the linker removes the static with the version (VERSION_PKG) from the model library for some reason. I don't think it should, but if it does, then ptr would point to the wrong address (as you suspected), which would maybe explain the wrong version.

Could you try the following:

  1. Run a model from the repository. Just execute cargo run at the repository root.
  2. I'd expect that would lead to the error you posted. Quit the application.
  3. This should have created a .dll and possibly a .lib file in the target/ directory. On Linux I get a target/debug/libtest.so, for example, and I expect you'd get a .dll and possible a .lib in the same location on Windows.
  4. List the symbols that are exported from the .lib and/or .dll. According to this StackOverflow answer this should work using DUMPBIN /EXPORTS.

I'm not very familiar with Windows, but I expect that something along those lines should work.

If the linker really removes the symbol, I guess we should get an error earlier in the process, when loading the symbol. But this might still be worth a try.

@hannobraun
Copy link
Owner

@ykomatsu I've merged a pull request (#1308) that should fix all immediate issues, by simply disabling the version check on Windows. I'd appreciate if you could give it a try!

I've opened #1307 as follow-up issue.

@ykomatsu
Copy link
Author

ykomatsu commented Nov 4, 2022

Thanks!
It works fine.

fornjot-version-2

dumpbin /exports output:

C:\Users\ykoma\Repositories\Fornjot>dumpbin /exports target\debug\test.dll
Microsoft (R) COFF/PE Dumper Version 14.33.31630.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file target\debug\test.dll

File Type: DLL

  Section contains the following exports for test.dll

    00000000 characteristics
    FFFFFFFF time date stamp
        0.00 version
           1 ordinal base
           3 number of functions
           3 number of names

    ordinal hint RVA      name

          1    0 00004370 fj_model_init = fj_model_init
          2    1 00007CB0 version_full = version_full
          3    2 00007C50 version_pkg = version_pkg

  Summary

        1000 .data
        3000 .pdata
        C000 .rdata
        1000 .reloc
       2E000 .text

@hannobraun
Copy link
Owner

Thank you for confirming, @ykomatsu!

I think your dumpbin output is interesting, as it shows the functions, but not the statics! I don't know if it just omits those (maybe different arguments are required to show them), or if they're not present in the library. If they're not present, that explains why the version is garbage (and confirms your suspicion that ptr points at the wrong thing).

I think I'll leave it at that. Given my lack of expertise in and access to Windows, this would be hard for me to debug. The current solution should work well enough as a stopgap. I hope at some point someone will come along to fix #1307.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't working
Projects
None yet
2 participants