-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Misaligned pointers in thread_local static variables upon creation of a HashMap #124
Comments
Alignment issues on thread local variables... where have I heard this before? Still, this looks like a pretty complex situation, most likely relating back to the TLS code in @ian-h-chamberlain since you made those changes, can you run similar tests again? It looks like nothing changed in the mean time regarding TLS in |
oh no, hopefully this isn't another case of similar shenanigans, that one took ages to track down... I am able to reproduce using the Edit: ok, so far this is the smallest reproduction I've found. I'm trying to inline all the calls there and see if I can narrow it down further but haven't been able to get it so far: fn main() {
ctru::use_panic_handler();
let _s = std::collections::hash_map::RandomState::new();
} It does seem important that the |
I just tested without |
Yeah, most likely the panic happens regardless but the application exits normally. That’s one of the nice features of our toolchain. |
Hmm, that's interesting - in my case I was also checking with the debugger (with a breakpoint on The next thing I haven't looked at yet is actually examining the thread local section of the executable, which might provide some clues why the TLS is misaligned. I've been busy for the last few weeks but should have some time to dig in this week. |
I recommend putting a breakpoint in |
Ok, I figured out that in the case where I wasn't reproducing the issue, it was because my "custom" thread-local was aligned correctly, but the one in Edit: I might be mistaken about this and it's just a coincidence, need to re-learn some stuff about TLS and ELF structure I think 😅 broken case (using ../target/armv6k-nintendo-3ds/debug/examples/thread-local-minimal.elf: file format elf32-littlearm
SYMBOL TABLE:
0023da30 l d .tbss 00000000 .tbss
00000c20 l .tbss 00000001 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::STATE
00000c1c l .tbss 00000004 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::VAL
00000c24 l .tbss 0000000c std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY
00000c31 l .tbss 00000001 std::sync::remutex::current_thread_unique_ptr::X::__getit::STATE
00000c30 l .tbss 00000001 std::sync::remutex::current_thread_unique_ptr::X::__getit::VAL
00000c32 l .tbss 00000001 std::sys_common::thread_info::THREAD_INFO::__getit::STATE
0023da30 g .tbss 00000000 __preinit_array_start
00000014 g .tbss 00000802 __ctru_dev_utf16_buf
00000818 g .tbss 00000401 __ctru_dev_path_buf
00000c34 g .tbss 00000020 std::collections::hash::map::RandomState::new::KEYS::__getit::__KEY
0023da30 g .tbss 00000000 __preinit_array_end Note that the location of When I try to inline all the code of ../target/armv6k-nintendo-3ds/debug/examples/thread-local-minimal.elf: file format elf32-littlearm
SYMBOL TABLE:
0023da2c l d .tbss 00000000 .tbss
00000018 l .tbss 00000020 thread_local_minimal::main::KEYS::__getit::__KEY
00000c44 l .tbss 00000001 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::STATE
00000c40 l .tbss 00000004 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::VAL
00000c48 l .tbss 0000000c std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY
00000c55 l .tbss 00000001 std::sync::remutex::current_thread_unique_ptr::X::__getit::STATE
00000c54 l .tbss 00000001 std::sync::remutex::current_thread_unique_ptr::X::__getit::VAL
00000c56 l .tbss 00000001 std::sys_common::thread_info::THREAD_INFO::__getit::STATE
0023da2c g .tbss 00000000 __preinit_array_start
00000038 g .tbss 00000802 __ctru_dev_utf16_buf
0000083c g .tbss 00000401 __ctru_dev_path_buf Still trying to figure out why it's begin generated with the wrong offset like that and how to fix it, but at first glance I don't think it's totally related to the original TLS alignment issues we had, but I can't be sure yet. |
Good news! I finally made some progress here and I think I've identified a solution, which will have to get delivered upstream to https://github.com/devkitPro/devkitarm-crtls I'll try to have a write-up and post a PR tomorrow, but I will definitely need to get some input from the devkitPro folks since the change I have planned could affect other users too. Will post updates here as it develops. |
Closed as result of the upstream changes in devkitPro/devkitarm-crtls#7. @ian-h-chamberlain you're incredible! The whole homebrew community has benefited from your investigations 🎉 |
sorry for reviving a dead issue but i don't know if this is worth opening a new one or if it's a weird edge-case of this issue project this is happening is https://github.com/patataofcourse/Barista (project is a big mess though and i assume stinks for reproducibility), and this error apparently only happens when compiling on certain machines, but that might have changed since last time i tried...? again if there's anything i can do lmk, and my discord is under the same username as here (patataofcourse) if anyone who can help prefers a more direct method of communication |
This issue was resolved as a change in the upstream packages released by @devkitPro. dkp-pacman -Syu If this alone doesn’t cut it, try updating and refreshing the whole env: rustup update
cargo clean
cargo update |
refreshing the whole env didn't work either, and it's not the first time i've tried to refresh it either the
|
I've tried running your project on my New 3DS XL using the latest nightly, and I can't seem to find the error you are talking about. Please give me more information for me to help you, since reproducibility seems quite hard. Also, I've found an ARM panic here: I don't know whether or not you have been able to write files to the SD card using Regardless, with a bit of code cutting I managed to get to the UI stage just fine. Is there any way to reproduce the error I could use? |
ah crap, i'm testing on Citra so the ARM panic is probably ignored by it. i'll change to std::fs but it'd be great if you guys could deprecate ctru::fs so that it being broken wouldn't have to be transmitted via word-of-mouth thank you for all the help btw it means a lot |
Yeah, I’m sorry about that. Sadly other things took priority over it (especially since the
The tests I ran were built using your own makefile (especially since it was needed to compile and use those other C libraries that I didn’t look into). |
oops i forgot that was a thing, makes sense you'd need to use the makefile then
yeah, that is very fair, however i can't always test on hardware, especially when i'm not home. usual rule of thumb in the modding community i'm in is: do quick tests on Citra, and a final test on hardware to see if anything happens there, especially since usually you have to make a bunch of consecutive quick tests in case something is slightly off. usually there's not errors that are only on Citra, usually it's the opposite, but of course that's on modded commercial games, not homebrew. either way, i'll remove every instance of ctru::fs and use std::fs instead, and I'll let you know if that fixes anything |
I've tested with Citra, no errors in sight. 🤷♂️ |
removed any ctru::fs references, still the same error
okay that's very strange. mind if i send a pre-built .3dsx and .elf in case it's something weird in my rust setup? |
As I said earlier, the most likely cause is not an outdated Rust toolchain, as it may be an outdated dkARM (or an unproperly set one?). I know you've checked the Send me the file however you want, you can also contact me on discord (username "meziu"), though I have to stop working on this for today. |
It's true, even the My panic info:
@ian-h-chamberlain I can't ask you to look into this again, since you've done more than enough and I'm sure you have other things to do, but could you please share your minimal-reproducibility tests (if you still have them) so I can try making sense of this myself? |
what was the issue that was causing my and your builds to be different? |
Ok, let's see... The first thing to test is probably the C program in #60 (comment) — that could point out if there was a regression in devkitARM. We might also be able to check some of the test cases listed here which were mentioned in KallistiOS/KallistiOS#111 Those might at least eliminate the Rust toolchain as an option if they also reproduce an issue, and possibly point to a DKP regression. If not, then it sounds like the hashmap example is another good place to start. This tiny program was enough to reproduce before: #124 (comment) A lot of what I did to debug previously was dumping the addresses of various symbols to see if the alignment looked correct or not, e.g. something like this: $ arm-none-eabi-readelf -W --demangle --syms target/armv6k-nintendo-3ds/debug/examples/thread-local-minimal.elf | awk 'NR == 3 || /BUF|tdata|tbss|tls|TLS/'
Symbol table '.symtab' contains 34409 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
10: 0029ebf4 0 SECTION LOCAL DEFAULT 15 .tdata
11: 0029ebf8 0 SECTION LOCAL DEFAULT 16 .tbss
1346: 00000c08 12 TLS LOCAL DEFAULT 16 pthread_3ds::thread_keys::LOCALS
3454: 00000c14 1 TLS LOCAL DEFAULT 16 std::sync::remutex::current_thread_unique_ptr::X::__getit::VAL
3455: 00000c15 1 TLS LOCAL DEFAULT 16 std::sync::remutex::current_thread_unique_ptr::X::__getit::STATE
6245: 00000c18 8 TLS LOCAL DEFAULT 16 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::VAL
6246: 00000c20 1 TLS LOCAL DEFAULT 16 std::panicking::panic_count::LOCAL_PANIC_COUNT::__getit::STATE
9991: 00000c24 12 TLS LOCAL DEFAULT 16 std::io::stdio::OUTPUT_CAPTURE::__getit::__KEY
10259: 00000c30 16 TLS LOCAL DEFAULT 16 std::sys_common::thread_info::THREAD_INFO::__getit::VAL
10260: 00000c40 1 TLS LOCAL DEFAULT 16 std::sys_common::thread_info::THREAD_INFO::__getit::STATE
14997: 00000000 0 TLS LOCAL DEFAULT 16 $d
14998: 00000804 0 TLS LOCAL DEFAULT 16 $d
15001: 00000000 0 TLS LOCAL DEFAULT 16 _TLS_MODULE_BASE_
26733: 0029ebf4 0 NOTYPE GLOBAL DEFAULT 15 __tdata_lma
27665: 00000000 2050 TLS GLOBAL DEFAULT 16 __ctru_dev_utf16_buf
27813: 002a1b58 0 NOTYPE GLOBAL DEFAULT 21 __tls_end
29308: 00000804 1025 TLS GLOBAL DEFAULT 16 __ctru_dev_path_buf
29816: 00000c48 32 TLS GLOBAL DEFAULT 16 std::collections::hash::map::RandomState::new::KEYS::__getit::__KEY
30542: 00287940 0 NOTYPE GLOBAL DEFAULT 3 __tdata_align
32757: 002a0eec 0 NOTYPE GLOBAL DEFAULT 21 __tls_start
33555: 0029ebf4 0 NOTYPE GLOBAL DEFAULT 15 __tdata_lma_end Usually Man, it's crazy how this issue keeps coming back, we can't get a break! 😭 As you alluded to @Meziu I will probably take a back seat to debugging this time around as I'd like to focus on other areas of the project, but I can try to help if others get stuck. I wonder... we should probably see if we can add some kind of test for this as part of #135 so we know if it happens again. I've been working on another push for that so I'm hoping to have something ready fairly soon! I think one of our tests builds a hashmap so maybe it would already catch this, but I think it's also kind of dependent on compiler optimizations so maybe an extra integration test or something wouldn't hurt. |
Hi there, |
Thank you for your research on the topic! I believe the ".tbss exactly at .__tls_start" was the original issue for some other quirks of this whole situation, so the conclusions don't seem that farfetched to me 😅. However, the solution you proposed would also impact the usage of the whole C toolchain provided by devkitPro, so I'm going to need to analyze the situation deeper (to make sure it's the right call) before making the proposal for this change directly to them. I would greatly appreciate if you were to delve into the problem more than you already did, but I don't want you to put your health at risk more than it seems you are doing already 😆. I will look into it when I have some time 👍. |
I couldn't help myself, and I looked a bit deeper into it. |
I've officially decided to waste my sanity trying to solve this issue, since ~Deep breath~... Ok, let's do this one more time. I can't dedicate this problem much time right now, but I've started with some basic testing that was very much needed.
So, from this first round of testing (and the research done by @Jhynjhiruu) I come to the conclusion that neither Rust nor devkitPro did anything that caused this behaviour. There is something fundamentally wrong (most likely in the whole setup handled by |
Can you give an example of a nightly version that doesn't reproduce the crash? I have a suspicion about why that I'd like to confirm. |
I’m not home, but I definitely used “2023-09-25” that worked. I can’t really tell when they stop working, since the process was taking a lot of time and I couldn’t really “binary-search” my way to the exact date. |
Agh. I hate threads.
This corresponds to this line of C code: BUF_16.inner[0] = 0; Fairly simply, this loads the address of the thread-local storage using |
Yes, I believe this is up to the compiler (or maybe actually the linker?), to allow for optimizations with alignment, packing etc. I have anecdotally found that
Yeah, when I was working on the earlier bugfix it seemed like the best way to check on the output was to actually examine the numerical value of e.g. I haven't had a chance to look at this issue or your investigation too in-depth, but offhand I think your proposed solution to the linker script seems reasonable and not too likely to break anything. If you wanted to put it to a test, some of the Dreamcast homebrew folks have made a test for similar issues that might be useful to adapt, or you can also just keep trying with the same code from that other issue. https://github.com/KallistiOS/KallistiOS/pull/111/files#diff-bcbd57e8ac3c78995f3d91366794cff6a15d98784b172f07c14538878eefd776 It definitely seems to me like wasting a few bytes in favor of correctness would be worth it, in any case. If I find some time soon I can try to look again in more detail, but it seems like you both have already done a pretty thorough investigation, and it also sounds like there is a proposed code change that might fix it, so that's good! Here's hoping this is the last bug of its kind because these have been brutal each time! |
Hopefully (🤞) that devkitPro PR will fix this issue for good. Here's hoping this issue can be closed after less than a year... |
This is only tangentially related to the issue at hand, but I wanted to ask in case of future problems: is the total size of tbss and tdata kept under 128 bytes (0x80) by the linker? I'm just wondering if there could be a case where so many thread local variables are declared in C or Rust that they overflow into the IPC area. Going even further, could devkitpro explicitly declare the IPC command areas as sections/symbols that can be exported to C/Rust code? Then we wouldn't have to manually get the TLS pointers anymore when writing IPC code. (I am a total noob when it comes to how linkers work so apologies if this is a dumb question) |
Doesn't look like it's limited in size at all. It could be if needed, I think. |
I guess I'm assuming that the compiler must be able to figure out how to emit the
I don't think they're declared anywhere, you just call I guess I should start a separate issue for that feature request somewhere, but I wanted to bring up the possibility of .tbss and .tdata overflowing the IPC buffer while the linker script stuff is still fresh in everyone's mind :) |
If anybody manages to overflow the TLS section I will not do anything to help them 😤. (jk, I’m just tired from all the thread related issues) |
|
Getting the thread pointer is done with a function defined in libctru somewhere (it's called |
According to https://www.3dbrew.org/wiki/Thread_Local_Storage, the TLS area is 0x200 bytes in size, but only the first 0x80 bytes are available for general purpose use. The IPC command area takes up the remaining space. I didn't see anything in the .ld script that @Meziu edited that references either the 0x80 or 0x200 size limits. As I said, I don't really know the details of how the linker works so maybe those limits are declared somewhere else but I can't find them. If they are not defined then I believe you could declare a large thread local object and end up overlapping with the IPC buffer. Update: I did some testing by declaring a large |
Yeah, exactly, TLS via the |
Wow, I feel so much better seeing that TLS has been as much of a PITA for you guys as it was for me in KallistiOS for the Sega Dreamcast... I've done a LOT of coding in our kernel and for our toolchains and SDK, but implementing this with verification for all of the corner-cases still sticks out as one of the roughest things I've ever done... ...but now the inner masochist in me wants to see if I can implement the fully dynamic TLS model for loading dynamic libraries at runtime that use |
A part of me feels like this isn't really a closed issue, but I digress...
Well, on the 3DS there is no "official" support for dynamic libraries (with the exception of a few OS-related cases, which aren't easily accessible through homebrew), and while there have been some efforts to dinamically load code (see 3ds_dl) the overall homebrew scene isn't interested in supporting anything that isn't statically linked. Also, SD card read speeds on the 3DS are atrociously slow, so there wouldn't be that much gain either way 😄. It does leave me to wonder how such cases might have to be handled though 🤔. |
Creating a hashmap causes the panic
misaligned pointer dereference: address must be a multiple of 0x8 but is ...
. Discovered this while trying to run Meizu's realtime-audio example Meziu/realtime-audio#1This issue and this seem to be related. They're talking about static variables being linked in un-aligned ways. It's a bit over my head but it might be useful context.
I tracked it to the Symphonia library creating a HashMap.
HashMap::new
initializesRandomState
with a static variable insidethread_local!
. This is where the crash happens:/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/collections/hash/map.rs:3127
The minimal test case appears to be "creating a hashmap". Can someone else please run
examples/hashmaps.rs
in debug and confirm that it panics?The text was updated successfully, but these errors were encountered: