Parse abort in Modernish under virtualization #5

Closed
mgree opened this issue Jun 12, 2019 · 7 comments · Fixed by binpash/libdash#1
Labels
bug Something isn't working

Comments

Owner

mgree commented Jun 12, 2019

Running ./install -s smoosh in Docker yields a parse abort. It's probably related to the other parse aborts we've seen, which I suspect come from mishandling parse errors in eval.

mgree added the bug label Jun 12, 2019
Owner Author

mgree commented Oct 21, 2019

*** Error in `/bin/smoosh': free(): invalid pointer: 0x00007fde9e47d160 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7fde9d819bfb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fc6)[0x7fde9d81ffc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7780e)[0x7fde9d82080e]
/usr/lib/x86_64-linux-gnu/libdash.so.0(popstackmark+0x33)[0x7fde9e26a4b3]
/usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c)[0x7fde9e056038]
/usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x32a)[0x7fde9e055a9a]
/bin/smoosh(ctypes_call+0x379)[0x5589bbe03ca9]
/bin/smoosh(camlCtypes_ffi__fun_502550+0x80)[0x5589bbd9bcf0]
/bin/smoosh(camlShim__parse_next_302762+0x9f)[0x5589bbd4ba3f]
/bin/smoosh(camlSmoosh__parse_next_3918+0x19)[0x5589bbd30fc9]
/bin/smoosh(camlSemantics__step_eval_301491+0x4835)[0x5589bbd0d385]
/bin/smoosh(camlSemantics__step_eval_301491+0x5a09)[0x5589bbd0e559]
/bin/smoosh(camlSemantics__step_eval_301491+0x58f5)[0x5589bbd0e445]
/bin/smoosh(camlSemantics__full_evaluation_301492+0x8e)[0x5589bbd0f0fe]
/bin/smoosh(camlSemantics__fun_2205374+0x2a)[0x5589bbd0f8ca]
/bin/smoosh(camlSystem__real_fork_and_eval_1002385+0x268)[0x5589bbd37c48]
/bin/smoosh(camlOs_system__fun_303665+0x160)[0x5589bbd35760]
/bin/smoosh(camlSemantics__step_eval_301491+0x1c28)[0x5589bbd0a778]
/bin/smoosh(camlSemantics__step_eval_301491+0x1e5c)[0x5589bbd0a9ac]
/bin/smoosh(camlSemantics__step_eval_301491+0x4ca6)[0x5589bbd0d7f6]
/bin/smoosh(camlSemantics__step_eval_301491+0x5a09)[0x5589bbd0e559]
/bin/smoosh(camlSemantics__step_eval_301491+0x4382)[0x5589bbd0ced2]
/bin/smoosh(camlSemantics__step_eval_301491+0x2529)[0x5589bbd0b079]
/bin/smoosh(camlSemantics__step_eval_301491+0x2529)[0x5589bbd0b079]
/bin/smoosh(camlSemantics__step_eval_301491+0x2529)[0x5589bbd0b079]
/bin/smoosh(camlSemantics__step_eval_301491+0x2eb6)[0x5589bbd0ba06]
/bin/smoosh(camlSemantics__step_eval_301491+0x4382)[0x5589bbd0ced2]
/bin/smoosh(camlSemantics__step_eval_301491+0x273d)[0x5589bbd0b28d]
/bin/smoosh(camlSemantics__step_eval_301491+0x4ca6)[0x5589bbd0d7f6]
/bin/smoosh(camlSemantics__step_eval_301491+0x4382)[0x5589bbd0ced2]
/bin/smoosh(camlSemantics__step_eval_301491+0x2529)[0x5589bbd0b079]
/bin/smoosh(camlSemantics__step_eval_301491+0x5a09)[0x5589bbd0e559]
/bin/smoosh(camlSemantics__step_eval_301491+0x4382)[0x5589bbd0ced2]
/bin/smoosh(camlSemantics__step_eval_301491+0x4ca6)[0x5589bbd0d7f6]
/bin/smoosh(camlSemantics__full_evaluation_301492+0x8e)[0x5589bbd0f0fe]
/bin/smoosh(camlShell__cmdloop_703727+0xa3)[0x5589bbd03ef3]
/bin/smoosh(camlShell__entry+0x2fe)[0x5589bbd0453e]
/bin/smoosh(caml_program+0x849)[0x5589bbcff519]
/bin/smoosh(+0x291190)[0x5589bbe2a190]
/bin/smoosh(caml_startup_common+0x215)[0x5589bbe2a525]
/bin/smoosh(caml_startup+0xb)[0x5589bbe2a57b]
/bin/smoosh(main+0xc)[0x5589bbcfeb8c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fde9d7c92e1]
/bin/smoosh(_start+0x2a)[0x5589bbcfebca]
======= Memory map: ========
5589bbb99000-5589bbe7c000 r-xp 00000000 08:03 539006                     /bin/smoosh
5589bc07c000-5589bc07d000 r--p 002e3000 08:03 539006                     /bin/smoosh
5589bc07d000-5589bc155000 rw-p 002e4000 08:03 539006                     /bin/smoosh
5589bc155000-5589bc16b000 rw-p 00000000 00:00 0 
5589bd160000-5589bd1f7000 rw-p 00000000 00:00 0                          [heap]
7fde98000000-7fde98021000 rw-p 00000000 00:00 0 
7fde98021000-7fde9c000000 ---p 00000000 00:00 0 
7fde9c28d000-7fde9c2a3000 r-xp 00000000 08:03 915716                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7fde9c2a3000-7fde9c4a2000 ---p 00016000 08:03 915716                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7fde9c4a2000-7fde9c4a3000 r--p 00015000 08:03 915716                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7fde9c4a3000-7fde9c4a4000 rw-p 00016000 08:03 915716                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7fde9c4a4000-7fde9d7a9000 rw-p 00000000 00:00 0 
7fde9d7a9000-7fde9d93e000 r-xp 00000000 08:03 915720                     /lib/x86_64-linux-gnu/libc-2.24.so
7fde9d93e000-7fde9db3e000 ---p 00195000 08:03 915720                     /lib/x86_64-linux-gnu/libc-2.24.so
7fde9db3e000-7fde9db42000 r--p 00195000 08:03 915720                     /lib/x86_64-linux-gnu/libc-2.24.so
7fde9db42000-7fde9db44000 rw-p 00199000 08:03 915720                     /lib/x86_64-linux-gnu/libc-2.24.so
7fde9db44000-7fde9db48000 rw-p 00000000 00:00 0 
7fde9db48000-7fde9db4b000 r-xp 00000000 08:03 915723                     /lib/x86_64-linux-gnu/libdl-2.24.so
7fde9db4b000-7fde9dd4a000 ---p 00003000 08:03 915723                     /lib/x86_64-linux-gnu/libdl-2.24.so
7fde9dd4a000-7fde9dd4b000 r--p 00002000 08:03 915723                     /lib/x86_64-linux-gnu/libdl-2.24.so
7fde9dd4b000-7fde9dd4c000 rw-p 00003000 08:03 915723                     /lib/x86_64-linux-gnu/libdl-2.24.so
7fde9dd4c000-7fde9de4f000 r-xp 00000000 08:03 915724                     /lib/x86_64-linux-gnu/libm-2.24.so
7fde9de4f000-7fde9e04e000 ---p 00103000 08:03 915724                     /lib/x86_64-linux-gnu/libm-2.24.so
7fde9e04e000-7fde9e04f000 r--p 00102000 08:03 915724                     /lib/x86_64-linux-gnu/libm-2.24.so
7fde9e04f000-7fde9e050000 rw-p 00103000 08:03 915724                     /lib/x86_64-linux-gnu/libm-2.24.so
7fde9e050000-7fde9e057000 r-xp 00000000 08:03 266053                     /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
7fde9e057000-7fde9e257000 ---p 00007000 08:03 266053                     /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
7fde9e257000-7fde9e258000 r--p 00007000 08:03 266053                     /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
7fde9e258000-7fde9e259000 rw-p 00008000 08:03 266053                     /usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
7fde9e259000-7fde9e279000 r-xp 00000000 08:03 285411                     /usr/lib/x86_64-linux-gnu/libdash.so.0.0.0
7fde9e279000-7fde9e478000 ---p 00020000 08:03 285411                     /usr/lib/x86_64-linux-gnu/libdash.so.0.0.0
7fde9e478000-7fde9e47a000 r--p 0001f000 08:03 285411                     /usr/lib/x86_64-linux-gnu/libdash.so.0.0.0
7fde9e47a000-7fde9e47b000 rw-p 00021000 08:03 285411                     /usr/lib/x86_64-linux-gnu/libdash.so.0.0.0
7fde9e47b000-7fde9e47e000 rw-p 00000000 00:00 0 
7fde9e47e000-7fde9e4a1000 r-xp 00000000 08:03 915715                     /lib/x86_64-linux-gnu/ld-2.24.so
7fde9e4be000-7fde9e698000 rw-p 00000000 00:00 0 
7fde9e6a0000-7fde9e6a1000 rw-p 00000000 00:00 0 
7fde9e6a1000-7fde9e6a2000 r--p 00023000 08:03 915715                     /lib/x86_64-linux-gnu/ld-2.24.so
7fde9e6a2000-7fde9e6a3000 rw-p 00024000 08:03 915715                     /lib/x86_64-linux-gnu/ld-2.24.so
7fde9e6a3000-7fde9e6a4000 rw-p 00000000 00:00 0 
7ffd81103000-7ffd81146000 rw-p 00000000 00:00 0                          [stack]
7ffd81165000-7ffd81167000 r--p 00000000 00:00 0                          [vvar]
7ffd81167000-7ffd81169000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

From running Modernish in a Vagrant VM.

Owner Author

mgree commented Oct 21, 2019

NB that it seems to always be the exact same failure, i.e., a bad call to ckfree on 0x00007f4ff7f9d160. Still no luck figuring out what the exact trigger is. I suspect this is some kind of stack underflow, i.e., we're calling popstackmark too much.

(Why is it okay on native... built without some stack protections? Virtualization catches memory errors?)

mgree changed the title from "Parse abort in modernish on Docker" to "Parse abort in Modernish under virtualization" Oct 21, 2019
Contributor

tucak commented Dec 19, 2019

I'm running it on native GNU/Linux and the crash happens even though the install script does not report it. I had to run smoosh modernish/bin/modernish --test to observe it.

Smallest and simplest reproducer I found, which also happens to be a close approximation of what went through my head trying to debug this:

eval '"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'

Both Shim.parse_next and Dash.parse_next pop the stack-mark when a parse error happens, so maybe something goes wrong around there? I don't really understand how these stack-marks are supposed to work. Removing the pop from Dash.parse_next makes the crash go away, but I suspect that's not the correct way to fix it.

Owner Author

mgree commented Dec 19, 2019

lol! Thanks for finding a reproducer for this---I've been trying to find the root cause of this bug for some time, and this is a helpful lead. That reproducer closely matches my feelings about this bug, too.

The stackmarks are allocation lists rooted in the C call stack. I suspect that the core bug is that allocating those stackmarks in OCaml breaks some invariants, since GC will eventually relocate them. (But the crashes are predictable enough that I'm not 100% confident that's what's happening.) In any case, popping should be idempotent: popping means "unroll all allocations up to here". Maybe a double free is corrupting things?
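
To make "unroll all allocations up to here" concrete, here is a toy Python model of a mark/release block stack. It is only a sketch of the idea described above, not libdash's code: BlockStack, set_mark, and pop_to_mark are made-up names, and blocks are plain strings rather than real allocations.

```python
class BlockStack:
    """Toy mark/release allocator: a chain of blocks, newest last."""

    def __init__(self):
        self.blocks = ["base"]  # hypothetical root block, never freed here
        self.freed = []         # record of blocks released so far

    def alloc(self, name):
        # Push a new block on top of the chain.
        self.blocks.append(name)

    def set_mark(self):
        # A mark remembers which block was on top when it was taken.
        return self.blocks[-1]

    def pop_to_mark(self, mark):
        # Release blocks from the top until the marked block is on top again.
        while self.blocks[-1] != mark:
            self.freed.append(self.blocks.pop())


stack = BlockStack()
mark = stack.set_mark()
stack.alloc("b1")
stack.alloc("b2")
stack.pop_to_mark(mark)  # frees b2, then b1
stack.pop_to_mark(mark)  # second pop finds the mark on top and does nothing
```

Popping twice to the same mark is harmless in this model, but only because the marked block is still in the chain when the second pop runs.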

Owner Author

mgree commented Dec 19, 2019

Adding: looking back at my stack trace above, it does seem like either a double free or a moved pointer... so the correct fix is most likely to avoid the double pop in the two parse_next functions. I'll take a closer look soon. (He said right before Christmas, New Year's, POPL, and a new semester... :/)

Contributor

tucak commented Dec 30, 2019

On error, Dash.parse_next pops the allocation stack back to its base. Next, Shim.parse_next tries to pop back to the mark it had set, but as that block is no longer on the stack, it ends up trying to free the base of the stack, which is not on the heap, leading to the error.

It looks like this:

==popstackmark==
Pop to: 0x7ffff7fcb160 <stackbase>
Stack:
0x55555596ac00
0x55555596a730
0x7ffff7fcb160 <stackbase>
Caller: camlDash__parse_next_inner_202892

==popstackmark==
Pop to: 0x55555596a730
Stack:
0x7ffff7fcb160 <stackbase>
Caller: camlShim__parse_next_302764
free(): invalid pointer

If the reproducer is even a single character shorter, it no longer crashes as the input fits into the stackbase. In this case both Dash.parse_next and Shim.parse_next will pop to the base, which is perfectly okay.
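
The two pops in the trace above can be replayed in a toy Python model. This is a sketch under assumptions, not libdash's implementation: BlockStack, InvalidFree, and parse are hypothetical names, the block names are placeholders, and the model assumes a pop frees blocks from the top until it reaches the marked block.

```python
class InvalidFree(Exception):
    """Stands in for the 'free(): invalid pointer' abort."""


class BlockStack:
    """Toy model (assumption): a static base block plus heap blocks on top."""

    def __init__(self):
        self.base = "stackbase"      # static block, must never be freed
        self.chain = [self.base]

    def grow(self, name):
        # Input no longer fits in the current block: add a heap block.
        self.chain.append(name)

    def set_mark(self):
        # A mark remembers the block that is on top right now.
        return self.chain[-1]

    def pop_to_mark(self, mark):
        # Free from the top until the marked block is on top.
        while self.chain[-1] != mark:
            if self.chain[-1] == self.base:
                # The mark is gone from the chain; the next free hits the base.
                raise InvalidFree("tried to free " + self.base)
            self.chain.pop()


def parse(spills_past_base, inner_pops):
    """Replay one failed parse; both arguments are illustrative switches."""
    stack = BlockStack()
    root = stack.set_mark()          # root mark on the base block
    if spills_past_base:
        stack.grow("heap1")
    outer = stack.set_mark()         # mark held by the outer caller
    if spills_past_base:
        stack.grow("heap2")
    # ... the parse error happens here ...
    if inner_pops:
        stack.pop_to_mark(root)      # inner wrapper pops all the way to the root
    stack.pop_to_mark(outer)         # outer caller pops to its own mark
    return stack.chain
```

In this model, parse(True, True) raises InvalidFree, matching the second popstackmark in the trace, while parse(False, True) (everything fits in the base block) and parse(True, False) (the inner pop removed) both return normally.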

The pop in Dash.parse_next is from this commit: binpash/libdash@c0ed16b. Do you remember why it was necessary to add it?

Owner Author

mgree commented Dec 30, 2019

Ahhhhhhh! (To paraphrase your reproducer.) Yes, I put it in as a "surely we can just pop all the way back to the root stackmark on every parse", not thinking about reentrancy. I think you're right: your reproducer must be just big enough to force some allocations to exist past the root.

I really appreciate you pushing away on this! It seems like the right choice here is to never pop in Dash.parse_next... that's what you've implemented, right?
