-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#2144 : single step on jump #2295
Conversation
The fix is not perfect. I think it should change the exception address to the jump destination and not the address of the jump instruction. But I do not know how to get this value. |
I wonder if you could paste the relevant pieces from loglevel 4 into the issue tracker to make sure we understand what's happening here? I put a request there. |
core/win32/callback.c
Outdated
* i#2144 : We exclude single step cases in a jump. | ||
*/ | ||
if (pExcptRec->ExceptionCode == EXCEPTION_SINGLE_STEP && | ||
!in_fcache(pExcptRec->ExceptionAddress)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment below is now in the wrong place, about calling in_fcache last.
Logically this would be better out in the caller and not here in check_internal_exception which is just supposed to handle a crash in DR itself.
Probably what is needed is a query in_exit_stub().
core/win32/callback.c
Outdated
!in_fcache(pExcptRec->ExceptionAddress)) { | ||
fragment_t wrapper; | ||
linkstub_t *l; | ||
fragment_t *f = fragment_pclookup(dcontext, dcontext->next_tag, &wrapper); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next_tag is only set on entering the cache. This could have run thousands of blocks since then, so next_tag is irrelevant at this point.
It seems that to handle this we need to decode the stub and extract the linkstub pointer. We have existing stub decoding for coarse stubs today but not for fine.
core/win32/callback.c
Outdated
fragment_t wrapper; | ||
linkstub_t *l; | ||
fragment_t *f = fragment_pclookup(dcontext, dcontext->next_tag, &wrapper); | ||
l = FRAGMENT_EXIT_STUBS(f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no check for pclookup failure.
An alternative could be to insert a nop during mangling after everything that could trigger this: just popf and iret? |
Maybe your alternative is good. |
You mean, things like execution under a debugger?
We would arrange to have the translation do the right thing. |
That is a good example. |
So, what should be done ? First solution seems easier, but maybe not best for transparency (ExceptionAddress) |
It is up to you. Certainly putting a nop after the couple of explicit flag writes is the easiest thing to implement and should cover most cases here, whille implementing stub translation is a bigger project. Execution under a debugger already hits cases we just can't solve from user space: e.g., on Windows, OutputDebugString can cause us to lose control under a debugger. So bailing on handling this under a debugger for now does not seem terrible. Something like setting a thread's context to trigger this seems like more of a problem. We could at least add a check for that which complains even if it doesn't handle it. |
Okay. Should I dot it in mangle function ? something like
|
Yes, sthg like mangle() asking if (instr_can_set_single_step()) mangle_single_step()? |
For iret, I did not know what to do. |
We split iret up into its pieces and could put a nop after the popf. |
I guess that means we need interactions between iret mangling and this single-step mangling. |
core/arch/x86/mangle.c
Outdated
* FIXME i#2144 : to be absolutely transparent, we should translate the | ||
* exception address as if we did not insert this nop. | ||
*/ | ||
POST(ilist, instr, INSTR_CREATE_nop(dcontext)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the instrlist_set_translation_target() is probably automagically giving this a translation target but it's equal to the popf, which is not what we want, right? Also, when translating the single-step exception that points at the nop, won't translate_walk_track() consider this "unsupported_mangle" and thus translate_walk_good_state() will return false and we'll get an assert?
So we need two things: one, set the translation to the next app pc; and two, recognize the nop during translation.
xor eax, eax | ||
/* push flags on the stack */ | ||
#ifdef X64 | ||
pushfq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a cpp layer (see asm_defines.asm) to avoid needing ifdefs: use PUSHF, PTRSZ, and REG_XSP.
inc eax | ||
/* popping flags from top of the stack */ | ||
#ifdef X64 | ||
popfq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use POPF and no ifdef is needed
I hope this fix is ok. |
I did not manage to reproduce the appveyor 64bit test failure. The 64 bit test build went fine by me. |
In facts, the 64-bit test works when running with my client, and fails without it (because mangling does not insert the nop)... No idea why... |
Both 32-bit and 64-bit singlestep tests are crashing on a STATUS_SINGLE_STEP exception:
|
I'm not sure what jump your'e referring to? With the nop, isn't the STATUS_SINGLE_STEP pointing at the nop, which is in_fcache and thus gets translated (hence my prior comments about properly translating the nop)? There's no jump involved, right? |
Keep forgetting the details here: ok, so you're saying that even once the mange-inserted nop is translated to the next app instr, when that's a jump, the exception address is supposed to go further and point at the jump target. I guess that weird behavior is limited to STATUS_SINGLE_STEP and we'd handle that in callback.c? |
core/arch/x86/mangle.c
Outdated
@@ -2403,7 +2405,17 @@ mangle_single_step(dcontext_t *dcontext, instrlist_t *ilist, instr_t *instr) | |||
* FIXME i#2144 : to be absolutely transparent, we should translate the | |||
* exception address as if we did not insert this nop. | |||
*/ | |||
POST(ilist, instr, INSTR_CREATE_nop(dcontext)); | |||
instr_t* next = instr_get_next_app(instr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: t *n
core/arch/x86/mangle.c
Outdated
@@ -2403,7 +2405,17 @@ mangle_single_step(dcontext_t *dcontext, instrlist_t *ilist, instr_t *instr) | |||
* FIXME i#2144 : to be absolutely transparent, we should translate the | |||
* exception address as if we did not insert this nop. | |||
*/ | |||
POST(ilist, instr, INSTR_CREATE_nop(dcontext)); | |||
instr_t* next = instr_get_next_app(instr); | |||
/* Assumes popf cannot end a basic block. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a valid assumption as blocks are truncated prior to a branch for various reasons. Can we just have NULL be another condition.
core/arch/x86/mangle.c
Outdated
@@ -2403,7 +2405,17 @@ mangle_single_step(dcontext_t *dcontext, instrlist_t *ilist, instr_t *instr) | |||
* FIXME i#2144 : to be absolutely transparent, we should translate the | |||
* exception address as if we did not insert this nop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we already doing that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we should get the jump target, not the jump instruction address...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the jump target instead of the jump is unique to STATUS_SINGLE_STEP, right? Nothing else does that. So the code to do that shouldn't be in general translate.c but instead in callback.c where single step is handled. IMHO it should be part of this CL: it's just a couple of lines of code and without it the tests fail, don't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the jump target instead of the jump is unique to STATUS_SINGLE_STEP, right?
Yes to the best of my knowledge.
The tests I added will not fail because my fault handler does not check the exception address (just the exception code)
I will try to put this check before check_internal_exception
core/arch/x86/mangle.c
Outdated
* because the ExceptionAddress should be the jump target. | ||
*/ | ||
if (instr_is_cti(next)) { | ||
/* Next app instruction gives a correct translation to this nop. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But won't instrlist_set_translation be set to instr and not next?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in my tests although I am not fully aware of instrlist_set_translation code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mangle_shared.c:865:
app_pc xl8 = get_app_instr_xl8(instr);
instrlist_set_translation_target(ilist, xl8);
It doesn't matter where the nop is inserted: its xl8 will be set to ilist->translation_target, which is the xl8 of the instr being mangled, which is the popf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I understand.
But the ExceptionAddress is set to the EIP value after execution of a single instruction.
So, the ExceptionAddress we get is the EIP value after the no we insert, i.e. the jump address.
Considering that, I think it is ok to leave the translation address of the no to the popf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow -- isn't it like this: we put into the code cache "popf; nop". The popf writes to TF which generates the exception. The exception points at the next instruction from the one that generated it, so it points at the nop. Right? Then we process the exception and we translate it to the popf in the original app code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The popf writes to TF
yes
which generates the exception.
no : the next instruction after popf does
The exception points at the next instruction from the one that generated it
yes
so it points at the nop. Right?
no, it points at the jump
Then we process the exception and we translate it to the popf in the original app code.
no we translate it to the jump but we should translate it to the address after the jump
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Right. I also looked at the manual and now I believe I finally have all the details. Sorry, I should have thought this through earlier. Unfortunately this nop solution is not going to work for the general case. We can make it work for "popf;jmp" or "popf;jcc" with no client, because we can emulate the jmp or jcc, but the presence of instrumentation means we need a different solution as we can't easily emulate arbitrary sequences of instructions. I believe we have to disable and later forge the single-step exception. I will add something to the issue tracker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move the discussion to #2144 (comment) where I try to describe the problems (instru or mangling between popf and next app instr) that require a larger solution.
Exactly |
I think that I understand why there is a crash with no client (and success with a client). It looks like the loop over instructions in Did I point out something here ? |
With no client, DR uses IR Level 0 bundles of instructions for non-"interesting" instructions: with no client, it doesn't care about non-branches in general. The mangle loop skips instrs with no opcode. You want to update decode_cti() -- probably next to the decode of iret. |
Thanks, that was exactly what I was looking for I moved the FIXME comment to callback.c |
Keeping special exit even after client processing. Restoring default exit reason to self mdoficiation. Going to right return address after iret instead of next one.
I'm afraid that the mixedmode_late failure is from this CL -- I filed as #1421 but it seems to be consistently happening here and only here. Does that test pass locally for you? (Not sure but it may only be enabled in a full suite run so sthg like "ctest" in one build dir is not sufficient.) |
The mixedmode.exe does not work on my windows 7 (without dynamorio) (I get the popup asking to close this program because of a failure) |
From the appveyor log, I gather that the problem comes from an added instruction by this CL (there is a popf in mixedmode) |
Can you provide me a log for the error with mixedmode ? I do not manage to run the test by myself. |
OK, I got the log and here is the problem : |
Test now works by me. |
core/arch/x86/decode_fast.c
Outdated
resolve_variable_size_dc(dcontext, 0, OPSZ_ret, false))); | ||
instr_set_dst(instr, 0, opnd_create_reg(REG_XSP)); | ||
(stack_sized_reg, REG_NULL, 0, 0, | ||
resolve_variable_size_dc(dcontext, 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to pass prefixes here, not 0, to handle a data-size prefix on popf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this ? even if we cannot get this far with prefixes > 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you say "we cannot get this far with prefixes > 0"? Maybe you could add to your test a case with the data size prefix to make sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because before my patched code, there is
if (prefixes > 0) {
/* prefixes are rare on ctis
* rather than handle them all here, just do full decode
* FIXME: if we start to see more and more jcc branch hints we
* may change our minds here! This is case 211206/6749.
*/
if (decode(dcontext, start_pc, instr) == NULL)
return NULL;
else
return (start_pc + sz);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so it goes to the slow path. It doesn't have to for this case but it's not worth reordering. It still seems safer to pass prefixes here rather than 0 to future-proof. It would also be nice to have a test w/ a size prefix but there is a long list of "nice to have" things in the tests so this seems sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update to master and merge.
Appveyor failure is api.detach #2246. |
There are two problems with what was checked in: style violations and worse a crash, documented at #2144 (comment) |
Changing exception address when a single step exception occurs when
it happens in the stub.
Fixes i#2144