Page Fault Error Code not properly read #513

Closed
tscs37 opened this issue Jan 1, 2019 · 8 comments

@tscs37

tscs37 commented Jan 1, 2019

During development I kept seeing strange error codes passed to the page fault handler.

After some experimenting and feedback from other developers, it turns out that LLVM doesn't seem to handle the x86-interrupt ABI properly: the return address (RIP) is read instead of the actual error code, causing some rather silly bugs.

To fix this I made the following adjustments to my own code:

let page_fault:
    extern "x86-interrupt" fn(&mut ExceptionStackFrame, u64)
    = page_fault;
// reinterpret the handler so it matches the signature set_handler_fn expects
let page_fault:
    extern "x86-interrupt" fn(&mut ExceptionStackFrame, PageFaultErrorCode)
    = unsafe { core::mem::transmute(page_fault) };
unsafe {
    idt.page_fault
        .set_handler_fn(page_fault)
        .set_stack_index(INTR_IST_INDEX)
};

extern "x86-interrupt" fn page_fault(
    stack_frame: &mut ExceptionStackFrame,
    error_code: u64,
) {
    // adjust stack and clone out the real error code
    unsafe { asm!("sub rsp, 8
    sub rbp, 8"::::"intel", "volatile") };
    let error_code = PageFaultErrorCode::from_bits(error_code)
        .expect(&format!("error_code has reserved bits set: {:#018x}", error_code))
        .clone();
    // restore the stack pointers
    unsafe { asm!("add rsp, 8
    add rbp, 8"::::"intel", "volatile") };
    // […]
}

This fix makes the stack_frame variable inaccessible: trying to read its values causes a GPF for me, so it's only half a fix.

Using the original extern "x86-interrupt" fn(&mut ExceptionStackFrame, PageFaultErrorCode) function signature doesn't seem to work by default, i.e. without adjusting the stack, and I can't seem to get Rust to properly clone the PFE code without causing another GPF.

However, the interrupt handler otherwise works correctly: calling kernel functions and referencing kernel data works without issue.

From what I can gather, this is an issue with either the Rust compiler or LLVM. It should probably be mentioned on the page for CPU Exceptions.

This occurs on Rust nightly-2018-12-30 on Linux and QEMU with and without KVM.
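
For anyone wondering why from_bits rejects the bogus value: a real page fault error code only uses a handful of low bits, while a return address almost always has higher bits set. Here is a minimal sketch of that check. PfErrorCode is a hand-rolled, hypothetical stand-in (not the x86_64 crate's PageFaultErrorCode), and it assumes only the five classic flag bits are valid; newer CPUs define additional bits.

```rust
/// Hypothetical stand-in for a page fault error code type.
/// Assumption: only the five classic flag bits (0..=4) are considered valid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PfErrorCode(u64);

impl PfErrorCode {
    /// Bit 0 (P): fault was a protection violation, not a non-present page.
    const PROTECTION_VIOLATION: u64 = 1 << 0;
    /// Bit 1 (W/R): fault was caused by a write access.
    const CAUSED_BY_WRITE: u64 = 1 << 1;
    /// Bit 2 (U/S): fault happened while the CPU was in user mode.
    const USER_MODE: u64 = 1 << 2;
    /// Bit 3 (RSVD): a reserved bit was set in a page table entry.
    const MALFORMED_TABLE: u64 = 1 << 3;
    /// Bit 4 (I/D): fault was caused by an instruction fetch.
    const INSTRUCTION_FETCH: u64 = 1 << 4;

    /// Mask of every bit this sketch treats as valid.
    const KNOWN: u64 = Self::PROTECTION_VIOLATION
        | Self::CAUSED_BY_WRITE
        | Self::USER_MODE
        | Self::MALFORMED_TABLE
        | Self::INSTRUCTION_FETCH;

    /// Returns None if any bit outside KNOWN is set, which is what
    /// happens when a return address is passed in by mistake.
    fn from_bits(bits: u64) -> Option<Self> {
        if bits & !Self::KNOWN == 0 {
            Some(Self(bits))
        } else {
            None
        }
    }

    /// True if every bit of `flag` is set in this error code.
    fn contains(self, flag: u64) -> bool {
        self.0 & flag == flag
    }
}
```

With this sketch, from_bits(0x2) is accepted and reports a write access, whereas a code address such as 0x205170 is rejected because bits above bit 4 are set.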

@phil-opp
Owner

phil-opp commented Jan 2, 2019

Thanks a lot for reporting! I can reproduce it. The problem seems to be in LLVM, since the generated assembly is faulty:

[…]
mov    rax,QWORD PTR [rsp+0x70]
lea    rcx,[rsp+0x70]
mov    QWORD PTR [rsp+0x10],rcx
mov    rdi,QWORD PTR [rsp+0x10]
mov    rsi,QWORD PTR [rsp+0x70]
mov    QWORD PTR [rsp+0x8],rax
call   205170 <foo>

foo is a test function that takes the stack frame and the error code as arguments in the rdi and rsi registers. The rsi register is loaded from the instruction pointer field of the stack frame ([rsp+0x70]) instead of the correct [rsp+0x68]. This seems to be a bug in LLVM. I'll try to investigate further.

Interestingly it is correct in --release mode (with optimizations):

mov    rsi,QWORD PTR [rsp+0x58]
lea    rdi,[rsp+0x60]
call   203a70 <foo>

I printed the error code and it looks correct too. Does it work in release mode for you too?
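
To make the off-by-8 in the debug build easier to picture, here is a small simulation of the stack at handler entry. It is only a sketch: the stack is modeled as a plain array, and all names (simulated_stack, read_word, FRAME_OFFSET and so on) as well as the CS, RFLAGS, RSP and SS values are made up for illustration. The one thing taken from the assembly above is that the error code sits 8 bytes below the stack frame.

```rust
/// Byte offset of the interrupt stack frame (its RIP field) from the
/// simulated stack pointer. The error code sits 8 bytes below it.
const FRAME_OFFSET: usize = 8;

/// Simulated stack contents at handler entry, lowest address first:
/// error code, RIP, CS, RFLAGS, RSP, SS.
/// All values except the two parameters are arbitrary placeholders.
fn simulated_stack(error_code: u64, rip: u64) -> [u64; 6] {
    [error_code, rip, 0x08, 0x202, 0xffff_8000_0010_0000, 0x10]
}

/// Reads the 8-byte word at `byte_offset` from the simulated stack pointer.
fn read_word(stack: &[u64; 6], byte_offset: usize) -> u64 {
    stack[byte_offset / 8]
}

/// Mirrors the faulty debug build: the "error code" is loaded from the
/// start of the stack frame, which is the RIP field.
fn buggy_error_code(stack: &[u64; 6]) -> u64 {
    read_word(stack, FRAME_OFFSET)
}

/// Mirrors the correct behavior: the error code is loaded from the word
/// directly below the stack frame.
fn correct_error_code(stack: &[u64; 6]) -> u64 {
    read_word(stack, FRAME_OFFSET - 8)
}
```

For a stack built with simulated_stack(0x2, 0x205170), buggy_error_code returns 0x205170 (the return address) and correct_error_code returns 0x2.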

@Restioson

You know, I did always think that sometimes the error flags looked a little off in flower...

@phil-opp
Owner

phil-opp commented Jan 2, 2019

I took a quick look at the LLVM code, but I fear that I don't know enough about LLVM to be able to solve this. I opened rust-lang/rust#57270 for this.

@tscs37
Author

tscs37 commented Jan 3, 2019

Hi, checking --release mode, the error indeed goes away as far as I can tell, though I need debug mode for other reasons atm :(

@phil-opp
Owner

phil-opp commented Jan 3, 2019

Thanks for checking! I'm currently working on a fix that also works in debug mode.

@Richard-W

Richard-W commented Jan 3, 2019

I am able to reproduce this issue in Release mode:

#[derive(Debug, Clone, Copy)]
#[repr(packed)]
struct StackFrame {
    pub ip: u64,
    pub cs: u64,
    pub flags: u64,
    pub sp: u64,
    pub ss: u64
}

extern "x86-interrupt" fn isr_0x0e(frame: &StackFrame, error_code: u64) {
    panic!("Exception #PF ({:x}): {:#x?}", error_code, frame);
}

Relevant assembly for isr_0x0e:

// 0xc0(%rsp) is the correct address of the stack frame.
// error code is at 0xb8(%rsp)
lea 0xc0(%rsp),%rax
mov %rax,0x10(%rsp)
lea 0xc0(%rsp),%rax
mov %rax,0x18(%rsp)

This still outputs %rip as the error code on a #PF exception, even when compiling with "cargo xbuild --release". I am using the latest nightly build of Rust.

@phil-opp
Owner

phil-opp commented Jan 3, 2019

@Richard-W Thanks! I think I found the issue in LLVM and I'm currently preparing a patch for it.

@phil-opp
Owner

The issue should be fixed on current nightlies.

phil-opp added a commit that referenced this issue Jul 22, 2019
We previously did not use the error code because of #513, which is now fixed.