Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JavaHeader address as ObjectReference #179

Merged
merged 1 commit into from
Sep 3, 2024

Conversation

wks
Copy link
Collaborator

@wks wks commented Sep 2, 2024

This PR changes the definition of MMTk-level ObjectReference for the JikesRVM binding so that it now points to the JavaHeader, and is different from the JikesRVM-level ObjectReference (a.k.a. JikesObj). This will guarantee that the MMTk-level ObjectReference is always inside an object.

Note that this PR does not involve a change in mmtk-core. It changes ObjectModel::IN_OBJECT_ADDRESS_OFFSET to 0 so that the "in-object address" is identical to the raw address of ObjectReference. It demonstrates the JikesRVM binding can work well with MMTk-level ObjectReference being different from JikesRVM-level ObjectReference.

Related issues and PRs.

This will guarantee that the MMTk-level ObjectReference is always inside
an object.
@wks wks force-pushed the feature/objref-in-object branch from 36cc6ec to d6565a2 Compare September 3, 2024 08:09
@wks wks marked this pull request as ready for review September 3, 2024 08:11
@wks wks requested a review from qinsoon September 3, 2024 08:11
@wks
Copy link
Collaborator Author

wks commented Sep 3, 2024

Like in #177, I ran lusearch from DaCapo 2006 locall, with RFastAdaptiveSemiSpace, and see no obvious performance difference.

One strange thing is, from the disassembly, API functions which converts JikesObj to ObjectReference now involves an additional check for jikes_obj - JAVA_HEADER_OFFSET == 0.

000cfd60 <post_alloc>:
   cfd60:       55                      push   ebp
   cfd61:       89 e5                   mov    ebp,esp
   cfd63:       53                      push   ebx
   cfd64:       57                      push   edi
   cfd65:       56                      push   esi
   cfd66:       83 ec 0c                sub    esp,0xc
   cfd69:       8b 75 0c                mov    esi,DWORD PTR [ebp+0xc]
   cfd6c:       e8 00 00 00 00          call   cfd71 <post_alloc+0x11>
   cfd71:       5b                      pop    ebx
   cfd72:       81 c3 53 a1 1a 00       add    ebx,0x1aa153
   cfd78:       85 f6                   test   esi,esi ; Check if the `refer: JikesObj` argument is null
   cfd7a:       74 3c                   je     cfdb8 <post_alloc+0x58> ; If null, raise `unwrap_failed`
   cfd7c:       83 c6 f4                add    esi,0xfffffff4 ; Subtract esi by 12 (JAVA_HEADER_OFFSET)
   cfd7f:       74 37                   je     cfdb8 <post_alloc+0x58> ; If null, raise `unwrap_failed`, too.
   cfd81:       8b 4d 08                mov    ecx,DWORD PTR [ebp+0x8]
   cfd84:       8b 45 18                mov    eax,DWORD PTR [ebp+0x18]
   cfd87:       8b b9 44 01 00 00       mov    edi,DWORD PTR [ecx+0x144]
   cfd8d:       0f b6 14 47             movzx  edx,BYTE PTR [edi+eax*2]
   cfd91:       0f b6 44 47 01          movzx  eax,BYTE PTR [edi+eax*2+0x1]
   cfd96:       83 ec 0c                sub    esp,0xc
   cfd99:       50                      push   eax
   cfd9a:       e8 b1 6a ff ff          call   c6850 <_ZN4mmtk4util5alloc10allocators20Allocators$LT$VM$GT$17get_allocator_mut17hcdd4715f4b2ac9d8E>
; ...
   cfdb8:       83 ec 04                sub    esp,0x4
   cfdbb:       8d 83 1c c4 fe ff       lea    eax,[ebx-0x13be4]
   cfdc1:       8d b3 d4 c2 fe ff       lea    esi,[ebx-0x13d2c]
   cfdc7:       8d 7d f3                lea    edi,[ebp-0xd]
   cfdca:       8d 8b 49 88 f9 ff       lea    ecx,[ebx-0x677b7]
   cfdd0:       ba 2b 00 00 00          mov    edx,0x2b
   cfdd5:       50                      push   eax
   cfdd6:       56                      push   esi
   cfdd7:       57                      push   edi
   cfdd8:       e8 43 24 f5 ff          call   22220 <_ZN4core6result13unwrap_failed17h5db3a594ed51825fE>
   cfddd:       66 90                   xchg   ax,ax
   cfddf:       90                      nop

The TryFrom trait is implemented as following:

impl TryFrom<JikesObj> for ObjectReference {
    type Error = NullRefError;

    fn try_from(value: JikesObj) -> Result<Self, Self::Error> {
        if value.is_null() {
            Err(NullRefError)
        } else {
            let objref_addr = value.0.offset(JAVA_HEADER_OFFSET);
            debug_assert!(!objref_addr.is_zero());
            let result = unsafe { ObjectReference::from_raw_address_unchecked(objref_addr) };
            Ok(result)
        }
    }
}

The only condition that should result in unwrap failure should be if value.is_null(). That branch returns Err(NullRefError) and result in an unwrap check in post_alloc. That's expected. However, if value is not null, unsafe { ObjectReference::from_raw_address_unchecked(objref_addr) } should not involve any null check, and Ok(result) should make the unwrap() successful. So I can't explain why the Rust compiler generates a second je instruction after add esi, 0xfffffff4.

Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wks wks merged commit 5ecfd57 into mmtk:master Sep 3, 2024
3 checks passed
@wks
Copy link
Collaborator Author

wks commented Sep 3, 2024

The MIR (Rust's mid-level IR) for those functions still contains only one unwrap() operation, but the generated LLVM IR contains two zero checks, one for the jikes_obj and the other for jikes_obj + (-12). I think some compilation steps from the MIR to the LLVM IR yielded two unwrap() operations.

Anyway, this is a problem we can dig into later if this ever becomes a performance bottleneck. I merged this PR.

@qinsoon
Copy link
Member

qinsoon commented Sep 3, 2024

The second one is probably from ObjectReference::from_raw_address_unchecked(addr). NonZeroUsize cannot be zero, otherwise it is undefined behavior. Actually NonZeroUsize::new_unchecked() has a check at the source code level. I guess the compiler has some freedom to generate code for the None case to make sure it is not reached. So it just uses the same error branch as result.unwrap().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants