Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add debugging capabilities for MacOS (Mach-O) #8054

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

esytnik
Copy link
Contributor

@esytnik esytnik commented Dec 19, 2023

This PR introduces support for debugging native-image applications on x86-64 and M processors on MacOS.
On x86 systems both gdb and lldb can be used and on M systems lldb is recommended because gdb isn't ported there.

Several points:

There are three commits -

_"move DWARF package out of linux (ELF) to allow reuse' _ - necessary but minimal refactoring of original DWARF support classes so that they could be subclassed while still having sensible hierarchy.

"make lo/hi ranges long instead of int to allow 8-byte values (as allo…" is a prerequisite because we have to work around MacOS ld intrusive behavior ("ld" doesn't produce correct offsets in __debug_info to __debug_abbrev section) when producing final application as well as long is a proper type to hold 8-byte value which is allowed by DWARF specs.

"add debugInfo for MacOS" - main commit that adds debugging capability to MacOS. It also has minimal changes to original DWARF support classes necessary to actually perform subclassing.

Short description of the method behind the PR:
native-image constructs correct DWARF sections but Mac's linker messes with them during linkage process and either fails to produce image or produces corrupted one. So to work around this behavior we save sections to temp files when building image, flag such sections as "debug" sections forcing linker to skip them and instead issue proper command line to linker to add these sections back to the final output from those temporary files.

This PR brings gdb debugging capabilities on MacOS on par with Linux - it allows breakpoints, viewing/navigating sources, step-in, step-out, getting locals etc. lldb also allows most of these operations.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Dec 19, 2023
@esytnik esytnik force-pushed the bell-sw-machodebuginfo branch from 9a558f0 to 9410dd4 Compare December 19, 2023 12:05
@fniephaus fniephaus requested review from adinn and olpaw December 19, 2023 12:37
@fniephaus
Copy link
Member

Thanks a lot for this PR, @esytnik. I have assigned reviewers and we'll run some tests soon.

@lewurm
Copy link
Member

lewurm commented Dec 19, 2023

Hey, great stuff! Thank you 🙂

Two quick notes:

There is a purposeful attempt to de-couple MacOS implementation from Linux one to allow easier development in case they divert from each other. At the same time there is an effort to leave new files as similar to their "origins" as possible to simplify reviewing process.

I'd prefer a common base class. I diffed two files manually, except for naming and copyrights there weren't any diffs. With your PR I've a hard time to spot the actual differences, so I'd actually argue in this state it's harder to review. Also my impression is that on the actual DWARF side there shouldn't be many differences (hopefully), but it's rather on the plumping side around the file format.

I tried it locally (darwin-aarch64) via:

$ mx build
$ mx helloworld -g

and got this:

java.lang.AssertionError: "java/lang/Object.java" not in string table
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.debugentry.StringTable.debugStringIndex(StringTable.java:92)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.debugentry.DebugInfoBase.debugStringIndex(DebugInfoBase.java:702)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMSectionImpl.debugStringIndex(DSYMSectionImpl.java:839)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMInfoSectionImpl.writeInstanceClassInfo(DSYMInfoSectionImpl.java:403)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMInfoSectionImpl.lambda$writeInstanceClasses$5(DSYMInfoSectionImpl.java:380)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMInfoSectionImpl.writeInstanceClasses(DSYMInfoSectionImpl.java:376)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMInfoSectionImpl.generateContent(DSYMInfoSectionImpl.java:167)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMInfoSectionImpl.createContent(DSYMInfoSectionImpl.java:108)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.dsym.DSYMSectionImpl.getOrDecideSize(DSYMSectionImpl.java:694)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.MachOUserDefinedSection.getOrDecideSize(MachOUserDefinedSection.java:116)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.ObjectFile.bake(ObjectFile.java:1673)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.MachOObjectFile.bake(MachOObjectFile.java:1839)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.ObjectFile.write(ObjectFile.java:1316)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.MachOObjectFile.lambda$write$2(MachOObjectFile.java:1846)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.ObjectFile.withDebugContext(ObjectFile.java:1849)
        at org.graalvm.nativeimage.objectfile/com.oracle.objectfile.macho.MachOObjectFile.write(MachOObjectFile.java:1844)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.image.NativeImage.write(NativeImage.java:167)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.image.NativeImageViaCC.write(NativeImageViaCC.java:97)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.doRun(NativeImageGenerator.java:731)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.run(NativeImageGenerator.java:537)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.buildImage(NativeImageGeneratorRunner.java:526)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.build(NativeImageGeneratorRunner.java:701)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.start(NativeImageGeneratorRunner.java:140)
        at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.main(NativeImageGeneratorRunner.java:95)

JAVA_HOME is a JDK21. What am I missing?

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023

Hi,

well, another thing is that DWARF at the moment is located under Elf, so subclassing those classes would make weird hierarchy and to have it done properly we'd need to move DWARF files up a level or two and sideways, which would make the commit more confusing, but if the consensus is to go this way I can definitely do it, perhaps through the auxiliary commit.

as for the other problem - I'll double-check if I haven't messed up the PR while rebasing etc in a bit when I have access to Darwin-aarch system but one thing I probably had to note - please use -O0 option as it is recommended everywhere.

@fniephaus fniephaus requested a review from lewurm December 19, 2023 14:34
@lewurm
Copy link
Member

lewurm commented Dec 19, 2023

please use -O0 option as it is recommended everywhere.

I tried -Ob, -O0 and -O1. Same error.

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023

please use -O0 option as it is recommended everywhere.

I tried -Ob, -O0 and -O1. Same error.

will double-check,

just FYI I used the following sequence while developing this PR:

mx -p substratevm build
path-to-generated-native-image/native-image -g -O0 Sample

@adinn
Copy link
Collaborator

adinn commented Dec 19, 2023

@esytnik Thanks for the PR. I'm on PTO until start of January but I will be happy to review it then.

well, another thing is that DWARF at the moment is located under Elf, so subclassing those classes would make weird hierarchy and to have it done properly we'd need to move DWARF files up a level or two and sideways, which would make the commit more confusing, but if the consensus is to go this way I can definitely do it, perhaps through the auxiliary commit.

I think it would be much better to do the relocation of the DWARF code as a separate preparatory commit. @fniephaus do you think we cna get that done before this goes in?

@fniephaus
Copy link
Member

I think it would be much better to do the relocation of the DWARF code as a separate preparatory commit.

Who's supposed to work on this again?

do you think we cna get that done before this goes in?

Sure, I agree that it'd make sense to do this in multiple steps.

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023

I think it would be much better to do the relocation of the DWARF code as a separate preparatory commit.

Who's supposed to work on this again?

do you think we cna get that done before this goes in?

Sure, I agree that it'd make sense to do this in multiple steps.

I'll do it. I suggest I add the third commit to this PR:

  1. relocation of DWARF classes
  2. int-to-long
  3. MacOS debug.

subclassing relocated dwarfs will drastically reduce number of new files, basically to one or two classes.

@fniephaus
Copy link
Member

I'll do it. I suggest I add the third commit to this PR

Perfect, sounds good to you, @adinn?

* so that linker could add them back to the final image without messing with their
* content. Lets generate corresponding ld options: -sectreate __DWARF
* <debug_section_name> <file> for each dumped debug session.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this works with gdb, but does it also work with lldb? As you noted, using gdb is not an option on darwin-aarch64.

This is not the MachO way imho. Consider this:

$ cat c.c
int main(void) {
    return 0;
}
$ clang -g -c c.c
$ objdump --section-headers c.o

c.o:    file format mach-o arm64

Sections:
Idx Name             Size     VMA              Type
  0 __text           00000014 0000000000000000 TEXT
  1 __debug_abbrev   0000003f 0000000000000014 DATA, DEBUG
  2 __debug_info     00000053 0000000000000053 DATA, DEBUG
  3 __debug_str      00000082 00000000000000a6 DATA, DEBUG
  4 __apple_names    0000003c 0000000000000128 DATA, DEBUG
  5 __apple_objc     00000024 0000000000000164 DATA, DEBUG
  6 __apple_namespac 00000024 0000000000000188 DATA, DEBUG
  7 __apple_types    00000047 00000000000001ac DATA, DEBUG
  8 __compact_unwind 00000020 00000000000001f8 DATA
  9 __debug_line     0000003a 0000000000000218 DATA, DEBUG
$ clang -o c c.o
$ objdump --section-headers c

c:      file format mach-o arm64

Sections:
Idx Name          Size     VMA              Type
  0 __text        00000014 0000000100003f94 TEXT
  1 __unwind_info 00000058 0000000100003fa8 DATA
$ dwarfdump c
c:      file format Mach-O arm64

.debug_info contents:
$ dsymutil c
$ rm c.o
rm c.o
$ lldb -- ./c
(lldb) target create "./c"
Current executable set to '/tmp/w/c' (arm64).
(lldb) b main
Breakpoint 1: where = c`main + 12 at c.c:2:5, address = 0x0000000100003fa0
(lldb) r
Process 57563 launched: '/tmp/w/c' (arm64)
Process 57563 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003fa0 c`main at c.c:2:5
   1    int main(void) {
-> 2        return 0;
   3    }
Target 0: (c) stopped.

That is, I think we should emit all those __debug* sections as part of the object file that native image emits. The linker will not include that into the final binary, but then we need an additional call to dsymutil. I prototyped this once but never upstreamed it, feel free to cherry-pick it or get inspired by it 😉 https://gist.github.com/lewurm/6b04cc770a4a36b7d4ebd5c52432287a

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023 via email

@stooke
Copy link
Contributor

stooke commented Dec 19, 2023

@esytnik, I am not a reviewer, but I am intimately familiar with the Windows CodeView code, having written it. This PR will break that code, as it alters the layout of CodeView records when it changes many calls from
CVUtil.putInt(proclen, buffer, pos)
to
CVUtil.putLong(proclen, buffer, pos)
While I wish CodeView were more clever, it isn't. It does impose some painful limits on object size.

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023

@esytnik, I am not a reviewer, but I am intimately familiar with the Windows CodeView code, having written it. This PR will break that code, as it alters the layout of CodeView records when it changes many calls from CVUtil.putInt(proclen, buffer, pos) to CVUtil.putLong(proclen, buffer, pos) While I wish CodeView were more clever, it isn't. It does impose some painful limits on object size.

ohp, well I'll make sure to leave it unchanged by casting those args to (int) and calling CVUtil.putInt there. Thanks for heads-up

@stooke
Copy link
Contributor

stooke commented Dec 19, 2023

@esytnik Could you please expand more on the need to move from int to long for Range, etc? While it makes sense intuitively, is it a must have? You only mention "we have to work around MacOS ld intrusive behaviour".

I am also curious about the need to write out sections to the filesystem for later reassembly. As far as I know, other compilers don't need to do this, so perhaps there is another way?

@esytnik
Copy link
Contributor Author

esytnik commented Dec 19, 2023

@esytnik Could you please expand more on the need to move from int to long for Range, etc? While it makes sense intuitively, is it a must have? You only mention "we have to work around MacOS ld intrusive behaviour".

this one is quite simple - to construct proper __debug_ranges section we have to put actual 8-byte addresses instead of offsets which are dealt properly by linkers on other OSes because on MacOS we have issues with the linker:

I am also curious about the need to write out sections to the filesystem for later reassembly. As far as I know, other compilers don't need to do this, so perhaps there is another way?

let me quote comments from MachoObjectFile that I've removed from this commit since it deals with this issue:

  • FIXME: set the DEBUG flag on this section. Unfortunately, this currently breaks
    * debugging: on OS X, the linker intentionally strips debug sections because
    * debuggers are expected to retrieve them from the original object files or from a
    * debug info archive. We should conform to this by creating a debug info archive
    * using dsymutil(1), which would also reduce the size of the linked binary.
    * However, attempts to implement this as in an extra step after linking has failed,
    * which likely means that more other stuff needs to be fixed beforehand.

Initially we saw that we could save .o files and gdb would use them as .dSYM files just fine. But unfortunately gdb doesn't work on AARCH MacOSx and quite silly lldb logic bug (yes, I've debugged lldb sources to figure it out) wouldn't allow to do the same in lldb even issuing "source-file" command. So we had to find the way to put these sections back into the final binary and this is the approach to do so. On a positive side it did remove dependency on having -H:TempDirectory and having .o saved.

@stooke
Copy link
Contributor

stooke commented Dec 19, 2023

@esytnik thanks for the informative response! For anyone who's interested, here's an ancient article about some OS X design choices: Apple's Lazy DWARF Scheme

I'm still confused about the "issues with the linker", and how it necessitates addresses vs offsets.

I agree with @lewurm 's comment about a separate dsymutil pass. It is the "macOS" way, although I prefer the "un*x" way myself.

@esytnik
Copy link
Contributor Author

esytnik commented Dec 20, 2023

@esytnik thanks for the informative response! For anyone who's interested, here's an ancient article about some OS X design choices: Apple's Lazy DWARF Scheme

I'm still confused about the "issues with the linker", and how it necessitates addresses vs offsets.

I agree with @lewurm 's comment about a separate dsymutil pass. It is the "macOS" way, although I prefer the "un*x" way myself.

there seems to be a confusion regarding dsymutil. It is pointless to apply it on the intermediate object - so we do need the final executable with debug info to apply dsymutil on it - and here we have it. One can simply call dsymutil on the executable and separate debug info into .dSYM if needed - although I personally think it quite pointless for debugging because it is just the way to store debug info outside of executable and for release it seems to be more logical to just re-generate image without debug info (and with optimizations).

@esytnik
Copy link
Contributor Author

esytnik commented Dec 20, 2023

@lewurm
Checking aarch MacOS issues. Just FYI example of lldb on x86 MacOS:

------------------------------------------------------------------------------------------------------------------------
                        2.7s (8.7% of total time) in 256 GCs | Peak RSS: 1.55GB | CPU load: 7.68
------------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /Users/bellsoft/esytnik/graal/gdbdemo (executable, debug_info)
 /Users/bellsoft/esytnik/graal/sources (debug_info)
========================================================================================================================
Finished generating 'gdbdemo' in 30.7s.
mac-macmini-x64-2:graal bellsoft$ uname -a
Darwin mac-macmini-x64-2.int.bell-sw.com 18.7.0 Darwin Kernel Version 18.7.0: Tue Jun 22 19:37:08 PDT 2021; root:xnu-4903.278.70~1/RELEASE_X86_64 x86_64
mac-macmini-x64-2:graal bellsoft$ lldb gdbdemo
(lldb) target create "gdbdemo"
Current executable set to 'gdbdemo' (x86_64).
(lldb) br s -r GDBDemo_ma
Breakpoint 1: where = gdbdemo`GDBDemo_main_9afa7deb673c244d5f51d055f2fd156822028b87 + 59, address = 0x000000010000110b
(lldb) run 4
Process 51921 launched: '/Users/bellsoft/esytnik/graal/gdbdemo' (x86_64)
Process 51921 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x000000010000110b gdbdemo`GDBDemo_main_9afa7deb673c244d5f51d055f2fd156822028b87 at GDBDemo.java:6
   3   	
   4   	     public static void main(String[] args) {
   5   	         if (args.length > 0) {
-> 6   	             int n = -1;
   7   	             try {
   8   	                 n = Integer.parseInt(args[0]);
   9   	             } catch (NumberFormatException ex) {
Target 0: (gdbdemo) stopped.
(lldb) n
Process 51921 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x000000010000110c gdbdemo`GDBDemo_main_9afa7deb673c244d5f51d055f2fd156822028b87 at GDBDemo.java:8
   5   	         if (args.length > 0) {
   6   	             int n = -1;
   7   	             try {
-> 8   	                 n = Integer.parseInt(args[0]);
   9   	             } catch (NumberFormatException ex) {
   10  	                 System.out.println(args[0] + " is not a number!");
   11  	             }
Target 0: (gdbdemo) stopped.

@esytnik
Copy link
Contributor Author

esytnik commented Dec 20, 2023

@lewurm

M1 appears to be working too:

bellsoft@mac-m1-arm64-0 graal % sdk/latest_graalvm_home/bin/native-image -O0 -g GDBDemo                                                                                                                                 
========================================================================================================================
GraalVM Native Image: Generating 'gdbdemo' (executable)...
========================================================================================================================
Warning: Using -g is limited and highly experimental on macOS
[1/8] Initializing...                                                                                    (4.9s @ 0.13GB)
 Java version: 21.0.1+12, vendor version: GraalVM CE 21.0.1-dev+12.1
 Graal compiler: optimization level: 0, target machine: armv8-a
 C compiler: cc (apple, arm64, 13.1.6)
 Garbage collector: Serial GC (max heap size: 80% of RAM)
 1 user-specific feature(s):
 - com.oracle.svm.thirdparty.gson.GsonFeature
------------------------------------------------------------------------------------------------------------------------
Build resources:
 - 12.09GB of memory (75.6% of 16.00GB system memory, determined at start)
 - 8 thread(s) (100.0% of 8 available processor(s), determined at start)
[2/8] Performing analysis...  [****]                                                                     (7.1s @ 0.37GB)
    3,329 reachable types   (73.1% of    4,554 total)
    3,935 reachable fields  (44.6% of    8,822 total)
   15,982 reachable methods (45.6% of   35,035 total)
    1,056 types,   208 fields, and   804 methods registered for reflection
       57 types,    56 fields, and    52 methods registered for JNI access
        4 native libraries: -framework Foundation, dl, pthread, z
[3/8] Building universe...                                                                               (0.9s @ 0.51GB)
[4/8] Parsing methods...      [*]                                                                        (0.7s @ 0.57GB)
[5/8] Inlining methods...     [**]                                                                       (0.2s @ 0.51GB)
[6/8] Compiling methods...    [***]                                                                      (6.5s @ 0.51GB)
[7/8] Laying out methods...   [*]                                                                        (0.8s @ 0.70GB)
[8/8] Creating image...       [**]                                                                       (4.8s @ 1.14GB)
   5.48MB (17.77%) for code area:    13,131 compilation units
   8.30MB (26.91%) for image heap:  100,023 objects and 47 resources
  14.74MB (47.81%) for debug info generated in 2.2s
   2.31MB ( 7.51%) for other data
  30.83MB in total
------------------------------------------------------------------------------------------------------------------------
Top 10 origins of code area:                                Top 10 object types in image heap:
   4.02MB java.base                                            1.91MB byte[] for code metadata
   1.07MB svm.jar (Native Image)                               1.31MB byte[] for java.lang.String
 117.22kB java.logging                                       963.00kB java.lang.String
  62.19kB org.graalvm.nativeimage.base                       786.80kB java.lang.Class
  32.36kB jdk.proxy1                                         346.31kB heap alignment
  31.09kB jdk.proxy3                                         286.09kB com.oracle.svm.core.hub.DynamicHubCompanion
  23.60kB org.graalvm.collections                            279.45kB byte[] for general heap data
  20.76kB jdk.internal.vm.ci                                 253.83kB java.util.HashMap$Node
   9.84kB jdk.graal.compiler                                 216.02kB java.lang.Object[]
   7.66kB jdk.proxy2                                         188.50kB java.lang.String[]
   7.28kB for 4 more packages                                  1.83MB for 956 more object types
------------------------------------------------------------------------------------------------------------------------
Recommendations:
 HEAP: Set max heap for improved and more predictable memory usage.
 CPU:  Enable more CPU features with '-march=native' for improved performance.
------------------------------------------------------------------------------------------------------------------------
                       2.8s (10.3% of total time) in 126 GCs | Peak RSS: 1.82GB | CPU load: 5.73
------------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /Users/bellsoft/esytnik/graal/gdbdemo (executable, debug_info)
 /Users/bellsoft/esytnik/graal/sources (debug_info)
========================================================================================================================
Finished generating 'gdbdemo' in 26.5s.

I used

bellsoft@mac-m1-arm64-0 graal % mx --arch aarch64 fetch-jdk        
WARNING: overriding detected architecture (arm64) with aarch64
[1]   labsjdk-ce-17             | ce-17.0.7+4-jvmci-23.1-b02
[2]   labsjdk-ce-17-debug       | ce-17.0.7+4-jvmci-23.1-b02
[3]   labsjdk-ce-17-llvm        | ce-17.0.7+4-jvmci-23.1-b02
[4]   labsjdk-ce-19             | ce-19.0.1+10-jvmci-23.0-b04
[5]   labsjdk-ce-19-debug       | ce-19.0.1+10-jvmci-23.0-b04
[6]   labsjdk-ce-19-llvm        | ce-19.0.1+10-jvmci-23.0-b04
[7]   labsjdk-ce-20             | ce-20.0.1+9-jvmci-23.1-b02
[8]   labsjdk-ce-20-debug       | ce-20.0.1+9-jvmci-23.1-b02
[9]   labsjdk-ce-20-llvm        | ce-20.0.1+9-jvmci-23.1-b02
[10]  labsjdk-ce-21             | ce-21.0.1+12-jvmci-23.1-b26
[11]  labsjdk-ce-21-debug       | ce-21.0.1+12-jvmci-23.1-b26
[12]  labsjdk-ce-21-llvm        | ce-21.0.1+12-jvmci-23.1-b26
[13]  labsjdk-ce-latest         | ce-22+27-jvmci-b01
[14]  labsjdk-ce-latest-debug   | ce-22+27-jvmci-b01
[15]  labsjdk-ce-latest-llvm    | ce-22+27-jvmci-b01
[16]  Other version

and chose 10

than built native-image with

mx -p substratevm build

@esytnik esytnik force-pushed the bell-sw-machodebuginfo branch from 9410dd4 to 15fc4a5 Compare December 20, 2023 12:54
@esytnik
Copy link
Contributor Author

esytnik commented Dec 20, 2023

@lewurm , Hi, I've fixed the issue you've encountered with

mx helloworld -g

We were crashing in the "log" (sic!) (DSYMInfoSectionImpl.java line 404) because we tried to get the name before it was written to the StringTable. I've switched two lines and it passed. It seems that we do not crash there on Linux by sheer luck

FYI here is the snippet of lldb

mac-macmini-x64-2:graal bellsoft$ lldb /Users/bellsoft/esytnik/graal/substratevm/svmbuild/helloworld
(lldb) target create "/Users/bellsoft/esytnik/graal/substratevm/svmbuild/helloworld"
Current executable set to '/Users/bellsoft/esytnik/graal/substratevm/svmbuild/helloworld' (x86_64).
(lldb) br s -r HelloWorld_ma
Breakpoint 1: where = helloworld`__text + 84, address = 0x0000000100001054
(lldb) run 5
Process 56803 launched: '/Users/bellsoft/esytnik/graal/substratevm/svmbuild/helloworld' (x86_64)
Process 56803 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100001054 helloworld`__text at System.java:1156
   1153	            sm.checkPermission(new RuntimePermission("getenv."+name));
   1154	        }
   1155	
-> 1156	        return ProcessEnvironment.getenv(name);
   1157	    }
   1158	
   1159	
Target 0: (helloworld) stopped.

@adinn I've moved DWARF package from Elf to under debuginfo, refactored DwarfDebugInfo class by moving most of its meat to the new DwarfDebugInfoBase class and have subsclasses MachO's DSYMDebugInfo from it. This change allowed to reuse most of dwarf.constants classes. There is still more to come.

@esytnik esytnik force-pushed the bell-sw-machodebuginfo branch 7 times, most recently from c78b623 to 076100d Compare December 26, 2023 13:57
@esytnik esytnik force-pushed the bell-sw-machodebuginfo branch from 076100d to 1bd2a61 Compare December 29, 2023 14:31
@adinn
Copy link
Collaborator

adinn commented Jan 2, 2024

I'll do it. I suggest I add the third commit to this PR
Perfect, sounds good to you, @adinn?

Well, what I really meant was to do the package move in a separate preliminary PR and rebase this PR on it. The current change is ok ...

@esytnik
Copy link
Contributor Author

esytnik commented Jan 2, 2024

I'll do it. I suggest I add the third commit to this PR
Perfect, sounds good to you, @adinn?

Well, what I really meant was to do the package move in a separate preliminary PR and rebase this PR on it. The current change is ok ...

Thanks for clarifying!

So, you suggest to take out the commit about moving classes (the first one), submit as a separate PR and when it is approved (looks like you’re fine with those changes) rebase current one, leaving it with 2 other commits as a result and move on from there? If so I’ll do it first thing first after vacation (January 9th)

@fniephaus
Copy link
Member

Well, what I really meant was to do the package move in a separate preliminary PR and rebase this PR on it.

@adinn what's the benefit of doing this in two PRs? Why aren't separate commits not enough?

@adinn
Copy link
Collaborator

adinn commented Jan 2, 2024

... after looking further into the details of the first commit, no it is not ok!

@esytnik The package rename also bundles in abstraction of super class DwarfDebugInfoBase from class DwarfDebugInfo plus some associated changes. These should be redone as two separate commits. I do not expect to have any comments regarding the package rename but I do have things to say about the other changes.

@esytnik
Copy link
Contributor Author

esytnik commented Jan 2, 2024

... after looking further into the details of the first commit, no it is not ok!

@esytnik The package rename also bundles in abstraction of super class DwarfDebugInfoBase from class DwarfDebugInfo plus some associated changes. These should be redone as two separate commits. I do not expect to have any comments regarding the package rename but I do have things to say about the other changes.

Hmm, that’s strange, must be packaging issue (when splitting changes). I did try to just move classes without any changes.
DwarfDebugInfoBase should have been introduced only in the last commit

@adinn
Copy link
Collaborator

adinn commented Jan 2, 2024

I do not expect to have any comments regarding the package rename . . .

@esytnik Well, another expectation confounded.

You have relocated the package tree at com.oracle.objectfile.elf.dwarf to com.oracle.objectfile.debuginfo.dwarf. That is inappropriate.

Package com.oracle.objectfile.debuginfo contains code that defines the interface between the 'GraalVM Native Image Generator' and the generic 'Debug Info Model' classes in package com.oracle.objectfile.debugentry.

Package com.oracle.objectfile.elf.debuginfo.dwarf contains classes which consume the generic Debug Info Model and generate DWARF section content. It also contains a subpackage com.oracle.objectfile.elf.debuginfo.dwarf.constants that models constant values defined in the DWARF specs.

So, these two sets of classes are unrelated and should sit in separate packages under com.oracle.objectfile

You need to relocate com.oracle.objectfile.elf.dwarf and its subpackage to com.oracle.objectfile.dwarf.

@adinn
Copy link
Collaborator

adinn commented Jan 2, 2024

@adinn what's the benefit of doing this in two PRs? Why aren't separate commits not enough?

@fniephaus Well, mainly because it gets something that ought to be very simple out of the way without the possibility of it being confused with/masking the scope of later changes.

The package relocate should be a simple uncontentious move of the existing files to a suitable destination. It turns out to have been contentious. That was not just because the wrong destination was chosen but also because the move got bundled in with other changes i.e. splitting up and modifying content in the relocated files.

If you want to avoid multiplying PRs then it would suffice to separate the first commit in the current PR into two commits within the same PR, a pure move and a refactor/edit. However, I think it would make reviewing simpler if we could do it in two PRs.

@@ -369,7 +369,7 @@ private TypeEntry createTypeEntry(String typeName, String fileName, Path filePat
case INSTANCE: {
FileEntry fileEntry = addFileEntry(fileName, filePath);
typeEntry = new ClassEntry(typeName, fileEntry, size);
if (typeEntry.getTypeName().equals(DwarfDebugInfo.HUB_TYPE_NAME)) {
if (typeEntry.getTypeName().equals(DwarfDebugInfoBase.HUB_TYPE_NAME)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong resolution for the problem here. The constant HUB_TYPE_NAME should never have been located in subclass DwarfDebugInfo -- its only use is from the current class DebugInfoBase. What is needed to fix this issue is to promote this member to DebugInfoBase and make it private.

@@ -34,6 +34,7 @@
import java.util.List;
import java.util.Map;

import com.oracle.objectfile.debuginfo.dwarf.DwarfDebugInfoBase;
Copy link
Collaborator

@adinn adinn Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please relocate this import so it sits with the other com.oracle imports in the correct alphabetical order.

@@ -24,13 +24,13 @@
* questions.
*/

package com.oracle.objectfile.elf.dwarf;
package com.oracle.objectfile.debuginfo.dwarf;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please group and order the com.oracle imports

}

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your have re-ordered this inner class and that of two other classes. You have also re-ordered quite a few of the methods. That might be ok if you had done it consistently, thoroughly and with a proper rationale for the re-organization. However, I notice, for example, that you have not moved class DwarfLocalProperties to group it with these classes so it looks like there is no such rationale. Indeed, it appears this re-ordering has happened simply because you have used an IDE to move things around when factoring out the base class. Can you please rework the change to minimize the differences between the original class definition and this one. That will not only help this review it will also make it much simpler if we need to backport fixes.

@@ -317,7 +317,7 @@ public boolean hasCompiledEntries() {
return compiledEntryCount() != 0;
}

public int compiledEntriesBase() {
public long compiledEntriesBase() {
Copy link
Collaborator

@adinn adinn Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change to the debug info model code is not appropriate. We are nowhere near being in a position where we will have DWARF sections or code (.text) sections whose length exceeds 32 bits. Likewise for the Microsoft CV records. So, a model that employs int offsets is perfectly adequate as a model. The fact that a specific back end needs to write out 64 bit absolute addresses is not a valid reason to force a change like this on the model.

If the MACH-O/DWARF back end needs to write DWARF using an 8-bit offset format then we need to change the DWARF generation code so that it can be configured to support 4- or 8-bit offsets and have ELF and MACH-O configure it accordingly.

If the MACH-O/DWARF back end wants abuse the DWARF model and write 8-bit absolute addresses into some of the DWARF record slots that represent offsets then it should achieve that by converting the offsets to absolute addresses in the back end, applying whatever translation it needs at the point of writing.

I'm not yet convinced we need to do that last step anyway which is an even stronger reason not to make this change.

public Section newDebugSection(String name, ElementImpl impl) {
final Segment segment = getOrCreateSegment(null, name, false, false);
public Section newDebugSection(String segmentName, String name, ElementImpl impl) {
final Segment segment = getOrCreateSegment(segmentName, name, false, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-order this new method after the redefined version of original so the changes shown in the diff are clearer.

@@ -1289,7 +1297,7 @@ public Element getOffsetBootstrapElement() {

private final HashSet<LayoutDecision> allDecisions = new HashSet<>();
private final Map<Element, LayoutDecisionMap> decisionsByElement = new IdentityHashMap<>();
private final Map<Element, LayoutDecisionMap> decisionsTaken = new IdentityHashMap<>();
public final Map<Element, LayoutDecisionMap> decisionsTaken = new IdentityHashMap<>();
Copy link
Collaborator

@adinn adinn Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field appears only to be needed by ObjectFile and subclass MachOObjectFile. So, it would be sensible to make it protected rather than public. Even better might be to offer a protected lookup method:

protected Object getLayoutDecisionTakenValue(Element e, LayoutDecision.Kind k) {
    return decisionsTaken.get(e).getDecidedValue(k));
}

markRelocationSite(pos, ObjectFile.RelocationKind.DIRECT_8, DwarfSectionName.TEXT_SECTION.value(), l);
pos = writeLong(0, buffer, pos);
markRelocationSite(pos, ObjectFile.RelocationKind.DIRECT_8, dwarfSections.textSectionName().value(), l);
pos = writeLong(dwarfSections.relocatableLong(l), buffer, pos);
Copy link
Collaborator

@adinn adinn Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely clear why you are writing a long value at the current byte position rather than 0. Normally this content is written as zero and gets initialized with start_address(".text") + l when the DWARF section is linked into the final executable. That happens because the associated debug reloc section identifies this location as requiring an 8-byte symbol+offset relocation.

It looks like for MACH-O you are trying to populate this location with an absolute code address that targets the correct instruction in the text section. Is that correct? Also, are you doing that that because you don't have any way of associating a relocation with this location? or is it because llvm will not apply a relocation correctly during final image generation?

I am asking because it appears that for this to work you are relying on modifying all code offsets that come into the model by adding 2^32 + PAGE_SIZE. That would make sense if the text section starts at 2^32 + PAGE_SIZE (which certainly seems to fit with the name PAGEZERO_SIZE you have chosen for the basic offset).

The problem here is that this trick will only work to ensure that relocatable code offset locations end up with a correct instruction address. It is not going to work to generate correct heap addresses or dwarf section addresses which can also appear in DWARF content and need to be relocated. If you look at the two methods which follow, putRelocatableHeapOffset and putRelocatableDwarfSectionOffset they mark a location in the DWARF data as requiring relocation relative to either the symbol defined by HEAP_BEGIN_NAME or the symbol which identifies the start of a specific DWARF section. How are you proposing to ensure those offsets are generated correctly?

Copy link
Collaborator

@adinn adinn Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if you really want to transform the 4 byte offsets to 8 byte absolute addresses then you should probably be doing it here instead of in the model code.

@stooke
Copy link
Contributor

stooke commented Jan 2, 2024

@esytnik , I"m curious about the logic bug you describe in lldb. Could you please expand on that? If it's an actual issue, it should be reported upstream, and if the fix is simple perhaps a PR can be submitted. This is something @adinn had to do with a problem in gdb, for example.

The reason I want to understand more about this is because this lldb issue appears to be a driver for many of the choices made in the PR; the temp files for debug code, the page 0 offset, and the long address fields in Range, for example. Please correct me if I'm wrong, because I've definitely missed something if so.

I have a developer's account at Apple, for example, and would be happy to work with you to open a technical support incident if the fix gets accepted in lldb upstream.

@esytnik esytnik force-pushed the bell-sw-machodebuginfo branch from 1bd2a61 to 09dd697 Compare January 15, 2024 08:58
@esytnik
Copy link
Contributor Author

esytnik commented Jan 15, 2024

@adinn, resolved several comments and made first commit clean (just moving code) and to the package you've requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
Status: To do
Development

Successfully merging this pull request may close these issues.

5 participants