-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add debugging capabilities for MacOS (Mach-O) #8054
base: master
Are you sure you want to change the base?
Conversation
9a558f0
to
9410dd4
Compare
Thanks a lot for this PR, @esytnik. I have assigned reviewers and we'll run some tests soon. |
Hey, great stuff! Thank you 🙂 Two quick notes:
I'd prefer a common base class. I diffed two files manually, except for naming and copyrights there weren't any diffs. With your PR I've a hard time to spot the actual differences, so I'd actually argue in this state it's harder to review. Also my impression is that on the actual DWARF side there shouldn't be many differences (hopefully), but it's rather on the plumping side around the file format. I tried it locally (darwin-aarch64) via:
and got this:
|
Hi, well, another thing is that DWARF at the moment is located under Elf, so subclassing those classes would make weird hierarchy and to have it done properly we'd need to move DWARF files up a level or two and sideways, which would make the commit more confusing, but if the consensus is to go this way I can definitely do it, perhaps through the auxiliary commit. as for the other problem - I'll double-check if I haven't messed up the PR while rebasing etc in a bit when I have access to Darwin-aarch system but one thing I probably had to note - please use -O0 option as it is recommended everywhere. |
I tried |
will double-check, just FYI I used the following sequence while developing this PR: mx -p substratevm build |
@esytnik Thanks for the PR. I'm on PTO until start of January but I will be happy to review it then.
I think it would be much better to do the relocation of the DWARF code as a separate preparatory commit. @fniephaus do you think we cna get that done before this goes in? |
Who's supposed to work on this again?
Sure, I agree that it'd make sense to do this in multiple steps. |
I'll do it. I suggest I add the third commit to this PR:
subclassing relocated dwarfs will drastically reduce number of new files, basically to one or two classes. |
Perfect, sounds good to you, @adinn? |
* so that linker could add them back to the final image without messing with their | ||
* content. Lets generate corresponding ld options: -sectreate __DWARF | ||
* <debug_section_name> <file> for each dumped debug session. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this works with gdb
, but does it also work with lldb
? As you noted, using gdb
is not an option on darwin-aarch64.
This is not the MachO way imho. Consider this:
$ cat c.c
int main(void) {
return 0;
}
$ clang -g -c c.c
$ objdump --section-headers c.o
c.o: file format mach-o arm64
Sections:
Idx Name Size VMA Type
0 __text 00000014 0000000000000000 TEXT
1 __debug_abbrev 0000003f 0000000000000014 DATA, DEBUG
2 __debug_info 00000053 0000000000000053 DATA, DEBUG
3 __debug_str 00000082 00000000000000a6 DATA, DEBUG
4 __apple_names 0000003c 0000000000000128 DATA, DEBUG
5 __apple_objc 00000024 0000000000000164 DATA, DEBUG
6 __apple_namespac 00000024 0000000000000188 DATA, DEBUG
7 __apple_types 00000047 00000000000001ac DATA, DEBUG
8 __compact_unwind 00000020 00000000000001f8 DATA
9 __debug_line 0000003a 0000000000000218 DATA, DEBUG
$ clang -o c c.o
$ objdump --section-headers c
c: file format mach-o arm64
Sections:
Idx Name Size VMA Type
0 __text 00000014 0000000100003f94 TEXT
1 __unwind_info 00000058 0000000100003fa8 DATA
$ dwarfdump c
c: file format Mach-O arm64
.debug_info contents:
$ dsymutil c
$ rm c.o
rm c.o
$ lldb -- ./c
(lldb) target create "./c"
Current executable set to '/tmp/w/c' (arm64).
(lldb) b main
Breakpoint 1: where = c`main + 12 at c.c:2:5, address = 0x0000000100003fa0
(lldb) r
Process 57563 launched: '/tmp/w/c' (arm64)
Process 57563 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003fa0 c`main at c.c:2:5
1 int main(void) {
-> 2 return 0;
3 }
Target 0: (c) stopped.
That is, I think we should emit all those __debug*
sections as part of the object file that native image emits. The linker will not include that into the final binary, but then we need an additional call to dsymutil
. I prototyped this once but never upstreamed it, feel free to cherry-pick it or get inspired by it 😉 https://gist.github.com/lewurm/6b04cc770a4a36b7d4ebd5c52432287a
It does work in lldb as I mentioned in PR
|
@esytnik, I am not a reviewer, but I am intimately familiar with the Windows CodeView code, having written it. This PR will break that code, as it alters the layout of CodeView records when it changes many calls from |
ohp, well I'll make sure to leave it unchanged by casting those args to (int) and calling CVUtil.putInt there. Thanks for heads-up |
@esytnik Could you please expand more on the need to move from int to long for Range, etc? While it makes sense intuitively, is it a must have? You only mention "we have to work around MacOS ld intrusive behaviour". I am also curious about the need to write out sections to the filesystem for later reassembly. As far as I know, other compilers don't need to do this, so perhaps there is another way? |
this one is quite simple - to construct proper __debug_ranges section we have to put actual 8-byte addresses instead of offsets which are dealt properly by linkers on other OSes because on MacOS we have issues with the linker:
let me quote comments from MachoObjectFile that I've removed from this commit since it deals with this issue:
Initially we saw that we could save .o files and gdb would use them as .dSYM files just fine. But unfortunately gdb doesn't work on AARCH MacOSx and quite silly lldb logic bug (yes, I've debugged lldb sources to figure it out) wouldn't allow to do the same in lldb even issuing "source-file" command. So we had to find the way to put these sections back into the final binary and this is the approach to do so. On a positive side it did remove dependency on having -H:TempDirectory and having .o saved. |
@esytnik thanks for the informative response! For anyone who's interested, here's an ancient article about some OS X design choices: Apple's Lazy DWARF Scheme I'm still confused about the "issues with the linker", and how it necessitates addresses vs offsets. I agree with @lewurm 's comment about a separate dsymutil pass. It is the "macOS" way, although I prefer the "un*x" way myself. |
there seems to be a confusion regarding dsymutil. It is pointless to apply it on the intermediate object - so we do need the final executable with debug info to apply dsymutil on it - and here we have it. One can simply call dsymutil on the executable and separate debug info into .dSYM if needed - although I personally think it quite pointless for debugging because it is just the way to store debug info outside of executable and for release it seems to be more logical to just re-generate image without debug info (and with optimizations). |
@lewurm
|
M1 appears to be working too:
I used
and chose than built native-image with
|
9410dd4
to
15fc4a5
Compare
@lewurm , Hi, I've fixed the issue you've encountered with
We were crashing in the "log" (sic!) (DSYMInfoSectionImpl.java line 404) because we tried to get the name before it was written to the StringTable. I've switched two lines and it passed. It seems that we do not crash there on Linux by sheer luck FYI here is the snippet of lldb
@adinn I've moved DWARF package from Elf to under debuginfo, refactored |
c78b623
to
076100d
Compare
076100d
to
1bd2a61
Compare
Well, what I really meant was to do the package move in a separate preliminary PR and rebase this PR on it. The current change is ok ... |
Thanks for clarifying! So, you suggest to take out the commit about moving classes (the first one), submit as a separate PR and when it is approved (looks like you’re fine with those changes) rebase current one, leaving it with 2 other commits as a result and move on from there? If so I’ll do it first thing first after vacation (January 9th) |
@adinn what's the benefit of doing this in two PRs? Why aren't separate commits not enough? |
... after looking further into the details of the first commit, no it is not ok! @esytnik The package rename also bundles in abstraction of super class |
Hmm, that’s strange, must be packaging issue (when splitting changes). I did try to just move classes without any changes. |
@esytnik Well, another expectation confounded. You have relocated the package tree at Package Package So, these two sets of classes are unrelated and should sit in separate packages under You need to relocate |
@fniephaus Well, mainly because it gets something that ought to be very simple out of the way without the possibility of it being confused with/masking the scope of later changes. The package relocate should be a simple uncontentious move of the existing files to a suitable destination. It turns out to have been contentious. That was not just because the wrong destination was chosen but also because the move got bundled in with other changes i.e. splitting up and modifying content in the relocated files. If you want to avoid multiplying PRs then it would suffice to separate the first commit in the current PR into two commits within the same PR, a pure move and a refactor/edit. However, I think it would make reviewing simpler if we could do it in two PRs. |
@@ -369,7 +369,7 @@ private TypeEntry createTypeEntry(String typeName, String fileName, Path filePat | |||
case INSTANCE: { | |||
FileEntry fileEntry = addFileEntry(fileName, filePath); | |||
typeEntry = new ClassEntry(typeName, fileEntry, size); | |||
if (typeEntry.getTypeName().equals(DwarfDebugInfo.HUB_TYPE_NAME)) { | |||
if (typeEntry.getTypeName().equals(DwarfDebugInfoBase.HUB_TYPE_NAME)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the wrong resolution for the problem here. The constant HUB_TYPE_NAME
should never have been located in subclass DwarfDebugInfo
-- its only use is from the current class DebugInfoBase
. What is needed to fix this issue is to promote this member to DebugInfoBase
and make it private.
@@ -34,6 +34,7 @@ | |||
import java.util.List; | |||
import java.util.Map; | |||
|
|||
import com.oracle.objectfile.debuginfo.dwarf.DwarfDebugInfoBase; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please relocate this import so it sits with the other com.oracle imports in the correct alphabetical order.
@@ -24,13 +24,13 @@ | |||
* questions. | |||
*/ | |||
|
|||
package com.oracle.objectfile.elf.dwarf; | |||
package com.oracle.objectfile.debuginfo.dwarf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please group and order the com.oracle imports
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your have re-ordered this inner class and that of two other classes. You have also re-ordered quite a few of the methods. That might be ok if you had done it consistently, thoroughly and with a proper rationale for the re-organization. However, I notice, for example, that you have not moved class DwarfLocalProperties
to group it with these classes so it looks like there is no such rationale. Indeed, it appears this re-ordering has happened simply because you have used an IDE to move things around when factoring out the base class. Can you please rework the change to minimize the differences between the original class definition and this one. That will not only help this review it will also make it much simpler if we need to backport fixes.
@@ -317,7 +317,7 @@ public boolean hasCompiledEntries() { | |||
return compiledEntryCount() != 0; | |||
} | |||
|
|||
public int compiledEntriesBase() { | |||
public long compiledEntriesBase() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change to the debug info model code is not appropriate. We are nowhere near being in a position where we will have DWARF sections or code (.text
) sections whose length exceeds 32 bits. Likewise for the Microsoft CV records. So, a model that employs int
offsets is perfectly adequate as a model. The fact that a specific back end needs to write out 64 bit absolute addresses is not a valid reason to force a change like this on the model.
If the MACH-O/DWARF back end needs to write DWARF using an 8-bit offset format then we need to change the DWARF generation code so that it can be configured to support 4- or 8-bit offsets and have ELF and MACH-O configure it accordingly.
If the MACH-O/DWARF back end wants abuse the DWARF model and write 8-bit absolute addresses into some of the DWARF record slots that represent offsets then it should achieve that by converting the offsets to absolute addresses in the back end, applying whatever translation it needs at the point of writing.
I'm not yet convinced we need to do that last step anyway which is an even stronger reason not to make this change.
public Section newDebugSection(String name, ElementImpl impl) { | ||
final Segment segment = getOrCreateSegment(null, name, false, false); | ||
public Section newDebugSection(String segmentName, String name, ElementImpl impl) { | ||
final Segment segment = getOrCreateSegment(segmentName, name, false, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please re-order this new method after the redefined version of original so the changes shown in the diff are clearer.
@@ -1289,7 +1297,7 @@ public Element getOffsetBootstrapElement() { | |||
|
|||
private final HashSet<LayoutDecision> allDecisions = new HashSet<>(); | |||
private final Map<Element, LayoutDecisionMap> decisionsByElement = new IdentityHashMap<>(); | |||
private final Map<Element, LayoutDecisionMap> decisionsTaken = new IdentityHashMap<>(); | |||
public final Map<Element, LayoutDecisionMap> decisionsTaken = new IdentityHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field appears only to be needed by ObjectFile
and subclass MachOObjectFile
. So, it would be sensible to make it protected
rather than public
. Even better might be to offer a protected lookup method:
protected Object getLayoutDecisionTakenValue(Element e, LayoutDecision.Kind k) {
return decisionsTaken.get(e).getDecidedValue(k));
}
markRelocationSite(pos, ObjectFile.RelocationKind.DIRECT_8, DwarfSectionName.TEXT_SECTION.value(), l); | ||
pos = writeLong(0, buffer, pos); | ||
markRelocationSite(pos, ObjectFile.RelocationKind.DIRECT_8, dwarfSections.textSectionName().value(), l); | ||
pos = writeLong(dwarfSections.relocatableLong(l), buffer, pos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely clear why you are writing a long value at the current byte position rather than 0. Normally this content is written as zero and gets initialized with start_address(".text") + l
when the DWARF section is linked into the final executable. That happens because the associated debug reloc section identifies this location as requiring an 8-byte symbol+offset relocation.
It looks like for MACH-O you are trying to populate this location with an absolute code address that targets the correct instruction in the text section. Is that correct? Also, are you doing that that because you don't have any way of associating a relocation with this location? or is it because llvm will not apply a relocation correctly during final image generation?
I am asking because it appears that for this to work you are relying on modifying all code offsets that come into the model by adding 2^32 + PAGE_SIZE. That would make sense if the text section starts at 2^32 + PAGE_SIZE (which certainly seems to fit with the name PAGEZERO_SIZE you have chosen for the basic offset).
The problem here is that this trick will only work to ensure that relocatable code offset locations end up with a correct instruction address. It is not going to work to generate correct heap addresses or dwarf section addresses which can also appear in DWARF content and need to be relocated. If you look at the two methods which follow, putRelocatableHeapOffset
and putRelocatableDwarfSectionOffset
they mark a location in the DWARF data as requiring relocation relative to either the symbol defined by HEAP_BEGIN_NAME
or the symbol which identifies the start of a specific DWARF section. How are you proposing to ensure those offsets are generated correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if you really want to transform the 4 byte offsets to 8 byte absolute addresses then you should probably be doing it here instead of in the model code.
@esytnik , I"m curious about the logic bug you describe in lldb. Could you please expand on that? If it's an actual issue, it should be reported upstream, and if the fix is simple perhaps a PR can be submitted. This is something @adinn had to do with a problem in gdb, for example. The reason I want to understand more about this is because this lldb issue appears to be a driver for many of the choices made in the PR; the temp files for debug code, the page 0 offset, and the long address fields in Range, for example. Please correct me if I'm wrong, because I've definitely missed something if so. I have a developer's account at Apple, for example, and would be happy to work with you to open a technical support incident if the fix gets accepted in lldb upstream. |
1bd2a61
to
09dd697
Compare
@adinn, resolved several comments and made first commit clean (just moving code) and to the package you've requested. |
This PR introduces support for debugging native-image applications on x86-64 and M processors on MacOS.
On x86 systems both
gdb
andlldb
can be used and on M systemslldb
is recommended becausegdb
isn't ported there.Several points:
There are three commits -
_"move DWARF package out of linux (ELF) to allow reuse' _ - necessary but minimal refactoring of original DWARF support classes so that they could be subclassed while still having sensible hierarchy.
"make lo/hi ranges long instead of int to allow 8-byte values (as allo…" is a prerequisite because we have to work around MacOS
ld
intrusive behavior ("ld" doesn't produce correct offsets in __debug_info to __debug_abbrev section) when producing final application as well as long is a proper type to hold 8-byte value which is allowed by DWARF specs."add debugInfo for MacOS" - main commit that adds debugging capability to MacOS. It also has minimal changes to original DWARF support classes necessary to actually perform subclassing.
Short description of the method behind the PR:
native-image constructs correct DWARF sections but Mac's linker messes with them during linkage process and either fails to produce image or produces corrupted one. So to work around this behavior we save sections to temp files when building image, flag such sections as "debug" sections forcing linker to skip them and instead issue proper command line to linker to add these sections back to the final output from those temporary files.
This PR brings
gdb
debugging capabilities on MacOS on par with Linux - it allows breakpoints, viewing/navigating sources, step-in, step-out, getting locals etc.lldb
also allows most of these operations.