Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in ObjWriter on ARM during publish #1388

Closed
kant2002 opened this issue Aug 6, 2021 · 26 comments · Fixed by dotnet/runtime#57257
Closed

Crash in ObjWriter on ARM during publish #1388

kant2002 opened this issue Aug 6, 2021 · 26 comments · Fixed by dotnet/runtime#57257

Comments

@kant2002
Copy link
Contributor

kant2002 commented Aug 6, 2021

After #1387 during publish ILC crashed with following stack trace.

Thread 1 "ilc" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x76c36510 (LWP 26921)]
0x268d53f8 in ?? ()
(gdb) backtrace
#0  0x268d53f8 in ?? ()
#1  0x460addb4 in llvm::MCExpr::evaluateAsAbsolute(long long&, llvm::MCAssembler const&) const () from /home/pi/runtimelab/samples/HelloWorld/pkg/runtime.linux-arm.microsoft.dotnet.ilcompiler/6.0.0-dev/tools/libobjwriter.so
#2  0x453a3e40 in ObjectWriter::EmitSymbolRef(char const*, RelocType, int) () from /home/pi/runtimelab/samples/HelloWorld/pkg/runtime.linux-arm.microsoft.dotnet.ilcompiler/6.0.0-dev/tools/libobjwriter.so
#3  0x453a36b0 in EmitSymbolRef () from /home/pi/runtimelab/samples/HelloWorld/pkg/runtime.linux-arm.microsoft.dotnet.ilcompiler/6.0.0-dev/tools/libobjwriter.so
#4  0x52a19706 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I try to look at locals, but without luck

(gdb) frame 2
#2  0x453a3e40 in ObjectWriter::EmitSymbolRef(char const*, RelocType, int) () from /home/pi/runtimelab/samples/HelloWorld/pkg/runtime.linux-arm.microsoft.dotnet.ilcompiler/6.0.0-dev/tools/libobjwriter.so
(gdb) info locals
No symbol table info available.

Even if I build LLVM and ObjWriter on this Raspberry using ./build.sh nativeaot.objwriter -rc Debug -lc Release. It will take a bit of time to figure out what's going on.

@kant2002
Copy link
Contributor Author

When I try run ILC on Windows, On the ObjWriter stage probably I receive LLVM ERROR: unsupported relocation on symbol

That's seems to be from one of these places
https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/lib/Target/ARM/MCTargetDesc/ARMELFObjectWriter.cpp#L90
or
https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/lib/Target/ARM/MCTargetDesc/ARMELFObjectWriter.cpp#L162

I understand that this maybe invalid case and @MichalStrehovsky advice to run on Windows does not applied.

@MichalStrehovsky
Copy link
Member

Which reloc is that? (It should be visible which data blob the reloc is coming from on the managed side, in ObjectWriter.cs).

Some work was done to support the various ARM32 ELF relocs, but maybe we regressed somewhere (e.g. dotnet/corert#4899).

@kant2002
Copy link
Contributor Author

I actually not sure where this happens. since this is not a crash, just "unsupported" scenario which exit ILC. I will try to find what reloc is that. Also thanks for the hint, I will take a look and try to understand what can go wrong.

@kant2002
Copy link
Contributor Author

Okay. This is some sort of crash. At least application exits at call to FinishObjWriter, and since I run it on Windows it's from Nuget package, which miss PDB. If I compile ObjWriter.dll on Windows, can I expect that it would work for ARM ELF generation?

@kant2002
Copy link
Contributor Author

Error happens when MCFixup.getKind() == FK_Data_8 (3), IsPCRel == false at this line
https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/lib/Target/ARM/MCTargetDesc/ARMELFObjectWriter.cpp#L162

Stack trace.

objwriter.dll!`anonymous namespace'::ARMELFObjectWriter::GetRelocTypeInner(const llvm::MCValue & Target, const llvm::MCFixup & Fixup, bool IsPCRel, llvm::MCContext & Ctx) Line 167
	at runtimelab\artifacts\llvm\lib\Target\ARM\MCTargetDesc\ARMELFObjectWriter.cpp(167)
objwriter.dll!`anonymous namespace'::ARMELFObjectWriter::getRelocType(llvm::MCContext & Ctx, const llvm::MCValue & Target, const llvm::MCFixup & Fixup, bool IsPCRel) Line 73
	at runtimelab\artifacts\llvm\lib\Target\ARM\MCTargetDesc\ARMELFObjectWriter.cpp(73)
[Inline Frame] objwriter.dll!`anonymous-namespace'::ELFObjectWriter::getRelocType(llvm::MCContext &) Line 155
	at runtimelab\artifacts\llvm\lib\MC\ELFObjectWriter.cpp(155)
objwriter.dll!`anonymous namespace'::ELFObjectWriter::recordRelocation(llvm::MCAssembler & Asm, const llvm::MCAsmLayout & Layout, const llvm::MCFragment * Fragment, const llvm::MCFixup & Fixup, llvm::MCValue Target, unsigned __int64 & FixedValue) Line 694
	at runtimelab\artifacts\llvm\lib\MC\ELFObjectWriter.cpp(694)
objwriter.dll!llvm::MCAssembler::handleFixup(const llvm::MCAsmLayout & Layout, llvm::MCFragment & F, const llvm::MCFixup & Fixup) Line 661
	at runtimelab\artifacts\llvm\lib\MC\MCAssembler.cpp(661)
objwriter.dll!llvm::MCAssembler::layout(llvm::MCAsmLayout & Layout) Line 743
	at runtimelab\artifacts\llvm\lib\MC\MCAssembler.cpp(743)
objwriter.dll!llvm::MCAssembler::Finish() Line 755
	at runtimelab\artifacts\llvm\lib\MC\MCAssembler.cpp(755)
[Inline Frame] objwriter.dll!ObjectWriter::Finish() Line 188
	at runtimelab\src\coreclr\tools\aot\ObjWriter\objwriter.cpp(188)
objwriter.dll!FinishObjWriter(ObjectWriter * OW) Line 208
	at runtimelab\src\coreclr\tools\aot\ObjWriter\objwriter.h(208)
[Managed to Native Transition]
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.Dispose(bool bDisposing) Line 891
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(891)
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.Dispose() Line 881
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(881)
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.EmitObject(string objectFilePath, System.Collections.Generic.IEnumerable<ILCompiler.DependencyAnalysisFramework.DependencyNode> nodes, ILCompiler.DependencyAnalysis.NodeFactory factory, ILCompiler.DependencyAnalysis.ObjectWritingOptions options, ILCompiler.DependencyAnalysis.IObjectDumper dumper) Line 1157
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(1157)
ILCompiler.RyuJit.dll!ILCompiler.RyuJitCompilation.CompileInternal(string outputFile, ILCompiler.ObjectDumper dumper) Line 75
	at runtimelab\src\coreclr\tools\aot\ILCompiler.RyuJit\Compiler\RyuJitCompilation.cs(75)
ILCompiler.Compiler.dll!ILCompiler.Compilation.ILCompiler.ICompilation.Compile(string outputFile, ILCompiler.ObjectDumper dumper) Line 503
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\Compilation.cs(503)
ilc.dll!ILCompiler.Program.Run(string[] args) Line 757
	at runtimelab\src\coreclr\tools\aot\ILCompiler\Program.cs(757)
ilc.dll!ILCompiler.Program.Main(string[] args) Line 928
	at runtimelab\src\coreclr\tools\aot\ILCompiler\Program.cs(928)

So far, I will try to get reloc from that stack trace. But not sure what I should look for and where to continue search for now. Will continue, but gladly ask for any help or advice.

@kant2002
Copy link
Contributor Author

From what I understand.
The issue happens in first Section (index 0) and 3rd Fragment of that section (index 2) on first Fixup (index 0).
Section 0 is "__managed_code". Fixup has Offset == 0 and seems to be point to Loc == NULL.
And now I'm stuck. My next steps would be understanding who specifically wrote that Section, Fragment and who detect or write Fixups.

libObjwriter seems to be producing optimized code, so debugging of iterators a bit problematic. Is it possible to build it in Debug mode. I was using build.cmd nativeaot.objwriter -rc Debug -lc Release.

@MichalStrehovsky
Copy link
Member

ILCompiler.csproj is hardcoded to pick up the Release version of ObjWriter. Look for ObjWriterArtifactPath - this needs to be set to Debug to pick up your debug build of it. I think it's also hardcoded to release in objwriter.proj - you need to set that to Debug as well to build the Debug version in the first place.

All the relocs you see should go through ObjectWriter::EmitSymbolRef (src\coreclr\tools\aot\ObjWriter\objwriter.cpp) so maybe there's something wrong in how the reloc is translated to LLVM terms there. I don't see anything obvious that would be producing 8 byte non-PC-relative relocs on ARM there, unless we're somehow ending up with a IMAGE_REL_BASED_DIR64 reloc (might want to set breakpoint on that just to be sure).

@kant2002
Copy link
Contributor Author

Debug version was fruitful.
I hit assertion for Symbol _ZTV6Object at
https://github.com/llvm/llvm-project/blob/bbea64250f65480d787e1c5ff45c4de3ec2dcda8/llvm/include/llvm/MC/MCFixup.h#L111
where Size == 0
Which is passed from ObjWriter at following location

const MCExpr *TargetExpr = GenTargetExpr(SymbolName, Kind, Delta, IsPCRel, Size);

From my understanding this is caused that C# pass RelocType = IMAGE_REL_BASED_THUMB_MOV32_PCREL which is not present in C++ side and thus not handled.

@kant2002
Copy link
Contributor Author

So far I look at the NativeAOT branch to check where IMAGE_REL_BASED_THUMB_MOV32 was processed and where processing of IMAGE_REL_BASED_REL_THUMB_MOV32_REL was missing. I understand that both should not go together, but in some places they have identical processing so I decide to at least record these sessions. Some places seems a bit fishy to me.

Loc 1

const NativeImageDumper::EnumMnemonics s_RelocType[] =
{
#define REL_ENTRY(x) NativeImageDumper::EnumMnemonics( x, 0, W(#x))
REL_ENTRY(IMAGE_REL_BASED_ABSOLUTE),
REL_ENTRY(IMAGE_REL_BASED_HIGHLOW),
REL_ENTRY(IMAGE_REL_BASED_DIR64),
REL_ENTRY(IMAGE_REL_BASED_THUMB_MOV32),
#undef REL_ENTRY

If this is Windows only that make sense, or at least maybe this is Win ARM only.

Loc 2

{
case RelocType.IMAGE_REL_BASED_THUMB_MOV32:
case RelocType.IMAGE_REL_BASED_THUMB_BRANCH24:
case RelocType.IMAGE_REL_BASED_ARM64_BRANCH26:
case RelocType.IMAGE_REL_BASED_ARM64_PAGEBASE_REL21:
case RelocType.IMAGE_REL_BASED_ARM64_PAGEOFFSET_12A:
case RelocType.IMAGE_REL_BASED_ARM64_PAGEOFFSET_12L:
unsafe

Not sure that this location may benefit from adding one more case.

Loc 3

case RelocType::IMAGE_REL_BASED_THUMB_MOV32: {

I think I would like add one case here, but maybe I should copy something like this
case RelocType::IMAGE_REL_BASED_RELPTR32:
Size = 4;
IsPCRel = true;
Delta += 4;
break;

Loc 4


First impression is to add one more case here, but with my limited knowledge it can be and case above:
case RelocType.IMAGE_REL_SYMBOL_SIZE:
EmitInt(delta);

Loc 5

case RelocType.IMAGE_REL_BASED_THUMB_MOV32:

This is most likely proper handling, but include this for completeness.

Loc 6

case IMAGE_REL_BASED_THUMB_MOV32:

If I understand properly this is for Win AMR32 and I should ignore this.

This seems to be location which trigger creation of reloc which crash ObjWriter.

if (emitComp->opts.jitFlags->IsSet(JitFlags::JIT_FLAG_RELATIVE_CODE_RELOCS))
{
emitRecordRelocation(location, target, IMAGE_REL_BASED_REL_THUMB_MOV32_PCREL);
}
else
{
emitRecordRelocation(location, target, IMAGE_REL_BASED_THUMB_MOV32);
}

@MichalStrehovsky
Copy link
Member

I think the relocation is generated by RyuJIT in "Loc 6" you identified above because we set the JIT_FLAG_RELATIVE_CODE_RELOCS here:

if (targetArchitecture == TargetArchitecture.ARM && !_compilation.TypeSystemContext.Target.IsWindows)
flags.Set(CorJitFlag.CORJIT_FLAG_RELATIVE_CODE_RELOCS);

This was added when Samsung was fixing crossgen2 for Linux ARM32 here: dotnet/runtime#33153

I'm not sure we need it for NativeAOT. Samsung was running CoreRT on ARM32 in the past. So it might be worth trying to put those two lines under #if READYTORUN.

The other option (if ifdeffing causes issues at runtime) is to pipe the new relocation through to the object writer. It will go somewhere along the lines of dotnet/corert#3433.

@jkotas
Copy link
Member

jkotas commented Aug 12, 2021

CORJIT_FLAG_RELATIVE_CODE_RELOCS should not be set for NativeAOT ARM32.

@kant2002
Copy link
Contributor Author

Now I have another problem and code fails with assertion here.

Expression: I != M+Size && I->FromReg == RegNum && "Invalid RegNum"

This from here
https://github.com/llvm/llvm-project/blame/2c99246624b1a6f6673a613fe5a5548e5f2b7a87/llvm/lib/MC/MCRegisterInfo.cpp#L89
But unfortunately this code is not the latest now with changes introduced in llvm/llvm-project@aaff1a6
Should I try apply this commit locally, and if that works modify llvm.patch in objwriter?

@MichalStrehovsky
Copy link
Member

I wouldn't consider it an LLVM bug just yet.

What's the stack for the assert? Is it generating unwind info or debug info (or something else?)?

@kant2002
Copy link
Contributor Author

Here the stack trace. Reason why I thinking that is LLVM is RegNum == 81 and isEH == true.

But when looking at EmitObject DependencyNode which is processing during crash is method System.Reflection.Runtime.General.ParseConstantNumericValue.NativeFormatMetadataReaderExtensions you may be right and I'm quick to conclusion.

Stack trace

objwriter.dll!llvm::MCRegisterInfo::getLLVMRegNum(unsigned int RegNum, bool isEH) Line 87
	at runtimelab\artifacts\llvm\lib\MC\MCRegisterInfo.cpp(87)
objwriter.dll!ObjectWriter::EmitARMExIdxPerOffset() Line 1069
	at runtimelab\src\coreclr\tools\aot\ObjWriter\objwriter.cpp(1069)
objwriter.dll!ObjectWriter::EmitARMExIdxCode(int Offset, const char * Blob) Line 1103
	at runtimelab\src\coreclr\tools\aot\ObjWriter\objwriter.cpp(1103)
objwriter.dll!EmitARMExIdxCode(ObjectWriter * OW, int Offset, const char * Blob) Line 393
	at runtimelab\src\coreclr\tools\aot\ObjWriter\objwriter.h(393)
[Managed to Native Transition]
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.EmitARMExIdxCode(int nativeOffset, byte[] blob) Line 300
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(300)
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.EmitCFICodes(int offset) Line 723
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(723)
ILCompiler.Compiler.dll!ILCompiler.DependencyAnalysis.ObjectWriter.EmitObject(string objectFilePath, System.Collections.Generic.IEnumerable<ILCompiler.DependencyAnalysisFramework.DependencyNode> nodes, ILCompiler.DependencyAnalysis.NodeFactory factory, ILCompiler.DependencyAnalysis.ObjectWritingOptions options, ILCompiler.DependencyAnalysis.IObjectDumper dumper) Line 1054
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs(1054)
ILCompiler.RyuJit.dll!ILCompiler.RyuJitCompilation.CompileInternal(string outputFile, ILCompiler.ObjectDumper dumper) Line 75
	at runtimelab\src\coreclr\tools\aot\ILCompiler.RyuJit\Compiler\RyuJitCompilation.cs(75)
ILCompiler.Compiler.dll!ILCompiler.Compilation.ILCompiler.ICompilation.Compile(string outputFile, ILCompiler.ObjectDumper dumper) Line 503
	at runtimelab\src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\Compilation.cs(503)
ilc.dll!ILCompiler.Program.Run(string[] args) Line 757
	at runtimelab\src\coreclr\tools\aot\ILCompiler\Program.cs(757)
ilc.dll!ILCompiler.Program.Main(string[] args) Line 928
	at runtimelab\src\coreclr\tools\aot\ILCompiler\Program.cs(928)

@MichalStrehovsky
Copy link
Member

The RegNum feels pretty high, but I don't actually know.

The blob objectwriter is interpreting here is generated by RyuJIT - somewhere in unwindarm.cpp.

@kant2002
Copy link
Contributor Author

The blob objectwriter is interpreting here is generated by RyuJIT - somewhere in unwindarm.cpp.

Unwind and ARM now I feel scary. Will take a look.
Also can you re-open. I do not think that runtime can close this issue.

@MichalStrehovsky
Copy link
Member

If you add: --codegenopt:NGenDump=NativeFormatMetadataReaderExtensions to the ILC command line (and use Debug RyuJIT), it will print out a bunch of info about the method it compiled, including unwinding info.

I would step through EmitARMExIdxCode comparing with what RyuJIT wrote in the debug output to see if anything is amyss.

@MichalStrehovsky
Copy link
Member

Once you're ready to debug RyuJIT, don't forget to pass --singlethreaded to ILC to make your life much easier.

@kant2002
Copy link
Contributor Author

So insisting on closing the issue. Now I know that somebody has a lot of permissions.

@kant2002
Copy link
Contributor Author

Here my observations
LLVM use following mapping table to map ARM register to DWARF registers

// ARM Dwarf<->LLVM register mappings.
extern const MCRegisterInfo::DwarfLLVMRegPair ARMDwarfFlavour0Dwarf2L[] = {
  { 0U, ARM::R0 },
  { 1U, ARM::R1 },
  { 2U, ARM::R2 },
  { 3U, ARM::R3 },
  { 4U, ARM::R4 },
  { 5U, ARM::R5 },
  { 6U, ARM::R6 },
  { 7U, ARM::R7 },
  { 8U, ARM::R8 },
  { 9U, ARM::R9 },
  { 10U, ARM::R10 },
  { 11U, ARM::R11 },
  { 12U, ARM::R12 },
  { 13U, ARM::SP },
  { 14U, ARM::LR },
  { 15U, ARM::PC },
  { 256U, ARM::D0 },
  { 257U, ARM::D1 },
  { 258U, ARM::D2 },
  { 259U, ARM::D3 },
  { 260U, ARM::D4 },
  { 261U, ARM::D5 },
  { 262U, ARM::D6 },
  { 263U, ARM::D7 },
  { 264U, ARM::D8 },
  { 265U, ARM::D9 },
  { 266U, ARM::D10 },
  { 267U, ARM::D11 },
  { 268U, ARM::D12 },
  { 269U, ARM::D13 },
  { 270U, ARM::D14 },
  { 271U, ARM::D15 },
  { 272U, ARM::D16 },
  { 273U, ARM::D17 },
  { 274U, ARM::D18 },
  { 275U, ARM::D19 },
  { 276U, ARM::D20 },
  { 277U, ARM::D21 },
  { 278U, ARM::D22 },
  { 279U, ARM::D23 },
  { 280U, ARM::D24 },
  { 281U, ARM::D25 },
  { 282U, ARM::D26 },
  { 283U, ARM::D27 },
  { 284U, ARM::D28 },
  { 285U, ARM::D29 },
  { 286U, ARM::D30 },
  { 287U, ARM::D31 },
};

But in ObjWriter there +64 shift instead of +256

case Triple::arm: // fall through
case Triple::armeb: // fall through
case Triple::thumb: // fall through
case Triple::thumbeb:
switch (static_cast<RegNumArm>(RegNum)) {
case RegNumArm::REGNUM_R0: return 0;
case RegNumArm::REGNUM_R1: return 1;
case RegNumArm::REGNUM_R2: return 2;
case RegNumArm::REGNUM_R3: return 3;
case RegNumArm::REGNUM_R4: return 4;
case RegNumArm::REGNUM_R5: return 5;
case RegNumArm::REGNUM_R6: return 6;
case RegNumArm::REGNUM_R7: return 7;
case RegNumArm::REGNUM_R8: return 8;
case RegNumArm::REGNUM_R9: return 9;
case RegNumArm::REGNUM_R10: return 10;
case RegNumArm::REGNUM_R11: return 11;
case RegNumArm::REGNUM_R12: return 12;
case RegNumArm::REGNUM_SP: return 13;
case RegNumArm::REGNUM_LR: return 14;
case RegNumArm::REGNUM_PC: return 15;
// fp registers
default:
return RegNum - static_cast<int>(RegNumArm::REGNUM_COUNT) + 64;

ARM docs say that 256 corresponding to D0-D31 and S0-S31 is obsolete now.
https://developer.arm.com/documentation/ihi0040/c/?lang=en#dwarf-register-names

I remember that I saw somewhere one more mapping registered to DWarf numbers and I saw 81 there and it was corresponding to registeres, but cannot find right now.

Speculations
So from outside seems that ARM32 support stale at some point and did not catch up to latest LLVM movements, or something else. In the source code there a lot of reference to S0-S3 registers, and seems to be newer register do not accounted for (not sure yet). Maybe right now there no ARM32, but rather ARM64/32 which use 64bit floating point registers even in 32-bit mode (?).

@kant2002
Copy link
Contributor Author

Okay. I find that magic place. it was unwindarm.cpp.

case REG_F17:
dwarfReg = 81;
break;

And I still see mixed signals. I notice that there F0-F31 registeres for ARM32, at the same time in the docs which I mention previously it only form F0-F7 so where F8-F31 comes from. Based on DWARF reg mapping this is S0-S31 registers. But LLVM want new registers D0-D31 from VFP-fp3, and in order to make it works, I have to check what's emitted and switch from emit for Sx registers to Dx registers . Or look on possibility to tweak LLVM to accept these registers in DWARF because seems to be codegen is working and only DWARF gen is breaks.

@kant2002
Copy link
Contributor Author

And regarding DWARF numbers. I see that LLVM produce only registers 0-15 and 256+ for ARM

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMRegisterInfo.td#L79

See that for ARM64 they produce and registers in range 64-XXX

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64RegisterInfo.td#L260

I suspect small patch to LLVM can fix that, if I annotate ARMRegisterInfo.td with DwarfRegNum for 64+ registers. Not sure if this would fly, or not.

@jkotas
Copy link
Member

jkotas commented Aug 14, 2021

RyuJIT should be only generating push/pop of the D0-D31 registers. Look for "Our calling convention requires that we only use vpush for TYP_DOUBLE registers" in CodeGen::genPushFltRegs. I do not think you need to be modifying LLVM for this. I think you just need to fix Compiler::unwindPushMaskFloat or something near it to emit the right unwind codes based on D0-D31 registers directly instead of breaking it down to S0-S31 registers.

@kant2002
Copy link
Contributor Author

Not sure if I do in proper direction. What I done:

  1. I emit CFI only for even FP registers, i.e S0, S2, S4, etc.
  2. I change DWARF reg numbers to start with 256 for FP registers.
diff --git a/src/coreclr/jit/unwind.cpp b/src/coreclr/jit/unwind.cpp
index 14dc49f50fa..2064819d29c 100644
--- a/src/coreclr/jit/unwind.cpp
+++ b/src/coreclr/jit/unwind.cpp
@@ -187,7 +187,12 @@ void Compiler::unwindPushPopMaskCFI(regMaskTP regMask, bool isFloat)
     regMaskTP regBit = isFloat ? genRegMask(REG_FP_FIRST) : 1;
 
     for (regNumber regNum = isFloat ? REG_FP_FIRST : REG_FIRST; regNum < REG_COUNT;
-         regNum           = REG_NEXT(regNum), regBit <<= 1)
+#if TARGET_ARM
+         regNum = isFloat ? ((regNumber)((unsigned)(regNum) + 2)) : REG_NEXT(regNum),
+#else
+         regNum = REG_NEXT(regNum),
+#endif
+         regBit <<= 1)
     {
         if (regBit > regMask)
         {
diff --git a/src/coreclr/jit/unwindarm.cpp b/src/coreclr/jit/unwindarm.cpp
index e26d6e008f0..da4219b1d53 100644
--- a/src/coreclr/jit/unwindarm.cpp
+++ b/src/coreclr/jit/unwindarm.cpp
@@ -71,100 +71,52 @@ short Compiler::mapRegNumToDwarfReg(regNumber reg)
             dwarfReg = 15;
             break;
         case REG_F0:
-            dwarfReg = 64;
-            break;
-        case REG_F1:
-            dwarfReg = 65;
+            dwarfReg = 256;
             break;
         case REG_F2:
-            dwarfReg = 66;
-            break;
-        case REG_F3:
-            dwarfReg = 67;
+            dwarfReg = 257;
             break;
         case REG_F4:
-            dwarfReg = 68;
-            break;
-        case REG_F5:
-            dwarfReg = 69;
+            dwarfReg = 258;
             break;
         case REG_F6:
-            dwarfReg = 70;
-            break;
-        case REG_F7:
-            dwarfReg = 71;
+            dwarfReg = 259;
             break;
         case REG_F8:
-            dwarfReg = 72;
-            break;
-        case REG_F9:
-            dwarfReg = 73;
+            dwarfReg = 260;
             break;
         case REG_F10:
-            dwarfReg = 74;
-            break;
-        case REG_F11:
-            dwarfReg = 75;
+            dwarfReg = 261;
             break;
         case REG_F12:
-            dwarfReg = 76;
-            break;
-        case REG_F13:
-            dwarfReg = 77;
+            dwarfReg = 262;
             break;
         case REG_F14:
-            dwarfReg = 78;
-            break;
-        case REG_F15:
-            dwarfReg = 79;
+            dwarfReg = 263;
             break;
         case REG_F16:
-            dwarfReg = 80;
-            break;
-        case REG_F17:
-            dwarfReg = 81;
+            dwarfReg = 264;
             break;
         case REG_F18:
-            dwarfReg = 82;
-            break;
-        case REG_F19:
-            dwarfReg = 83;
+            dwarfReg = 265;
             break;
         case REG_F20:
-            dwarfReg = 84;
-            break;
-        case REG_F21:
-            dwarfReg = 85;
+            dwarfReg = 266;
             break;
         case REG_F22:
-            dwarfReg = 86;
-            break;
-        case REG_F23:
-            dwarfReg = 87;
+            dwarfReg = 267;
             break;
         case REG_F24:
-            dwarfReg = 88;
-            break;
-        case REG_F25:
-            dwarfReg = 89;
+            dwarfReg = 268;
             break;
         case REG_F26:
-            dwarfReg = 90;
-            break;
-        case REG_F27:
-            dwarfReg = 91;
+            dwarfReg = 269;
             break;
         case REG_F28:
-            dwarfReg = 92;
-            break;
-        case REG_F29:
-            dwarfReg = 93;
+            dwarfReg = 270;
             break;
         case REG_F30:
-            dwarfReg = 94;
-            break;
-        case REG_F31:
-            dwarfReg = 95;
+            dwarfReg = 271;
             break;
         default:
             noway_assert(!"unexpected REG_NUM");

This produce error in

case ICorDebugInfo::VLT_FPSTK:
case ICorDebugInfo::VLT_FIXED_VA:
assert(false && "Unsupported varloc type!");

I look around and notice that there mapping between registers and DWARF numbers in the same file, so I do quick hack to check if that helps.

diff --git a/src/coreclr/tools/aot/ObjWriter/debugInfo/dwarf/dwarfGen.cpp b/src/coreclr/tools/aot/ObjWriter/debugInfo/dwarf/dwarfGen.cpp
index bc2bac9728e..396f03c70d9 100644
--- a/src/coreclr/tools/aot/ObjWriter/debugInfo/dwarf/dwarfGen.cpp
+++ b/src/coreclr/tools/aot/ObjWriter/debugInfo/dwarf/dwarfGen.cpp
@@ -193,7 +193,7 @@ static int GetDwarfRegNum(Triple::ArchType ArchType, int RegNum) {
         case RegNumArm::REGNUM_PC:  return 15;
         // fp registers
         default:
-          return RegNum - static_cast<int>(RegNumArm::REGNUM_COUNT) + 64;
+          return (RegNum - static_cast<int>(RegNumArm::REGNUM_COUNT)) / 2 + 256;
       }
     case Triple::aarch64: // fall through
     case Triple::aarch64_be:

That indeed helps, anyway something in that spirit should be changed. Now it fail when handling ICorDebugInfo::VLT_STK

Streamer->EmitIntValue(DwarfBaseRegNum + dwarf::DW_OP_breg0, 1);

with Expression: (isUIntN(8 * Size, Value) || isIntN(8 * Size, Value)) && "Invalid size".
Based on how it is called I think issue with the fact that I change reg numbers to be more then 256 which does not fit into single byte.

The more I look at line Streamer->EmitIntValue(DwarfBaseRegNum + dwarf::DW_OP_breg0, 1); the more I puzzled.
How that line can work with floating point registers. I get that ARM is kind of neglected, but how ARM64 was working. They have +64 offset for FP registers, so let's say first Floating Point register is 64, then if I add dwarf::DW_OP_breg0 (0x70) then fancy way to produce dwarf::DW_OP_bregXX produce invalid results, because it is incorrect encoding of DWARF operation (DW_OP_xxx). Please correct me if I'm missing some hidden logic there.

@kant2002
Copy link
Contributor Author

I think for registers with number greater then 31 I should use dwarf::DW_OP_bregx and dwarf::DW_OP_regx operators.

kant2002 added a commit to kant2002/runtimelab that referenced this issue Aug 15, 2021
This change has 2 parts, one which is JIT part and likely require submission to runtime.
I keep that change here to give full scope of changes. Will extract if everything would be fine.
JIT for ARM now start numering floating point registers starting from 256

Second part is DWARF generation
In order to produce DWARF I made following changes
- Account for register numbers more then 31, by use DW_OP_bregx and DW_OP_regx for large regnum
- Combine VLT_FPSTK and VLT_STK handling.

See dotnet#1388
kant2002 added a commit to kant2002/runtime that referenced this issue Aug 15, 2021
After introduction of VFP-v3 ARM S0-S31 no longer can be generated using LLVM because  numbering of registers to start from 256 and only D0-D31 are used.
So this change encode S0 as D0, S2 as D1, etc. Also use reg nums for DXX registers.
This change fix generation of CFI codes,
which trigger issue with generation of DWARF using LLVM in NativeAOT
See https://developer.arm.com/documentation/ihi0040/c/?lang=en#dwarf-register-names
See dotnet/runtimelab#1388
jkotas pushed a commit to dotnet/runtime that referenced this issue Aug 16, 2021
After introduction of VFP-v3 ARM S0-S31 no longer can be generated using LLVM because  numbering of registers to start from 256 and only D0-D31 are used.
So this change encode S0 as D0, S2 as D1, etc. Also use reg nums for DXX registers.
This change fix generation of CFI codes,
which trigger issue with generation of DWARF using LLVM in NativeAOT
See https://developer.arm.com/documentation/ihi0040/c/?lang=en#dwarf-register-names
See dotnet/runtimelab#1388
@jkotas
Copy link
Member

jkotas commented Aug 21, 2021

Fixed by #1413

@jkotas jkotas closed this as completed Aug 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants