Remove ElfReader #if ALPINE dependence #34314

sdmaclea · 2020-03-31T02:44:01Z

The cross DAC wants to be built for only Alpine or Linux, not both.

Remove the ElfReader's compile time dependence on TARGET_ALPINE_LINUX

Use a simple heuristic to determine whether we need to add the loadbias.

If all else fails, look for a well-known symbol, g_dacTable, in the symbol hash
to determine whether to add the loadbias. (Should be rare.)

Fixes #32756

hoyosjs

Will post the rest of the review later after some more thought on the range checking.

src/coreclr/src/debug/dbgutil/elfreader.cpp

hoyosjs · 2020-04-02T07:13:16Z

And macros shoot at us again 😄 You include windows.h, which defines macros that are confusing the tokenizer after the preprocessor has walked through the file. The easiest way to solve this is to use either

    uint64_t minAddr = std::min<uint64_t>(std::min<uint64_t>((uint64_t)m_gnuHashTableAddr,
                                         (uint64_t)m_stringTableAddr),
                                         (uint64_t)m_symbolTableAddr);

or

    uint64_t minAddr = (std::min)((std::min)((uint64_t)m_gnuHashTableAddr,
                                         (uint64_t)m_stringTableAddr),
                                         (uint64_t)m_symbolTableAddr);
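Both spellings work because a function-like macro is only expanded when its name is immediately followed by `(`. A minimal sketch of that, assuming stand-in `min`/`max` macros defined locally to imitate what windows.h does; the helper names `MinOfThree` and `MaxOfThree` are made up for illustration and are not part of the PR:

```cpp
#include <algorithm>
#include <cstdint>

// Stand-ins for the function-like macros windows.h defines (assumption: NOMINMAX is not set).
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical helper, for illustration only.
uint64_t MinOfThree(uint64_t a, uint64_t b, uint64_t c)
{
    // `min` is followed by `<`, not `(`, so the macro is not expanded here.
    return std::min<uint64_t>(std::min<uint64_t>(a, b), c);
}

// Hypothetical helper, for illustration only.
uint64_t MaxOfThree(uint64_t a, uint64_t b, uint64_t c)
{
    // `max` is followed by `)`, so the macro is not expanded here either.
    return (std::max)((std::max)(a, b), c);
}

// Remove the stand-in macros so they do not leak into later code.
#undef min
#undef max
```

The explicit template argument has the side benefit of forcing both operands to `uint64_t`, which is what the third commit on this PR settled on.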

sdmaclea · 2020-04-02T16:29:20Z

Thanks @hoyosjs

mikem8361

This is getting way too complicated for a simple elf reader. Isn't there any other way to do this? Some way to determine at runtime the distro?

src/coreclr/src/debug/dbgutil/elfreader.cpp

sdmaclea · 2020-04-11T02:16:51Z

isn't supposed to cache any state in the class because of the way it is used in createdump

OK. That wasn't obvious to me.

It is relatively easy to refactor it your way.

This is getting way too complicated for a simple elf reader
Isn't there any other way to do this?
Some way to determine at runtime the distro?

I have struggled with similar thoughts. The potential to use this to read Linux and Alpine dumps in a single build feels appealing, especially for SOS and dotnet-dump.

The simpler model would be to treat this as two cases:

- At least one pointer is less than the start of the ELF module in memory: add the loadbias.
- All other cases: do not add the loadbias.

It is not a perfect model, but it is highly likely to work in all the cases we care about. I have been tempted to strip it down to this for a while.
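The two-case model above can be sketched roughly as follows. This is only an illustration under assumptions: `NeedsLoadBias`, `Resolve`, and all parameter names are hypothetical and do not come from elfreader.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when at least one table pointer lies below
// the start of the ELF module in memory, which the simple model treats as
// "this pointer has not been relocated yet, so the load bias must be added".
bool NeedsLoadBias(uint64_t moduleStart,
                   uint64_t gnuHashTableAddr,
                   uint64_t stringTableAddr,
                   uint64_t symbolTableAddr)
{
    return gnuHashTableAddr < moduleStart ||
           stringTableAddr < moduleStart ||
           symbolTableAddr < moduleStart;
}

// Hypothetical helper: applies the decision to a single pointer.
uint64_t Resolve(uint64_t addr, uint64_t loadBias, bool needsLoadBias)
{
    return needsLoadBias ? addr + loadBias : addr;
}
```

The decision is made once for all three tables, so a mix of relocated and unrelocated pointers is not handled; that matches the "not a perfect model" caveat.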

I'm going to refactor it to meet requirements...

sdmaclea · 2020-04-11T02:44:41Z

src/coreclr/src/debug/dbgutil/elfreader.cpp

+    }
+    else
+    {
+        // We cannot conclusively determine whether we need to add the load bias


I am still leaning to this approach.

This case is theoretically possible, but difficult to test because it has not occurred.

I believe the hardening of the symbol reader, ReadElfMemory, makes this code reasonably safe.

ReadElfMemory is also not terribly complicated.

ReadElfMemory also may provide other benefits in cases of corrupt dumps or elf files.

There are alternatives.

- We could simply return false; in this case. Revert unnecessary code.
- Ignore the possibility of this case. Drop lines 176-178 & 188-208. Drop other unnecessary code.
- Fallback to an #ifdef LINUX_ALPINE for this case.
- Abandon the possibility of dynamically supporting Linux & Alpine.
- Define a mechanism for the hosting layer to inform the elf-reader the type of dump.

In preference order, these are the leading options IMO

- This approach.
- Simply return false;
- Abandon the possibility of dynamically supporting Linux & Alpine.

hoyosjs · 2020-04-11T08:08:21Z

src/coreclr/src/debug/dbgutil/elfreader.cpp

+        // Rather than making a compile time decision based on OS, we try to find a required symbol.
+        // Try to find "g_dacTable" in the hash table w/o adding the loadbias
+        uint64_t symbolOffset;
+        if (!InitializeGnuHashTable() || !TryLookupSymbol(symbolName, &symbolOffset)) {


Can't this potentially SEGV?

hoyosjs · 2020-04-11T09:34:01Z

src/coreclr/src/debug/dbgutil/elfreader.cpp

+    }
+    else
+    {
+        // We cannot conclusively determine whether we need to add the load bias


I believe the hardening of the symbol reader, ReadElfMemory, makes this code reasonably safe.

If the ranges are correct, yeah. The only way that it could fail would be if a symbol falls at an address in a PT_LOAD gap. This seems unlikely.
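For readers unfamiliar with the idea, a segment-constrained read can be sketched like this. It is not the PR's ReadElfMemory; `LoadSegment` and `IsReadInsideSegments` are hypothetical names, and the sketch assumes the PT_LOAD ranges have already been collected as in-memory start/size pairs.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one PT_LOAD segment as mapped in the target.
struct LoadSegment
{
    uint64_t start; // first mapped address
    uint64_t size;  // number of mapped bytes
};

// Returns true only if [address, address + size) lies entirely inside a single
// segment. The comparisons are arranged so that address + size is never
// computed, which avoids unsigned overflow on bogus addresses.
bool IsReadInsideSegments(const std::vector<LoadSegment>& segments,
                          uint64_t address,
                          uint64_t size)
{
    for (const LoadSegment& seg : segments)
    {
        if (address >= seg.start &&
            address - seg.start <= seg.size &&
            size <= seg.size - (address - seg.start))
        {
            return true;
        }
    }
    return false;
}
```

A reader would call a check like this before touching target memory and fail the lookup when it returns false, so an address that lands in a gap between segments is rejected instead of dereferenced.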

ReadElfMemory is also not terribly complicated.

ReadElfMemory also may provide other benefits in cases of corrupt dumps or elf files.

There aren't that many cases I can think of where we'd get a dump corrupted in such a way that we'd end up reading in one of those regions (we'd need a corrupt .symtab, or a missing segment). That being said, ReadElfMemory doesn't feel like the complicated piece. However, it still relies on the stored state that Mike mentioned (I still don't know why. A comment might be warranted...).

The other part I like about this change is not calling populate and then trying to get the symbol. That felt a little off: a "remember to do this" step tends to be one that's easy to forget.

We could simply return false; in this case. Revert unnecessary code.

Ignore the possibility of this case. Drop lines 176-178 & 188-208. Drop other unnecessary code.

I am still not sure 176-178 are correct, so I am not sure about the condition on 188 (the comment in that branch should be (greater than loadsize + loadbias), given that you are using the vaddr in the calculations). I expect 182-187 to be the heavily predominant branch on musl-based distros, while the second one should be the base case for glibc-based distros (not sure about bionic's libc).

Fallback to an #ifdef LINUX_ALPINE for this case.

Abandon the possibility of dynamically supporting Linux & Alpine.

I (really) like the possibility of dropping a couple of DLLs, but I was wondering what it'd buy us in the long run. I don't know if I am lacking experience or prior art here, but I don't see much of an issue with creating a separate one, and I am not sure this warrants the complexity.

Define a mechanism for the hosting layer to inform the elf-reader the type of dump.

This one is nice (no guessing) but rarely a good option in my opinion. The DAC would now need to gain more information about the environment it runs in, and so would createdump and such.

Overall, I would say I don't mind a heuristic that much. I just don't know:

- Whether 176 is correct.
- How likely it is to SEGV when trying to read memory (definitely a LOT less now that there's a segment-constrained read).
- What this buys us in the end in exchange for the risk.

hoyosjs · 2020-04-11T09:36:47Z

src/coreclr/src/debug/dbgutil/elfreader.cpp

+    uint64_t minAddr = std::min<uint64_t>(std::min<uint64_t>((uint64_t)m_gnuHashTableAddr,
+                                                             (uint64_t)m_stringTableAddr),
+                                                             (uint64_t)m_symbolTableAddr);
+    uint64_t maxAddr = std::max<uint64_t>(std::max<uint64_t>((uint64_t)m_gnuHashTableAddr + sizeof(GnuHashTable) + sizeof(int32_t),


Don't you also need to account for the size of the symbol table as well? Also, what's the additional int32 at the end of the GNU hash?

I don't have all the size info. So this is approximate. It is underestimating the maxAddr. The hash is a hash table. A series of buckets and a series of chains. The sizeof(int) was the absolute minimum number of buckets. It probably grossly underestimates the maxAddr.
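To make the "approximate" estimate concrete, here is a sketch of a tighter lower bound for the end of a GNU hash table, assuming a 64-bit ELF and the usual .gnu.hash layout (a four-word header, then bloom filter words, then buckets, then chains). `GnuHashHeader` and `GnuHashMinEnd` are illustrative names; the PR's own `GnuHashTable` struct may be laid out differently.

```cpp
#include <cstdint>

// Header at the start of a .gnu.hash section: four 32-bit words.
// Field names are illustrative.
struct GnuHashHeader
{
    uint32_t bucketCount;  // number of hash buckets
    uint32_t symbolOffset; // index of the first symbol covered by the table
    uint32_t bloomSize;    // number of bloom filter words
    uint32_t bloomShift;   // shift used by the bloom filter
};

// Lower bound on the end address of the table for a 64-bit ELF.
// The chain array that follows the buckets is not counted because its length
// is not stored in the header, so this still underestimates the true end.
uint64_t GnuHashMinEnd(uint64_t tableAddr, const GnuHashHeader& header)
{
    return tableAddr
         + sizeof(GnuHashHeader)
         + static_cast<uint64_t>(header.bloomSize) * sizeof(uint64_t)    // bloom words
         + static_cast<uint64_t>(header.bucketCount) * sizeof(uint32_t); // buckets
}
```

Computing this requires reading the header first, which the range estimate in the diff above avoids; that trade-off is one reason a fixed underestimate may have been chosen.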

sdmaclea · 2020-05-15T01:58:39Z

@mikem8361 @hoyosjs I'd like to get consensus on this.

There is a lot of complexity added to save a few minutes of outerloop build time.
- It would be nice to have fewer cross DAC builds, but in the end it won't save much
- I am comfortable just building for Alpine separately

If we still think it is valuable to pursue Alpine cross DAC == Linux Cross DAC...

The patch is complicated by a special case which hasn't occurred in practice

There is a lot of complexity added for a case which doesn't occur in reality.
Since it doesn't occur in practice we don't have any code coverage.

Should we close the PR or simplify the patch?

mikem8361 · 2020-05-15T02:13:17Z

Yes, I agree that it adds a lot of complexity to save building a couple more DACs. I think we should close the PR and save the changes for a rainy day :)

sdmaclea added this to the 5.0 milestone Mar 31, 2020

sdmaclea requested review from mikem8361 and hoyosjs March 31, 2020 02:44

sdmaclea self-assigned this Mar 31, 2020

Dotnet-GitSync-Bot added the area-Diagnostics-coreclr label Mar 31, 2020

sdmaclea force-pushed the AlpineDac branch 2 times, most recently from 7be5615 to 0dd0d2f Compare March 31, 2020 03:00

This was referenced Apr 1, 2020

Errors installing the SDK during builds #34015

Closed

Add retry to install script for most of the error dotnet/sdk#11001

Closed

hoyosjs reviewed Apr 1, 2020

src/coreclr/src/debug/dbgutil/elfreader.cpp

mikem8361 approved these changes Apr 3, 2020

sdmaclea added 5 commits April 10, 2020 19:34

Restore dropped line

7fd0f6b

Add algorithm header

0120ffe

Use std::m..<uint64_t> to fix Windows compile issue

aa07ab4

Fix arm warning == error

87bacae

sdmaclea force-pushed the AlpineDac branch 4 times, most recently from ea37308 to 783c6bd Compare April 11, 2020 00:08

sdmaclea added 2 commits April 10, 2020 21:26

Harden ELF reader symbol parsing

44c5280

Fail if symbol lookup reads memory outside of ELF

Remove hardcoding of expected symbol g_dacTable

b3f6dc1

sdmaclea force-pushed the AlpineDac branch from 8d0f6a8 to b3f6dc1 Compare April 11, 2020 01:26

mikem8361 suggested changes Apr 11, 2020

src/coreclr/src/debug/dbgutil/elfreader.cpp

Review feedback

83c5887

sdmaclea commented Apr 11, 2020

hoyosjs reviewed Apr 11, 2020

sdmaclea closed this May 15, 2020

sdmaclea mentioned this pull request May 16, 2020

Remove cross OS DAC dependence on TARGET_ALPINE_LINUX #32756

Closed

sdmaclea deleted the AlpineDac branch September 26, 2020 16:52

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
