Skip to content

Remove ElfReader #if ALPINE dependence #34314

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

sdmaclea
Copy link
Contributor

The cross DAC wants to be build for only Alpine or Linux not both.

Remove the ElfReader's compile time dependence on TARGET_ALPINE_LINUX

Use a simple heuristic to determine whether we need to add the loadbias.

If all else fails look for a well known symbol g_dacTable in the symbol hash
to determine whether to add the loadbias. (Should be rare.)

Fixes #32756

Copy link
Member

@hoyosjs hoyosjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will post the rest of the review later after some more thought on the range checking.

@hoyosjs
Copy link
Member

hoyosjs commented Apr 2, 2020

And macros shoot at us again 😄 You include windows.h which defines macros that are confusing the tokenizer after the preprocessor walked through the file. The easies way to solve this is to use either

    uint64_t minAddr = std::min<uint64_t>(std::min<uint64_t>((uint64_t)m_gnuHashTableAddr,
                                         (uint64_t)m_stringTableAddr),
                                         (uint64_t)m_symbolTableAddr);

or

    uint64_t minAddr = (std::min)((std::min)((uint64_t)m_gnuHashTableAddr,
                                         (uint64_t)m_stringTableAddr),
                                         (uint64_t)m_symbolTableAddr);

@sdmaclea
Copy link
Contributor Author

sdmaclea commented Apr 2, 2020

Thanks @hoyosjs

The cross DAC wants to be build for only Alpine or Linux not both.

Remove the ElfReader's compile time dependence on TARGET_ALPINE_LINUX

Use a simple heuristic to determine whether we need to add the loadbias.

If all else fails look for a well known symbol g_dacTable in the symbol hash
to determine whether to add the loadbias. (Should be rare.)
@sdmaclea sdmaclea force-pushed the AlpineDac branch 4 times, most recently from ea37308 to 783c6bd Compare April 11, 2020 00:08
Copy link
Contributor

@mikem8361 mikem8361 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting way to complicated for a simple elf reader. Isn't there any other way to do this? Some way to determine at runtime the distro?

@sdmaclea
Copy link
Contributor Author

isn't so suppose to cache any state in the class because of the way it is used in createdump

OK. That wasn't obvious to me.

It is relatively easy to refactor it your way.

This is getting way to complicated for a simple elf reader
Isn't there any other way to do this?
Some way to determine at runtime the distro?

I have struggled with similar thoughts. The potential to use this to read Linux and Alpine dumps in a single build feels appealing especially for SOS and dotnet-dump.

The simpler model would be to treat this as two cases

  1. At least one pointer is less than the start of the Elf module in memory - add loadbias.
  2. Other cases - no add loadbias.

It is not a perfect model, but it is highly likely to work in all the cases we care about. I have been tempted to strip it down to this for a while.

I'm going to refactor it to meet requirements...

}
else
{
// We cannot conclusively determine whether we need to add the load bias
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still leaning to this approach.

  • This case is theoretically possible, but difficult to test because it has not occurred.
  • I believe the hardening of the symbol reader, ReadElfMemory, makes this code reasonably safe.
  • ReadElfMemory is also not terribly complicated.
  • ReadElfMemory also may provide other benefits in cases of corrupt dumps or elf files.

There are alternatives.

  • We could simply return false; in this case. Revert unnecessary code.
  • Ignore the possibility of this case. Drop lines 176-178 & 188-208. Drop other unnecessary code.
  • Fallback to an #ifdef LINUX_ALPINE for this case.
  • Abandon the possibility of dynamically supporting Linux & Alpine.
  • Define a mechanism for the hosting layer to inform the elf-reader the type of dump.

Copy link
Contributor Author

@sdmaclea sdmaclea Apr 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In preference order, these are the leading options IMO

  1. This approach.
  2. Simply return false;
  3. Abandon the possibility of dynamically supporting Linux & Alpine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I believe the hardening of the symbol reader, ReadElfMemory, makes this code reasonably safe.

If the ranges are correct, yeah. The only way that if could fail would be a symbol falls in an address in a PT_LOAD gap. This seems unlikely.

  • ReadElfMemory is also not terribly complicated.
  • ReadElfMemory also may provide other benefits in cases of corrupt dumps or elf files.

There's not that many cases I can think of where we'd get a dump corrupted in such a way that we'd end reading in one of those regions (we'd need a corrupt .symtab, or a missing segment). That being said ReadElfMemory doesn't feel like the complicated piece. However, it still relies on the stored state that Mike said (I still don't know why. A comment might be warranted...)?

The other part I like of this change is not calling populate and then trying to get the symbol. Felt a little of - remember to do this. Tends to be a step that's easy to forget.

  • We could simply return false; in this case. Revert unnecessary code.
  • Ignore the possibility of this case. Drop lines 176-178 & 188-208. Drop other unnecessary code.

I am still not sure 176-178 are correct, so I am not sure the condition on 188 (the comment in that branch should be (greater than loadsize + loadbias) given that you are using the vaddr in the calculations). I expect 182-187 to be the heavily predominant branch on musl based distros, while the second one should be the base for glibc based distros (not sure what about bionic's libc)

  • Fallback to an #ifdef LINUX_ALPINE for this case.
  • Abandon the possibility of dynamically supporting Linux & Alpine.

I (really) like the possibility of dropping a couple of dll's, but I was wondering what it'd buy us on the long run. I don't know if I am lacking experience or prior art here, but I don't see much of an issue creating a separate one. Or if it warrants this complexity

  • Define a mechanism for the hosting layer to inform the elf-reader the type of dump.

This one is nice (no guessing) but rarely a good option in my opinion. The DAC will need to now gain more information of the environment it runs in, and so will createdump and such.

Overall, I would say I don't mind a heuristic that much. I just don't know if:

  • 176 is correct.
  • An idea of how likely it is to SEGV trying of reading memory (definitely a LOT less now that there's a segment constrained read).
  • What does this buy us in the end in exchange for the risk.

// Rather than making a compile time decision based on OS, we try to find a required symbol.
// Try to find "g_dacTable" in the hash table w/o adding the loadbias
uint64_t symbolOffset;
if (!InitializeGnuHashTable() || !TryLookupSymbol(symbolName, &symbolOffset)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this potentially SEGV?

}
else
{
// We cannot conclusively determine whether we need to add the load bias
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I believe the hardening of the symbol reader, ReadElfMemory, makes this code reasonably safe.

If the ranges are correct, yeah. The only way that if could fail would be a symbol falls in an address in a PT_LOAD gap. This seems unlikely.

  • ReadElfMemory is also not terribly complicated.
  • ReadElfMemory also may provide other benefits in cases of corrupt dumps or elf files.

There's not that many cases I can think of where we'd get a dump corrupted in such a way that we'd end reading in one of those regions (we'd need a corrupt .symtab, or a missing segment). That being said ReadElfMemory doesn't feel like the complicated piece. However, it still relies on the stored state that Mike said (I still don't know why. A comment might be warranted...)?

The other part I like of this change is not calling populate and then trying to get the symbol. Felt a little of - remember to do this. Tends to be a step that's easy to forget.

  • We could simply return false; in this case. Revert unnecessary code.
  • Ignore the possibility of this case. Drop lines 176-178 & 188-208. Drop other unnecessary code.

I am still not sure 176-178 are correct, so I am not sure the condition on 188 (the comment in that branch should be (greater than loadsize + loadbias) given that you are using the vaddr in the calculations). I expect 182-187 to be the heavily predominant branch on musl based distros, while the second one should be the base for glibc based distros (not sure what about bionic's libc)

  • Fallback to an #ifdef LINUX_ALPINE for this case.
  • Abandon the possibility of dynamically supporting Linux & Alpine.

I (really) like the possibility of dropping a couple of dll's, but I was wondering what it'd buy us on the long run. I don't know if I am lacking experience or prior art here, but I don't see much of an issue creating a separate one. Or if it warrants this complexity

  • Define a mechanism for the hosting layer to inform the elf-reader the type of dump.

This one is nice (no guessing) but rarely a good option in my opinion. The DAC will need to now gain more information of the environment it runs in, and so will createdump and such.

Overall, I would say I don't mind a heuristic that much. I just don't know if:

  • 176 is correct.
  • An idea of how likely it is to SEGV trying of reading memory (definitely a LOT less now that there's a segment constrained read).
  • What does this buy us in the end in exchange for the risk.

uint64_t minAddr = std::min<uint64_t>(std::min<uint64_t>((uint64_t)m_gnuHashTableAddr,
(uint64_t)m_stringTableAddr),
(uint64_t)m_symbolTableAddr);
uint64_t maxAddr = std::max<uint64_t>(std::max<uint64_t>((uint64_t)m_gnuHashTableAddr + sizeof(GnuHashTable) + sizeof(int32_t),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you also need to account for the size of the symbol table as well? Also, what's the additional int32 at the end of the GNU hash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have all the size info. So this is approximate. It is underestimating the maxAddr. The hash is a hash table. A series of buckets and a series of chains. The sizeof(int) was the absolute minimum number of buckets. It probably grossly underestimates the maxAddr.

@sdmaclea
Copy link
Contributor Author

@mikem8361 @hoyosjs I'd like to get consensus on this.

  • There is a lot of complexity added to save a few minutes of outerloop build time.
    • It would be nice to have fewer cross DAC builds, but in the end it won't save much
    • I am comfortable just building for Alpine separately

If we still think it is valuable to pursue Alpine cross DAC == Linux Cross DAC...

The patch is complicated by a special case which hasn't occurred in practice

  • There is a lot of complexity added for a case which doesn't occur in reality.
  • Since it doesn't occur in practice we don't have any code coverage.

Should we close the PR or simplify the patch

@mikem8361
Copy link
Contributor

Yes, I agree that it adds a lot of complexity to save building a couple more DACs. I think we should close the PR and save the changes for a rainy day :)

@sdmaclea sdmaclea closed this May 15, 2020
@sdmaclea sdmaclea deleted the AlpineDac branch September 26, 2020 16:52
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove cross OS DAC dependence on TARGET_ALPINE_LINUX
4 participants