[bug] 'compiled-objects-have-debug-symbols' false positive on mach-o files that have been passed through 'strip' #235

Open
jameslamb opened this issue May 8, 2024 · 4 comments · May be fixed by #310
Labels
bug Something isn't working

Comments

@jameslamb
Owner

What did you expect to happen?

Expected the compiled-objects-have-debug-symbols check to report debug symbols only if a mach-o binary actually contains them.

What actually happened?

Over in LightGBM, I observed that after upgrading scikit-build-core (used to build a shared library) from v0.4.4 to v0.9.3, pydistcheck started raising the following error on the resulting library, lib_lightgbm.dylib.

1. [compiled-objects-have-debug-symbols] Found compiled object containing debug symbols. For details, extract the distribution contents and run 'nm -a "lightgbm/lib/lib_lightgbm.dylib"'.
errors found while checking: 1

But ONLY for that macOS library, not for the equivalent files on Linux (lib_lightgbm.so) or Windows (lib_lightgbm.dll).

How can someone else reproduce this problem?

I mention that scikit-build-core version because in that range, scikit-build-core added the ability to strip the binaries it produced.

https://github.com/scikit-build/scikit-build-core/blob/2994db5d993f0cb252dbeebfac7d66677b670ed8/README.md#L278-279

I built that library (lib_lightgbm.dylib) on an M2 mac, in a build driven by scikit-build-core, like this:

git clone \
    --recursive \
    git@github.com:microsoft/LightGBM.git \
    /tmp/lightgbm

cd /tmp/lightgbm
sh build-python.sh bdist_wheel
unzip -d ./tmp ./dist/*.whl
nm ./tmp/lightgbm/lib/lib_lightgbm.dylib > o.txt
nm -a ./tmp/lightgbm/lib/lib_lightgbm.dylib > a.txt

Then diffed those files. Only saw 1 difference.

[image: diff of o.txt and a.txt]

The output of nm -a includes this:

0000000005614542 - 00 0000   OPT radr://5614542

It seems that maybe this is added by strip?
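The comparison described above (plain nm output versus nm -a output) can be sketched in Python. This is only an illustration: the function name `lines_only_in_all` and the `_main` sample line are made up for the example, and only the radr line is copied from the output above.

```python
def lines_only_in_all(nm_output: str, nm_a_output: str) -> list[str]:
    """Return lines that appear in `nm -a` output but not in plain `nm` output."""
    exported = set(nm_output.splitlines())
    return [line for line in nm_a_output.splitlines() if line not in exported]


# Illustrative input: the `_main` line is invented, the radr line is from this issue.
plain = "0000000000001000 T _main\n"
with_debug = plain + "0000000005614542 - 00 0000   OPT radr://5614542\n"

for line in lines_only_in_all(plain, with_debug):
    print(line)
```

On a real build, the two strings would be the contents of o.txt and a.txt.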

I found this repo, https://github.com/nico/lssym, which says

(Compare with the output of `nm lssym` and `dsymutil -s lssym` in both cases.
Note that `strip` adds a symbol that looks like

    radr://5614542    N_STAB 3c    n_sect 000 n_desc 0x0000    n_value 0x5614542

I wonder what that bug is. `nm` doesn't list it because `nm` only lists symbols,
and this is a stabs debug info entry which `nm` only shows if you pass `-a`.)

I strongly suspect that's what's happening here! Not sure what the best fix is though for pydistcheck, will need to think through that.
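To make the shape of that nm -a line concrete, here is a small parsing sketch. It assumes the column layout seen in the output above (value, a `-` type marker for debugger entries, section, description, stab type, name); `StabEntry` and `parse_stab_line` are hypothetical names, not part of pydistcheck.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StabEntry:
    value: int
    sect: int
    desc: int
    stab_type: str
    name: str


def parse_stab_line(line: str) -> Optional[StabEntry]:
    """Parse one `nm -a` line; return None unless it is a debugger ('-') entry."""
    fields = line.split(maxsplit=5)
    if len(fields) < 5 or fields[1] != "-":
        return None
    # Some stab entries may have an empty name, so the sixth field is optional.
    name = fields[5] if len(fields) > 5 else ""
    return StabEntry(
        value=int(fields[0], 16),
        sect=int(fields[2], 16),
        desc=int(fields[3], 16),
        stab_type=fields[4],
        name=name,
    )


entry = parse_stab_line("0000000005614542 - 00 0000   OPT radr://5614542")
print(entry)
```

Ordinary symbol lines (for example ones with type `T` or `U`) come back as None, so only the debugger entries remain to be inspected.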

What version of pydistcheck are you using?

0.6.0

Notes

Thanks @nico for putting up https://github.com/nico/lssym! It was a great clue to help me investigate this!!

@agriyakhetarpal

Looks like this is a special corner case that might be possible to ignore safely, because Xcode's strip prefers to keep weak symbols like , see: https://x.com/yaakov_h/status/478656504162029568/photo/1

Unfortunately, the corresponding Apple Open Source pages have been wiped into oblivion. I tried

and I naïvely thought I could look them up on archival services such as https://archive.is/opensource.apple.com or on the Internet Archive's Wayback Machine, but to no avail – it is apparent that Apple has probably worked aggressively to stop the spread of this knowledge.

It is a remnant of the linker, though, so maybe using alternative linkers such as Zld will avoid this problem.

Even on this page: https://docs.angr.io/projects/cle/en/v9.2.70/_modules/cle/backends/macho/symbol.html, the sentence seems incomplete:

# The addr isn't really an address, but a magic value that is used to indicate that the symbol is

I don't claim any conspiracies here, though. 😃

IDA 7.6 seems to ignore it as well: https://docs.hex-rays.com/release-notes/7_6#file-formats

@agriyakhetarpal

I managed to find a link that still works: https://archive.is/7voAu

@jameslamb
Owner Author

jameslamb commented Dec 28, 2024

Wow AMAZING investigation!!! Thank you so much for looking into that.

Are you interested in submitting a PR that ignores this specific symbol when nm / llvm-nm are used on mach-o files here? If not, no worries, I'd be happy to attempt it.

I think it'd involve filtering that out around here:

def _nm_reports_debug_symbols(tool_name: str, lib_file: str) -> Tuple[bool, str]:
    exported_symbols = _get_symbols(cmd_args=[tool_name], lib_file=lib_file)
    all_symbols = _get_symbols(cmd_args=[tool_name, "-a"], lib_file=lib_file)
    return exported_symbols != all_symbols, f"{tool_name} -a"

with a code comment pointing back to this issue to explain why
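One possible shape for that filtering, as a rough sketch rather than the eventual fix: it assumes the symbol listings are available as plain strings (what `_get_symbols` actually returns is not shown here), and the names `STRIP_MARKER`, `_drop_strip_marker`, and `nm_outputs_differ` are hypothetical.

```python
from typing import Tuple

# Marker that 'strip' appears to leave behind on mach-o files (see this issue, #235).
STRIP_MARKER = "radr://5614542"


def _drop_strip_marker(symbols: str) -> str:
    """Remove any line whose symbol name is the strip marker."""
    return "\n".join(
        line
        for line in symbols.splitlines()
        if not line.rstrip().endswith(STRIP_MARKER)
    )


def nm_outputs_differ(
    exported_symbols: str, all_symbols: str, tool_name: str
) -> Tuple[bool, str]:
    """Compare `nm` and `nm -a` output, ignoring the strip marker entry."""
    differ = _drop_strip_marker(exported_symbols) != _drop_strip_marker(all_symbols)
    return differ, f"{tool_name} -a"


# Illustrative input: only the radr line is taken from this issue.
plain = "0000000000001000 T _main\n"
stripped = plain + "0000000005614542 - 00 0000   OPT radr://5614542\n"
unstripped = plain + "0000000000000000 - 00 0000    SO /tmp/example.c\n"

print(nm_outputs_differ(plain, stripped, "nm"))
print(nm_outputs_differ(plain, unstripped, "nm"))
```

With this approach a library whose only extra `nm -a` entry is the marker is treated as having no debug symbols, while any other debugger entry still triggers the check.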

@agriyakhetarpal

Thank you for the pointer! Yes, I'm interested in contributing and will put together a PR this week :D
