identify_address() misidentifies unreadable addresses #213

Closed
pfactum opened this issue Sep 26, 2022 · 8 comments
@pfactum
Contributor

pfactum commented Sep 26, 2022

Hello.

identify_address() seems to misidentify some values.

Compare this (using identify_address()):

crush> rd -S 0xffffac9186247d68 10
 0xffffac9186247d68:  slab object: eventpoll_epi                ffffac9186247e00                     
 0xffffac9186247d78:  function symbol: mt7921_sta_ps+0x4        f998bf7908038700                     
 0xffffac9186247d88:  slab object: task_struct                  slab object: kmalloc-192             
 0xffffac9186247d98:  function symbol:                          function symbol: mt7921_sta_ps+0x0   
                      mt7921_mac_reset_work+0x5d                                                     
 0xffffac9186247da8:  function symbol: mt7921_sta_ps+0x1        ffffac9186247ef0

to this (using my custom demangle() helper from here):

crush> rd -S 0xffffac9186247d68 10
 0xffffac9186247d68:  eventpoll_epi:0xffff9cf90cc0dc18  ffffac9186247e00                 
 0xffffac9186247d78:  0000000000000004                  f998bf7908038700                 
 0xffffac9186247d88:  task_struct:0xffff9cf9067e4000    kmalloc-192:0xffff9d080661cd80   
 0xffffac9186247d98:  000000000000005d                  0000000000000000                 
 0xffffac9186247da8:  0000000000000001                  ffffac9186247ef0

Clearly, a NULL pointer cannot be mt7921_sta_ps().

The way I work around this in demangle() is to try reading the given address first; if the read raises FaultError, I simply do not try to demangle it.
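
A minimal sketch of that workaround, for illustration only. It assumes a drgn Program whose read() raises drgn.FaultError on unreadable memory; the helper names is_readable() and identify_if_readable() are hypothetical and this is not the actual demangle() code. The identification function is passed in as a callable so the guard stays independent of any particular helper.

```python
try:
    # drgn raises FaultError when a read hits unreadable memory.
    from drgn import FaultError
except ImportError:
    # Stand-in so this sketch can run without drgn installed.
    class FaultError(Exception):
        pass


def is_readable(prog, address, size=1):
    """Return True if `size` bytes at `address` can be read via prog.read()."""
    try:
        prog.read(address, size)
    except FaultError:
        return False
    return True


def identify_if_readable(prog, address, identify):
    """Call identify(prog, address) only when the address is readable.

    `identify` is any callable taking (prog, address); it is an assumption
    of this sketch, not a fixed drgn API. Returns None for unreadable
    addresses such as 0x0 or 0x4.
    """
    if not is_readable(prog, address):
        return None
    return identify(prog, address)
```

With this guard, small integers such as 0x4 or 0x5d are left as plain hex values instead of being matched against symbols.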

What do you think?

Thanks.

@pfactum
Contributor Author

pfactum commented Sep 26, 2022

Also mentioning @nhatsmrt as this was their commit.

@osandov
Owner

osandov commented Oct 6, 2022

I think the real problem is that drgn's symbol lookup is finding mt7921_sta_ps for those addresses. It looks like that symbol comes from the mt7921-common kernel module. Did drgn find the .ko file automatically, or did you specify it manually? I wonder if this is similar to #198, where the specified file is not quite the right one, so drgn falls back to loading it without relocating its addresses.

@pfactum
Contributor Author

pfactum commented Oct 7, 2022

I did specify the file manually through prog.load_debug_info().

vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=b8e75948a9dbd3760f9d66e9d1ec47516dc3b12c, with debug_info, not stripped

mt7921-common.ko.debug: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), BuildID[sha1]=ed2e6c11b4ff8841a8810a1de83cb69ce2b2ba41, with debug_info, not stripped

vmcore: Kdump compressed dump v6, system Linux, node xxxxxx, release 4.18.0-372.16.1.el8_6.x86_64, version #1 SMP Tue Jun 28 03:02:21 EDT 2022, machine x86_64, domain (none)

It's a standard RHEL8 kernel, and the debug symbols were taken straight from the kernel-debuginfo RPM, so they can't be incorrect.

@osandov
Owner

osandov commented Oct 7, 2022

Can you try applying the patch from #198 (comment) so we can see what build ID drgn is seeing?

@pfactum
Contributor Author

pfactum commented Oct 10, 2022

Applied, but drgn doesn't dump a build ID for the mt7921-common module.

I think I made a mistake here: I loaded debug symbols for this module even though it was not loaded on the system at the time of the crash. Is this my error, or is drgn expected to handle this case?

@pfactum
Contributor Author

pfactum commented Oct 10, 2022

Confirmed: if I load debug symbols for loaded modules only, I get the correct output:

crush> rd -S 0xffffac9186247d68 10
 0xffffac9186247d68:  slab object: eventpoll_epi  ffffac9186247e00           
 0xffffac9186247d78:  0000000000000004            f998bf7908038700           
 0xffffac9186247d88:  slab object: task_struct    slab object: kmalloc-192   
 0xffffac9186247d98:  000000000000005d            0000000000000000           
 0xffffac9186247da8:  0000000000000001            ffffac9186247ef0
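
A rough sketch of the filtering step described above: keep only the debug files whose module was loaded at crash time. The helper names are hypothetical, and how the set of loaded module names is obtained from the vmcore is left out. The sketch relies on the kernel reporting module names with underscores even when the file name uses hyphens (mt7921-common.ko.debug corresponds to mt7921_common), and it only handles uncompressed .ko and .ko.debug file names.

```python
import os


def module_name_from_path(path):
    """Derive the kernel module name from a .ko or .ko.debug file path."""
    base = os.path.basename(path)
    for suffix in (".ko.debug", ".ko"):
        if base.endswith(suffix):
            base = base[: -len(suffix)]
            break
    # The kernel normalizes hyphens in module names to underscores.
    return base.replace("-", "_")


def debug_files_for_loaded_modules(paths, loaded_modules):
    """Return only the paths whose module name is in `loaded_modules`."""
    loaded = {name.replace("-", "_") for name in loaded_modules}
    return [p for p in paths if module_name_from_path(p) in loaded]
```

The filtered list could then be passed to prog.load_debug_info() in place of the full list of files.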

@osandov
Owner

osandov commented Oct 11, 2022

Thanks for clearing that up. This behavior is a known problem; see #198 (comment). I'm working on a new API to fix this issue which should be done in the next couple of weeks.

@pfactum
Contributor Author

pfactum commented Oct 12, 2022

OK then, thanks for checking this. Let's close this one in favour of #198.

@pfactum pfactum closed this as completed Oct 12, 2022