Module addresses loading as relative rather than absolute #198
Comments
Huh, so that suggests that 0.0.19 was succeeding in matching the module by name, but 0.0.20 is failing to match by build ID. Could you please provide the output of running your reproducer with this patch:
diff --git a/libdrgn/linux_kernel.c b/libdrgn/linux_kernel.c
index 90c2aa8..e921c6a 100644
--- a/libdrgn/linux_kernel.c
+++ b/libdrgn/linux_kernel.c
@@ -1309,6 +1309,11 @@ report_loaded_kernel_module(struct drgn_debug_info_load_state *load,
 			err);
 	}
 
+	fprintf(stderr, "module %s has build ID ", kmod_it->name);
+	for (size_t i = 0; i < key.len; i++)
+		fprintf(stderr, "%02" PRIx8, ((uint8_t *)key.str)[i]);
+	fputc('\n', stderr);
+
 	struct hash_pair hp = kernel_module_table_hash(&key);
 	struct kernel_module_table_iterator it =
 		kernel_module_table_search_hashed(kmod_table, &key, hp);
and the output of running |
Ok, here we go:
And for the drgn output with the patch:
|
Clearly they don't match :/ |
Do you have the loadable |
I am chasing after the .ko file for this build. It looks like it was a local build, so it's possible that an older version got preserved or overwritten. I got the .ko and .ko.debug for a production build on a similar version, and I see that the build IDs match there, so there's no systematic build ID mismatch in our system. This may be user error rather than a bug. I'll update when I have both files.

I do wonder though, is this the behavior when a file with a mismatched build ID is loaded? Is it just loaded as if it were another vmlinux debuginfo? I wonder if it would be better to just raise an error, skip, or at least require a "force=True" argument to load a mismatched build ID. I can see somebody saying "I know what I'm doing" and being right, but in the general case, it seems like a very confusing error case. I'd rather get an exception during loading than get a weird error down the line.

Alternatively, if there were a way to view the loaded debuginfo files and see which ones were matched to which modules, then I'd imagine we could introspect this a bit better. Just spitballing ideas though, I'm really unfamiliar with this part of the code... |
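To make the "raise an error unless force=True" idea concrete, here is a minimal sketch of what such a check could look like. This is not drgn's API: `check_build_id`, `BuildIdMismatch`, and the `force` parameter are hypothetical names used only for illustration, and the build IDs are assumed to be hex strings.

```python
class BuildIdMismatch(Exception):
    """Hypothetical error for a debuginfo file whose build ID does not match."""


def check_build_id(module_name, expected, actual, force=False):
    """Compare two hex build IDs (hypothetical helper, not part of drgn).

    Returns True when they match. On a mismatch, returns False if
    force=True, and raises BuildIdMismatch otherwise.
    """
    if expected.lower() == actual.lower():
        return True
    if force:
        # The caller explicitly accepted the mismatch.
        return False
    raise BuildIdMismatch(
        f"module {module_name}: expected build ID {expected}, "
        f"but the debuginfo file has {actual}"
    )
```

With a check like this, a mismatched file fails at load time instead of producing surprising addresses later, while `force=True` keeps the "I know what I'm doing" path open.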
Currently, "leftover" files given to But you're right, when this isn't intentional, it causes this really surprising breakage. The debug info discovery rework I'm working on should help address this. Essentially, I'm adding APIs for asking drgn "what modules do you think are loaded in this core dump?", then for each module, specifying what files to use for each module, and fabricating modules for cases where drgn either gets it wrong or you want to do something weird like my |
Awesome! I wish there was something I could do to help with the debuginfo discovery rework, it sounds great. Will there be a possibility to specify another search path for debuginfo? For example, we frequently extract RPMs, or at least the contents of Still waiting on that .ko file, but I'd really be shocked if there was anything other than a mismatch here. |
For what it's worth, a gross quicky for getting the build ID without patching drgn :)
def get_buildid(module):
    # Walk the module's note attributes looking for the GNU build ID note.
    for i in range(module.notes_attrs.notes):
        if module.notes_attrs.attrs[i].attr.name.string_() == b".note.gnu.build-id":
            # Assumes a 36-byte note: 16-byte header and name, then a 20-byte build ID.
            return module.prog_.read(module.notes_attrs.attrs[i].private.value_(), 36)[-20:].hex() |
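The helper above hard-codes a 36-byte note with a 20-byte build ID at the end. A slightly more defensive variant could parse the ELF note header (name size, descriptor size, type) instead. The sketch below works on raw note bytes only; `parse_build_id` is a hypothetical helper name, and the little-endian default is an assumption about the target.

```python
import struct

NT_GNU_BUILD_ID = 3  # ELF note type used for GNU build IDs


def _align4(n):
    """Round n up to a multiple of 4, as ELF notes pad name and descriptor."""
    return (n + 3) & ~3


def parse_build_id(notes, little_endian=True):
    """Return the hex build ID found in raw ELF note bytes, or None."""
    fmt = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, note_type = struct.unpack_from(fmt, notes, offset)
        name_start = offset + 12
        desc_start = name_start + _align4(namesz)
        desc_end = desc_start + descsz
        if desc_end > len(notes):
            break  # truncated note; stop rather than read past the buffer
        name = notes[name_start:name_start + namesz]
        if name == b"GNU\0" and note_type == NT_GNU_BUILD_ID:
            return notes[desc_start:desc_end].hex()
        offset = desc_start + _align4(descsz)
    return None
```

The bytes returned by the `module.prog_.read(...)` call in the helper above could be passed to this function directly, which avoids assuming the build ID is exactly 20 bytes long.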
The .ko seems to be unrecoverable at this point. I'm going to close the issue because I haven't observed a build ID mismatch in our production kernels and, if there were, that would be our bug, not drgn's :) |
I think @sdimitro asked for this before and I opened #17 for it, so I'll definitely take that into account. I don't imagine that |
Correct, there's no |
Oh, in that case, it's probably finding it by build ID under |
Wow, I had no idea that That said, while our "-debuginfo" packages install the
And I think modules get placed alongside vmlinux. I guess we could make some scripts to maintain a .build-id directory ourselves, I think it would just be a few lines of wrangling readelf and ln. My expectation had been that drgn's behavior was more like "check in these well-known paths for a directory named after |
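For reference, the .build-id layout used by GDB and by distribution debuginfo packages names each file after its build ID: the first two hex digits become a directory and the remaining digits, plus a `.debug` suffix, become the file name. A minimal sketch of computing that path follows; `build_id_debug_path` is a hypothetical helper, and the `/usr/lib/debug` default root is an assumption that varies by system.

```python
import posixpath


def build_id_debug_path(build_id_hex, root="/usr/lib/debug"):
    """Return the conventional .build-id path for a hex build ID.

    Hypothetical helper; the default root is an assumption.
    """
    build_id_hex = build_id_hex.lower()
    valid = all(c in "0123456789abcdef" for c in build_id_hex)
    if len(build_id_hex) < 3 or not valid:
        raise ValueError(f"not a usable hex build ID: {build_id_hex!r}")
    return posixpath.join(
        root, ".build-id", build_id_hex[:2], build_id_hex[2:] + ".debug"
    )
```

A script maintaining such a directory would extract each file's build ID (for example with readelf) and create a symlink at this path pointing to the .ko.debug or vmlinux file.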
Hi Omar, this may be because of incomplete testing I did back on #178... but I'm not certain. I had a team member report that after loading a kernel module's debuginfo, module symbols became relative to their load address, rather than absolute. Here's what I mean:
I also am reminded of #185... I tried out your module-percpu-hack branch and it also gives me the address=0x120 value.