Segmentation fault when drgn finishes with libkdumpfile #280

Closed
marxin opened this issue Feb 22, 2023 · 7 comments

Comments

@marxin
Contributor

marxin commented Feb 22, 2023

Using a small vmcore from an openSUSE Leap 15.5 system, I get the following on my openSUSE Tumbleweed system:

$ drgn --version
drgn 0.0.22+72.g38b090e (using Python 3.10.9, elfutils 0.188, with libkdumpfile)
$ python3 -m drgn --core vmcore x.py
warning: could not get debugging information for:
kernel (could not find vmlinux for 5.14.21-150500.37-default)
kernel modules (could not find loaded kernel modules: could not find 'struct module')
Segmentation fault (core dumped)
$ valgrind python3 -m drgn --core vmcore x.py
==5773== Invalid read of size 8
==5773==    at 0x5B68DF2: UnknownInlinedFun (attr.c:1082)
==5773==    by 0x5B68DF2: diskdump_attr_cleanup (diskdump.c:1042)
==5773==    by 0x5B6B39B: _kdumpfile_priv_attr_dict_free (attr.c:759)
==5773==    by 0x5B6B655: UnknownInlinedFun (kdumpfile-priv.h:1023)
==5773==    by 0x5B6B655: kdump_free (context.c:308)
==5773==    by 0x5A62F59: drgn_program_deinit (program.c:140)
==5773==    by 0x5A2790F: Program_dealloc (program.c:102)
==5773==    by 0x49AA416: UnknownInlinedFun (object.c:2301)
==5773==    by 0x49AA416: UnknownInlinedFun (object.h:500)
==5773==    by 0x49AA416: frame_dealloc (frameobject.c:591)
==5773==    by 0x49A1AE9: UnknownInlinedFun (object.c:2301)
==5773==    by 0x49A1AE9: UnknownInlinedFun (object.h:500)
==5773==    by 0x49A1AE9: _PyEval_Vector (ceval.c:5078)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:114)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:123)
==5773==    by 0x49A2A09: UnknownInlinedFun (ceval.c:5891)
==5773==    by 0x49A2A09: _PyEval_EvalFrameDefault (ceval.c:4213)
==5773==    by 0x49A18E2: UnknownInlinedFun (pycore_ceval.h:46)
==5773==    by 0x49A18E2: _PyEval_Vector (ceval.c:5065)
==5773==    by 0x4A1856F: PyEval_EvalCode (ceval.c:1134)
==5773==    by 0x4A1FEE6: UnknownInlinedFun (bltinmodule.c:1056)
==5773==    by 0x4A1FEE6: builtin_exec (bltinmodule.c.h:371)
==5773==    by 0x49ADB4A: cfunction_vectorcall_FASTCALL (methodobject.c:430)
==5773==  Address 0x6263be0 is 16 bytes inside a block of size 56 free'd
==5773==    at 0x484617B: free (vg_replace_malloc.c:884)
==5773==    by 0x5B6AA04: _kdumpfile_priv_dealloc_attr (attr.c:463)
==5773==    by 0x5B6AA04: _kdumpfile_priv_dealloc_attr (attr.c:463)
==5773==    by 0x5B6B35F: _kdumpfile_priv_attr_dict_free (attr.c:754)
==5773==    by 0x5B6B655: UnknownInlinedFun (kdumpfile-priv.h:1023)
==5773==    by 0x5B6B655: kdump_free (context.c:308)
==5773==    by 0x5A62F59: drgn_program_deinit (program.c:140)
==5773==    by 0x5A2790F: Program_dealloc (program.c:102)
==5773==    by 0x49AA416: UnknownInlinedFun (object.c:2301)
==5773==    by 0x49AA416: UnknownInlinedFun (object.h:500)
==5773==    by 0x49AA416: frame_dealloc (frameobject.c:591)
==5773==    by 0x49A1AE9: UnknownInlinedFun (object.c:2301)
==5773==    by 0x49A1AE9: UnknownInlinedFun (object.h:500)
==5773==    by 0x49A1AE9: _PyEval_Vector (ceval.c:5078)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:114)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:123)
==5773==    by 0x49A2A09: UnknownInlinedFun (ceval.c:5891)
==5773==    by 0x49A2A09: _PyEval_EvalFrameDefault (ceval.c:4213)
==5773==    by 0x49A18E2: UnknownInlinedFun (pycore_ceval.h:46)
==5773==    by 0x49A18E2: _PyEval_Vector (ceval.c:5065)
==5773==    by 0x4A1856F: PyEval_EvalCode (ceval.c:1134)
==5773==  Block was alloc'd at
==5773==    at 0x48485EF: calloc (vg_replace_malloc.c:1340)
==5773==    by 0x5B6A335: UnknownInlinedFun (attr.c:311)
==5773==    by 0x5B6A335: _kdumpfile_priv_new_attr (attr.c:488)
==5773==    by 0x5B6DB32: UnknownInlinedFun (attr.c:788)
==5773==    by 0x5B6DB32: kdump_new (context.c:177)
==5773==    by 0x5A6DA25: drgn_program_set_kdump (kdump.c:90)
==5773==    by 0x5A621AF: drgn_program_set_core_dump.part.0 (program.c:244)
==5773==    by 0x5A26F9C: Program_set_core_dump (program.c:400)
==5773==    by 0x49ADFB0: method_vectorcall_VARARGS_KEYWORDS (descrobject.c:344)
==5773==    by 0x49A2D86: UnknownInlinedFun (abstract.h:114)
==5773==    by 0x49A2D86: UnknownInlinedFun (abstract.h:123)
==5773==    by 0x49A2D86: UnknownInlinedFun (ceval.c:5891)
==5773==    by 0x49A2D86: _PyEval_EvalFrameDefault (ceval.c:4198)
==5773==    by 0x49A18E2: UnknownInlinedFun (pycore_ceval.h:46)
==5773==    by 0x49A18E2: _PyEval_Vector (ceval.c:5065)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:114)
==5773==    by 0x49A2A09: UnknownInlinedFun (abstract.h:123)
==5773==    by 0x49A2A09: UnknownInlinedFun (ceval.c:5891)
==5773==    by 0x49A2A09: _PyEval_EvalFrameDefault (ceval.c:4213)
==5773==    by 0x49A18E2: UnknownInlinedFun (pycore_ceval.h:46)
==5773==    by 0x49A18E2: _PyEval_Vector (ceval.c:5065)
==5773==    by 0x4A1856F: PyEval_EvalCode (ceval.c:1134)
gdb reports:
Program received signal SIGSEGV, Segmentation fault.
_kdumpfile_priv_attr_remove_override (override=0x55555576dac0, attr=0x55555576cdb0) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/attr.c:1090
1090		} while (tmpl->override);
(gdb) bt
#0  _kdumpfile_priv_attr_remove_override (override=0x55555576dac0, attr=0x55555576cdb0) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/attr.c:1090
#1  diskdump_attr_cleanup (dict=0x555555769f40) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/diskdump.c:1042
#2  0x00007ffff746939c in _kdumpfile_priv_attr_dict_free (dict=0x555555769f40) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/attr.c:759
#3  0x00007ffff7469656 in attr_dict_decref (dict=<optimized out>) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/kdumpfile-priv.h:1023
#4  kdump_free (ctx=0x555555769a10) at /usr/src/debug/libkdumpfile-0.5.1/src/kdumpfile/context.c:308
#5  0x00007ffff758af5a in drgn_program_deinit (prog=prog@entry=0x5555555dcf10) at ../../libdrgn/program.c:140
#6  0x00007ffff754f910 in Program_dealloc (self=0x5555555dcf00) at ../../libdrgn/python/program.c:102
#7  0x00007ffff7d29417 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2301
#8  _Py_DECREF (op=<optimized out>) at ./Include/object.h:500
#9  frame_dealloc (f=Frame 0x555555768670, for file /tmp/venv/lib64/python3.10/site-packages/drgn/cli.py, line 225, in _main (prefix='\x1b[33mwarning:\x1b[0m', script='x.py')) at Objects/frameobject.c:591
#10 0x00007ffff7d20aea in _Py_Dealloc (op=Frame 0x555555768670, for file /tmp/venv/lib64/python3.10/site-packages/drgn/cli.py, line 225, in _main (prefix='\x1b[33mwarning:\x1b[0m', script='x.py')) at Objects/object.c:2295
#11 _Py_DECREF (op=Frame 0x555555768670, for file /tmp/venv/lib64/python3.10/site-packages/drgn/cli.py, line 225, in _main (prefix='\x1b[33mwarning:\x1b[0m', script='x.py')) at ./Include/object.h:500
#12 _PyEval_Vector (tstate=<optimized out>, con=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=0, kwnames=<optimized out>) at Python/ceval.c:5078
#13 0x00007ffff7d21a0a in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=<optimized out>, callable=<function at remote 0x7ffff6f6cdc0>, tstate=0x555555577360) at ./Include/cpython/abstract.h:114
#14 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=<optimized out>, callable=<function at remote 0x7ffff6f6cdc0>) at ./Include/cpython/abstract.h:123
...
@marxin
Contributor Author

marxin commented Feb 22, 2023

Please let me know if you want me to upload the vmcore and the corresponding debug info.

@marxin
Contributor Author

marxin commented Feb 24, 2023

So what happens is that attr_dict_free(struct attr_dict *dict) is called:

void
attr_dict_free(struct attr_dict *dict)
{
	dealloc_attr(dgattr(dict, GKI_dir_root));

	if (dict->shared->arch_ops && dict->shared->arch_ops->attr_cleanup)
		dict->shared->arch_ops->attr_cleanup(dict);
	if (dict->shared->ops && dict->shared->ops->attr_cleanup)
		dict->shared->ops->attr_cleanup(dict);
...

Here, dealloc_attr(dgattr(dict, GKI_dir_root)); takes the if (attr->template->type == KDUMP_DIRECTORY) { branch, so a bunch of attributes under the root directory are freed. After that, dict->shared->ops->attr_cleanup(dict); calls diskdump_attr_cleanup(struct attr_dict *dict), which does:

attr_remove_override(dgattr(dict, GKI_page_size),
   &ddp->page_size_override);

and apparently GKI_page_size has already been freed by the first call (dealloc_attr(dgattr(dict, GKI_dir_root));), so the cleanup hook reads freed memory.
@ptesarik: Does this look familiar?
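
To make the ordering problem concrete, here is a minimal sketch of it in plain C. Everything in it (struct toy_attr and the toy_* functions) is hypothetical and heavily simplified; it is not libkdumpfile code, and it is not meant to describe the eventual fix. A freed flag stands in for free() so that the demo itself stays well-defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an attribute; not the real libkdumpfile type. */
struct toy_attr {
	bool freed;             /* set instead of calling free() */
	struct toy_attr *child; /* a directory owns at most one child here */
};

/* Models dealloc_attr(): releasing a directory also releases its child. */
static void toy_dealloc(struct toy_attr *attr)
{
	if (attr->child)
		toy_dealloc(attr->child);
	attr->freed = true;
}

/*
 * Models a format cleanup hook that still needs the page_size attribute.
 * Returns true if the attribute it touches was already released.
 */
static bool toy_cleanup(const struct toy_attr *page_size)
{
	return page_size->freed;
}

/* Order described above: free the tree first, then run the hook. */
bool toy_free_tree_then_cleanup(void)
{
	struct toy_attr page_size = { false, NULL };
	struct toy_attr root = { false, &page_size };

	toy_dealloc(&root);
	return toy_cleanup(&page_size);
}

/* Alternative order: run the hook while the tree is still alive. */
bool toy_cleanup_then_free_tree(void)
{
	struct toy_attr page_size = { false, NULL };
	struct toy_attr root = { false, &page_size };
	bool stale = toy_cleanup(&page_size);

	toy_dealloc(&root);
	return stale;
}
```

In this toy model, toy_free_tree_then_cleanup() reports a stale access and toy_cleanup_then_free_tree() does not. In the real library the stale access is an actual read of freed memory, which is what valgrind flags above.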

@ptesarik
Contributor

Yes, thanks for the reminder… 😩

Attributes must be reference-counted. I'll track this as a libkdumpfile issue.
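
As a rough illustration of what reference counting would buy here, the following sketch uses made-up names (struct rc_attr, rc_attr_new, rc_attr_get, rc_attr_put, rc_demo); none of them are libkdumpfile API. It assumes that whoever keeps a pointer to an attribute also holds a counted reference, so tearing down the dictionary cannot free the attribute while a cleanup hook still needs it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted attribute; not the libkdumpfile layout. */
struct rc_attr {
	unsigned refcnt;
	long value;
};

/* Allocate an attribute holding one reference, owned by the caller. */
struct rc_attr *rc_attr_new(long value)
{
	struct rc_attr *attr = calloc(1, sizeof(*attr));

	if (attr) {
		attr->refcnt = 1;
		attr->value = value;
	}
	return attr;
}

/* Take an additional reference. */
struct rc_attr *rc_attr_get(struct rc_attr *attr)
{
	++attr->refcnt;
	return attr;
}

/* Drop one reference; returns 1 if this call freed the attribute. */
int rc_attr_put(struct rc_attr *attr)
{
	if (--attr->refcnt)
		return 0;
	free(attr);
	return 1;
}

/* Returns 1 if the attribute outlives the dictionary's reference. */
int rc_demo(void)
{
	struct rc_attr *page_size = rc_attr_new(4096);
	struct rc_attr *override_ref;
	int freed_by_dict, freed_by_hook;
	long seen;

	if (!page_size)
		return -1;
	override_ref = rc_attr_get(page_size);     /* held by the override */
	freed_by_dict = rc_attr_put(page_size);    /* dictionary teardown */
	seen = override_ref->value;                /* still valid memory */
	freed_by_hook = rc_attr_put(override_ref); /* last reference dropped */
	return freed_by_dict == 0 && freed_by_hook == 1 && seen == 4096;
}
```

With this scheme the order of teardown steps stops mattering, because the memory is released only when the last holder lets go.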

@osandov
Owner

osandov commented Feb 24, 2023

Thanks for digging into this!

@ptesarik is there something specific drgn is doing to trigger this, or something we can do to work around it? (I guess in the worst case, we could not call kdump_free())

@ptesarik
Contributor

No, this is a real bug that can also be reproduced without drgn. I have just made a quick fix. There are still other bugs lurking in the code because of missing refcounting, but addressing those was not needed to fix this issue.

@ptesarik
Contributor

You may want to retest with ptesarik/libkdumpfile@97c716a.

@marxin
Contributor Author

marxin commented Feb 25, 2023

Works for me, thank you Petr!

marxin closed this as completed Feb 25, 2023