In multithreaded programs, "Verifying safety for pid xxxxx" fails for some threads #46

Open
chenzhbao629 opened this issue Oct 16, 2019 · 7 comments

Comments

@chenzhbao629

Verifying safety for pid 88521...
Stacktrace to verify safety for pid 88521:
[0x7fa5c32aea3d] __poll_nocancel+0x24
[0x7fa5c9d3a11f] fdset_event_dispatch+0x6f
[0x7fa5c9d3b270] rte_vhost_driver_session_start+0x10
[0x559bb0d2860b] _init+0x176513
kpatch_patch.c(201): safety check failed for 7fa5c9d39fb0

kpatch_patch.c(497): Patching xxxx.so failed, unapplying partially applied patch

Finished ptrace detaching.Failed to apply patch './libshared.kpatch'
kpatch_patch.c(588): Failed to apply patch './libshared.kpatch'

@paboldin
Contributor

paboldin commented Oct 22, 2019

Did libcare tell you which function failed the safety verification? If it did not, that alone is certainly a bug.

Some applications have a loop function that runs until the application exits. Such a function is always on the stack and therefore can't be patched properly.

It would be much easier to tell whether this is your case by looking at the patch. If you can share it, please post it here.

At least, please provide the full log.

@chenzhbao629
Author

Thanks paboldin, it is just as you say: for some applications there is a loop function that runs until the application exits, so it is always on the stack and can't be properly patched.

One of our threads is always in such a loop, so we tried stopping it with kill -STOP pid and then patching again, but that failed as well:

attached to 15 thread(s): 74232, 74233, 74234, 74235, 74236, 74237, 74238, 74239, 74253, 74257, 74258, 74259, 74260, 74261, 74262
Loading patch info 'librte_vhost.so.3'...nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
nononoeeeeeeeeee------- 'librte_vhost.so.3'...
sddddddheeeeeeeeee------- 'librte_vhost.so.3'...successfully, 4 entries
kpatch_patch.c(361): Patching 361, ------------------------------------
Counting undefined symbols:
Undefined symbol 'rte_mem_virt2phy'
Undefined symbol 'get_device'
Undefined symbol 'reset_device'
Undefined symbol 'rte_malloc'
Undefined symbol '__assert_fail@@GLIBC_2.2.5'
Undefined symbol 'cleanup_device'
Undefined symbol 'rte_zmalloc'
Undefined symbol 'close@@GLIBC_2.2.5'
Undefined symbol 'rte_mempool_ops_table'
Undefined symbol 'read@@GLIBC_2.2.5'
Undefined symbol 'rte_malloc_socket'
Undefined symbol '__tls_get_addr@@GLIBC_2.3'
Undefined symbol '__fxstat64@@GLIBC_2.2.5'
Undefined symbol 'alloc_vring_queue_pair'
Undefined symbol 'memcpy@@GLIBC_2.14'
Undefined symbol 'read_fd_message'
Undefined symbol 'rte_free'
Undefined symbol 'rte_log'
Undefined symbol 'eventfd_write@@GLIBC_2.7'
Undefined symbol 'VHOST_FEATURES'
Undefined symbol 'per_lcore__lcore_id'
Undefined symbol 'mmap64@@GLIBC_2.2.5'
Undefined symbol 'vhost_devices'
Undefined symbol 'malloc@@GLIBC_2.2.5'
Undefined symbol 'get_mempolicy'
Undefined symbol 'munmap@@GLIBC_2.2.5'
Undefined symbol 'madvise@@GLIBC_2.2.5'
Undefined symbol 'memmove@@GLIBC_2.2.5'
Jump table 464 bytes for 28 syms at offset 0x11730
Looking for patch region for 'librte_vhost.so.3'...
Found patch region for 'librte_vhost.so.3' at 7f1d71559000
mmap_remote: 0x7f1d71559000+12000, 7, 32, -1, 0
Executing syscall 9 (pid 74232)...
wait_for_stop(pctx->pid=74232, pid=74232)
allocated 0x12000 bytes at 0x9 for 'librte_vhost.so.3' patch
Marking this space as busy
kpatch_patch.c(389): Patching 389, ------------------------------------
Resolving sections' addresses for 'librte_vhost.so.3'
section '.note.gnu.build-id' = 0x7f1d705ad1c8
section '.gnu.hash' = 0x7f1d705ad1f0
section '.dynsym' = 0x7f1d705ad2a0
section '.dynstr' = 0x7f1d705ad930
section '.gnu.version' = 0x7f1d705adde4
section '.gnu.version_r' = 0x7f1d705adb38
section '.rela.dyn' = 0x7f1d705adfd0
section '.rela.plt' = 0x7f1d705ae2d0
section '.init' = 0x7f1d705ae708
section '.rela.init' = 0x0
section '.plt' = 0x7f1d705ae730
section '.text' = 0x7f1d705aea10
section '.rela.text' = 0x0
section '.kpatch.text' = 0x449
section '.rela.kpatch.text' = 0xf6a9
section '.fini' = 0x7f1d705bdfe0
section '.rodata' = 0x7f1d705bdff0
section '.rela.rodata' = 0x0
section '.kpatch.strtab' = 0xd7e6
section '.kpatch.info' = 0xd864
section '.rela.kpatch.info' = 0x10ae9
section '.eh_frame_hdr' = 0x7f1d705bf3dc
section '.eh_frame' = 0x7f1d705bf570
section '.rela.eh_frame' = 0x0
section '.init_array' = 0x7f1d707c0c60
section '.rela.init_array' = 0x0
section '.fini_array' = 0x7f1d707c0c68
section '.rela.fini_array' = 0x0
section '.jcr' = 0x7f1d707c0c70
section '.data.rel.ro' = 0x7f1d707c0c80
section '.rela.data.rel.ro' = 0x0
section '.dynamic' = 0x7f1d707c0d48
section '.got' = 0x7f1d707c0fb8
section '.got.plt' = 0x7f1d707c1000
section '.data' = 0x7f1d707c1180
section '.tm_clone_table' = 0x7f1d707cb108
section '.kpatch.data' = 0xd941
section '.rela.kpatch.data' = 0x10c09
section '.bss' = 0x7f1d707cf200
section '.comment' = 0x0
section '.shstrtab' = 0xdba7
section '.symtab' = 0xe811
section '.strtab' = 0xf2f1
Resolving symbols for 'librte_vhost.so.3'
symbol 'rte_vhost_update_totalpkts.kpatch' is defined and global, we don't check for overrition
symbol 'rte_vhost_update_totalpkts.kpatch' = 0x449
symbol 'rte_vhost_dequeue_burst.kpatch' is defined and global, we don't check for overrition
symbol 'rte_vhost_dequeue_burst.kpatch' = 0x95c9
symbol 'rte_vhost_dequeue_burst' is defined and global, we don't check for overrition
symbol 'rte_vhost_dequeue_burst' = 0x7f1d705b82c0
symbol '__assert_fail' = 0x7f1d69a642d0
jmptable '__assert_fail' = 0x11749
symbol 'close' = 0x7f1d69b1fe80
jmptable 'close' = 0x11759
symbol 'rte_vhost_update_totalpkts' is defined and global, we don't check for overrition
symbol 'rte_vhost_update_totalpkts' = 0x7f1d705af100
symbol 'vhost_user_msg_handler.kpatch' is defined and global, we don't check for overrition
symbol 'vhost_user_msg_handler.kpatch' = 0xc469
symbol 'read' = 0x7f1d69b1f7d0
jmptable 'read' = 0x11769
symbol '__tls_get_addr' = 0x7f1d714a2400
jmptable '__tls_get_addr' = 0x11779
symbol 'rte_vhost_enqueue_burst.kpatch' is defined and global, we don't check for overrition
symbol 'rte_vhost_enqueue_burst.kpatch' = 0x4b9
symbol '__fxstat64' = 0x7f1d69b1f140
jmptable '__fxstat64' = 0x11789
symbol 'vhost_user_msg_handler' is defined and global, we don't check for overrition
symbol 'vhost_user_msg_handler' = 0x7f1d705bb270
Executing callrax 7f1d69ac4920 (pid 74232)
wait_for_stop(pctx->pid=74232, pid=74232)
symbol 'memcpy' = 0x7f1d69b80cd0
jmptable 'memcpy' = 0x11799
symbol 'notify_ops' is defined and global, we don't check for overrition
symbol 'notify_ops' = 0x7f1d707cf208
symbol 'eventfd_write' = 0x7f1d69b2e640
jmptable 'eventfd_write' = 0x117a9
symbol 'mmap64' = 0x7f1d69b28960
jmptable 'mmap64' = 0x117b9
symbol 'malloc' = 0x7f1d69ab60c0
jmptable 'malloc' = 0x117c9
symbol 'munmap' = 0x7f1d69b28a20
jmptable 'munmap' = 0x117d9
symbol 'madvise' = 0x7f1d69b28ae0
jmptable 'madvise' = 0x117e9
Executing callrax 7f1d69abf710 (pid 74232)
wait_for_stop(pctx->pid=74232, pid=74232)
symbol 'memmove' = 0x7f1d69b86270
jmptable 'memmove' = 0x117f9
symbol 'rte_vhost_enqueue_burst' is defined and global, we don't check for overrition
symbol 'rte_vhost_enqueue_burst' = 0x7f1d705af170
kpatch_patch.c(393): Patching 393, ------------------------------------
Applying relocations for 'librte_vhost.so.3'...
applying relocations to '.kpatch.text'
applying relocations to '.kpatch.info'
applying relocations to '.kpatch.data'
kpatch_patch.c(397): Patching 397, ------------------------------------
kpatch_patch.c(400): Patching 400, ------------------------------------
kpatch_patch.c(497): Patching librte_vhost.so.3 failed, unapplying partially applied patch
Verifying safety for pid 74232...
Stacktrace to verify safety for pid 74232:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x55cff5add826] _init+0x14672e
[0x55cff5ac1fda] _init+0x12aee2
[0x55cff599b5d9] _init+0x44e1
[0x7f1d69a57c05] __libc_start_main+0xf5
[0x55cff599c32d] _init+0x5235
[0x0]
OK
Verifying safety for pid 74233...
Stacktrace to verify safety for pid 74233:
[0x7f1d6a55298d] __accept_nocancel+0x24
[0x7f1d6ea3b618] rte_thread_setname+0xde8
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74234...
Stacktrace to verify safety for pid 74234:
[0x7f1d69b2e923] __epoll_wait_nocancel+0x2a
[0x7f1d6ea3ea14] rte_exit+0x5e4
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74235...
Stacktrace to verify safety for pid 74235:
[0x7f1d6a552b03] __recvfrom_nocancel+0x2a
[0x7f1d707d33fa]
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74236...
Stacktrace to verify safety for pid 74236:
[0x7f1d69af51ad] __nanosleep_nocancel+0x24
[0x7f1d69af5044] sleep+0xd4
[0x55cff5ae934f] _init+0x152257
[0x55cff5b11316] _init+0x17a21e
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74237...
Stacktrace to verify safety for pid 74237:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x7f1d705af11f] fdset_event_dispatch+0x6f
[0x7f1d705b0270] rte_vhost_driver_session_start+0x10
[0x55cff5b0d60b] _init+0x176513
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74238...
Stacktrace to verify safety for pid 74238:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x55cff5add826] _init+0x14672e
[0x55cff5ac1fda] _init+0x12aee2
[0x55cff5aa4eeb] _init+0x10ddf3
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74239...
Stacktrace to verify safety for pid 74239:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x55cff5add826] _init+0x14672e
[0x55cff5ac1fda] _init+0x12aee2
[0x55cff5b343e4] _init+0x19d2ec
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74253...
Stacktrace to verify safety for pid 74253:
[0x7f1d69af51ad] __nanosleep_nocancel+0x24
[0x7f1d69af5044] sleep+0xd4
[0x7f1d705afa0e] vhost_user_client_reconnect+0x16e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74257...
Stacktrace to verify safety for pid 74257:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x55cff5add826] _init+0x14672e
[0x55cff5ac1fda] _init+0x12aee2
[0x55cff59d84e1] _init+0x413e9
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74258...
Stacktrace to verify safety for pid 74258:
[0x7f1d69b23a3d] __poll_nocancel+0x24
[0x55cff5add826] _init+0x14672e
[0x55cff5ac1fda] _init+0x12aee2
[0x55cff59d8f90] _init+0x41e98
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74259...
Stacktrace to verify safety for pid 74259:
[0x55cff5a0f54b] _init+0x78453
[0x55cff5a0f8da] _init+0x787e2
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74260...
Stacktrace to verify safety for pid 74260:
[0x7f1d70e0e5f3]
[0x55cff5b11cc9] _init+0x17abd1
[0x55cff5a3eae1] _init+0xa79e9
[0x55cff5a0f566] _init+0x7846e
[0x55cff5a0f8da] _init+0x787e2
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74261...
Stacktrace to verify safety for pid 74261:
[0x55cff5a0f578] _init+0x78480
[0x55cff5a0f8da] _init+0x787e2
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
Verifying safety for pid 74262...
Stacktrace to verify safety for pid 74262:
[0x55cff5a0f975] _init+0x7887d
[0x55cff5aa6786] _init+0x10f68e
[0x7f1d6a54be25] start_thread+0xc5
[0x7f1d69b2e34d] clone+0x6d
[0x0]
OK
kpatch_patch.c(246): Patching 246, -----------safety safety safety------------------------
munmap_remote: 0x9+11730
Executing syscall 11 (pid 74232)...
wait_for_stop(pctx->pid=74232, pid=74232)
kpatch_patch.c(504): Can't unapply patch for librte_vhost.so.3

Detaching from 74232...OK
Detaching from 74233...OK
Detaching from 74234...OK
Detaching from 74235...OK
Detaching from 74236...OK
Detaching from 74237...OK
Detaching from 74238...OK
Detaching from 74239...OK
Detaching from 74253...OK
Detaching from 74257...OK
Detaching from 74258...OK
Detaching from 74259...OK
Detaching from 74260...OK
Detaching from 74261...OK
Detaching from 74262...OK
Finished ptrace detaching.Failed to apply patch './libshared.kpatch'
kpatch_patch.c(588): Failed to apply patch './libshared.kpatch'

@paboldin
Contributor

A function that runs the main loop cannot be patched: it is always on the stack, and there is (almost) no way to patch it correctly.

We patch a function by overwriting its first instructions with a jmp to the patched version of the function. If the function never returns, doing so is pointless: execution never leaves the loop body, so the patched version is never entered. Sending SIGSTOP won't help either, because it stops the application inside the event loop.
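
To make the jmp rewrite concrete, here is a minimal sketch of how such a jump could be encoded. It assumes the 5-byte x86-64 `jmp rel32` form (opcode 0xE9); this thread does not say which encoding libcare actually uses, and `encode_jmp_rel32` is a hypothetical helper written for this example, not a libcare function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5 /* one opcode byte plus a 32-bit displacement */

/*
 * Hypothetical helper (not libcare API): build the bytes that would replace
 * the first instructions of the function at `from` so that it jumps to `to`.
 * Returns 0 on success, -1 if `to` is out of reach of a 32-bit displacement.
 */
static int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t from, uint64_t to)
{
    assert(out != NULL);

    /* The displacement is relative to the end of the jmp instruction. */
    int64_t disp = (int64_t)(to - (from + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t u = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    /* x86 stores the displacement in little-endian byte order. */
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(u >> (8 * i));
    return 0;
}
```

The jump only takes effect when execution enters the function at its first instruction. A thread that is already deep inside the old body keeps running the old code, which is why a never-returning loop function cannot benefit from it.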

The easiest solution is to patch one of the functions that the event loop calls, if that is possible, and to drop the loop function from the patch so that there are no conflicts.
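
A toy illustration of why patching a callee works where patching the loop itself does not. The function pointer swap below only stands in for the jmp rewrite, and every name in it (`run_loop`, `handle_event_v1`, `handle_event_v2`) is made up for the example:

```c
#include <assert.h>

typedef int (*handler_fn)(int);

static int handle_event_v1(int ev) { return ev + 1; }   /* original callee */
static int handle_event_v2(int ev) { return ev + 100; } /* "patched" callee */

/* Stand-in for the redirect that a live patch installs on the callee. */
static handler_fn handle_event = handle_event_v1;

/*
 * The loop function itself is never modified. The callee is swapped just
 * before iteration `patch_at` (pass -1 to never swap it). Returns the sum
 * of the handler results so the effect is observable.
 */
static int run_loop(int iterations, int patch_at)
{
    assert(iterations >= 0);
    handle_event = handle_event_v1; /* start from the unpatched state */

    int sum = 0;
    for (int i = 0; i < iterations; i++) {
        if (i == patch_at)
            handle_event = handle_event_v2;
        sum += handle_event(i); /* each call enters the callee afresh */
    }
    return sum;
}
```

Because the loop re-enters the callee on every iteration, the new version takes over at the next call, while the loop function stays on the stack untouched.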

If you can, simplify your patch until each block contains only a single 'printf' statement and share it here, through a git repo, or privately via e-mail.

@chenzhbao629
Author

chenzhbao629 commented Oct 24, 2019

Hi paboldin, thank you for your reply. I have a question, though: why does "Verifying safety for pid xxxx..." have to pass for a thread that doesn't even call the function being patched?

@cloudlinux cloudlinux deleted a comment from chenzhbao629 Oct 24, 2019
@paboldin
Contributor

I can't be sure that no thread calls the target function. Judging from the output in your original comment, the code believes a target function is on the stack.
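
For context, a check of this kind has to look at every attached thread, since any of them might currently be inside a function that is about to be overwritten. Below is a rough sketch of the idea, not libcare's actual code: `struct func_range` and `find_func_on_stack` are hypothetical names, and the stack is assumed to be already unwound into a plain array of instruction addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address range occupied by one function that the patch wants to replace. */
struct func_range {
    uint64_t start;
    uint64_t size;
};

/*
 * Hypothetical check for one thread: returns the index of the first patched
 * function that has a frame on this stack, or -1 if the stack is "safe".
 */
static int find_func_on_stack(const uint64_t *frames, size_t nframes,
                              const struct func_range *funcs, size_t nfuncs)
{
    assert(frames != NULL || nframes == 0);
    assert(funcs != NULL || nfuncs == 0);

    for (size_t i = 0; i < nframes; i++) {
        for (size_t j = 0; j < nfuncs; j++) {
            /* A frame address inside [start, start + size) means the
             * thread is executing, or will return into, that function. */
            if (frames[i] >= funcs[j].start &&
                frames[i] - funcs[j].start < funcs[j].size)
                return (int)j;
        }
    }
    return -1;
}
```

Note that such a check works on address ranges, not on function names, so a frame can match a target range even when the symbolized name printed next to it is a different one.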

If you can, please provide at least the following outputs, here or privately:
$ strings PATCHFILE
$ diff -u lib/librte_vhost/.kpatch_fd_manoriginal.s lib/librte_vhost/.kpatch_fd_manpatched.s

This will help me see whether the patch really contains the target function.

Preferably, just show me the patch.

@chenzhbao629
Author

chenzhbao629 commented Oct 24, 2019 via email

@paboldin
Contributor

The files you sent are not displayed on GitHub; please send them to me at boldin.pavel@gmail.com.

The problem here may be inlined functions. When a function is declared static, it becomes local to the file, and the compiler can insert its body in place of each call to save the function call overhead. This usually happens when the function is called only once in the whole file.

This might be your case as well. But, again, I will need both the patch and the binary patch to be sure.
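
A small example of the inlining scenario described above. The names `clamp_len` and `poll_loop_step` are invented for illustration, and whether the compiler actually inlines the helper depends on the compiler and optimization level:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: static, with exactly one caller in this file.
 * An optimizing compiler is free to inline it into that caller. */
static size_t clamp_len(long len)
{
    return len < 0 ? 0 : (size_t)len;
}

/* Hypothetical caller, standing in for a function that a long-running
 * loop keeps on the stack. */
size_t poll_loop_step(const long *lens, size_t n)
{
    assert(lens != NULL || n == 0);

    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += clamp_len(lens[i]); /* may become inline code here */
    return total;
}
```

If `clamp_len` is inlined, a source change that touches only `clamp_len` also changes the generated code of `poll_loop_step`, so the binary patch may end up replacing the caller even though the source patch never mentions it. Comparing the original and patched assembly, as requested earlier in the thread, is one way to see which functions really changed.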
