Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resiliency can be improved when facing constrained memory situations #5515

Open
bfaccini opened this issue Jan 5, 2023 · 4 comments
Open
Labels
DAOS DAOS-related Priority: 3 medium Type: Bug A previously unknown bug in PMDK
Milestone

Comments

@bfaccini
Copy link
Contributor

bfaccini commented Jan 5, 2023

Environment Information

  • PMDK package version(s): 1.12.1
  • OS(es) version(s): CentOS 7.9
  • ndctl version(s): ndctl-65-5
  • kernel version(s): 3.10.0-1160.45.1.el7.x86_64
  • compiler, libraries, packaging and other related tools version(s): DAOS v2.3

Please provide a reproduction of the bug:

Running DAOS Servers as a simple user with memory limits.

How often bug is revealed: (always, often, rare): rare

Actual behavior:

Silent abort() or SEGV, leading to unnecessary/wasted time spent to debug.

Expected behavior:

graceful handling.

Details

1st case/abort() encountered with a stack trace like following :

#0  0x00002aeb80a98387 in raise () from /lib64/libc.so.6
#1  0x00002aeb80a99a78 in abort () from /lib64/libc.so.6
#2  0x00002aeb803d48bc in out_get_errormsg () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#3  0x00002aeb803d4978 in out_err () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#4  0x00002aeb803d4fcc in ravl_emplace () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#5  0x00002aeb803f720e in pmemobj_tx_add_common.constprop.21 () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#6  0x00002aeb803f814c in pmemobj_tx_add_range_direct () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#7  0x00002aeb7ec7fadf in pmem_tx_add_ptr (umm=<optimized out>, ptr=<optimized out>, size=<optimized out>) at src/common/mem.c:133
#8  0x00002aeb7f4a0823 in umem_tx_add_ptr (size=8, ptr=0x2af703cd2bd8, umm=0x564a24f6fb80) at src/include/daos/mem.h:470
#9  ilog_ptr_set_full (lctx=lctx@entry=0x564a24f6fb30, dest=dest@entry=0x2af703cd2bd8, src=src@entry=0x2aedb9aa3490, len=len@entry=8) at src/vos/ilog.c:345
#10 0x00002aeb7f4a0b95 in update_inplace (lctx=0x564a24f6fb30, id_out=0x2af703cd2bd8, id_in=0x2aedb9aa3660, opc=1, is_equal=<optimized out>) at src/vos/ilog.c:661
#11 0x00002aeb7f4a11af in ilog_modify (loh=..., id_in=id_in@entry=0x2aedb9aa3660, epr=epr@entry=0x2aedb9aa35e0, opc=opc@entry=1) at src/vos/ilog.c:926
#12 0x00002aeb7f4a411e in ilog_persist (loh=..., id=id@entry=0x2aedb9aa3660) at src/vos/ilog.c:1017
#13 0x00002aeb7f485ef4 in dtx_ilog_rec_release (abort=96, dae=0x2aedbd922838, rec=<optimized out>, cont=0x0, umm=0x2aedbd922838) at src/vos/vos_dtx.c:561
#14 do_dtx_rec_release (umm=umm@entry=0x2aedbd5ae2b8, cont=cont@entry=0x2aedbd6efe20, dae=dae@entry=0x2aedbd922838, rec=<optimized out>, abort=abort@entry=false) at src/vos/vos_dtx.c:583
#15 0x00002aeb7f4861ba in do_dtx_rec_release (abort=false, rec=<optimized out>, dae=0x2aedbd922838, cont=0x2aedbd6efe20, umm=0x2aedbd5ae2b8) at src/vos/vos_dtx.c:578
#16 dtx_rec_release (cont=cont@entry=0x2aedbd6efe20, dae=0x2aedbd922838, abort=abort@entry=false) at src/vos/vos_dtx.c:688
#17 0x00002aeb7f48a02c in vos_dtx_commit_one (fatal=<synthetic pointer>, rm_cos=0x2aedbfe5745d, dae_p=0x2aedbfe92a58, dce_p=<synthetic pointer>, cmt_time=946528699681341440, epoch=0, dti=0x2aedbff49588, cont=0x2aedbd6efe20)
    at src/vos/vos_dtx.c:856
#18 vos_dtx_commit_internal (cont=cont@entry=0x2aedbd6efe20, dtis=dtis@entry=0x2aedbff47050, count=count@entry=512, epoch=epoch@entry=0, rm_cos=rm_cos@entry=0x2aedbfe572d0, daes=daes@entry=0x2aedbfe91df0, dces=dces@entry=0x2aedbff4a060)
    at src/vos/vos_dtx.c:1949
#19 0x00002aeb7f48d058 in vos_dtx_commit (coh=..., dtis=dtis@entry=0x2aedbff47050, count=count@entry=512, rm_cos=rm_cos@entry=0x2aedbfe572d0) at src/vos/vos_dtx.c:2140
#20 0x00002aeb8b944e97 in dtx_commit (cont=cont@entry=0x2aedbd6efa50, dtes=0x2aedbff2eef0, dcks=0x2aedbff43040, count=count@entry=512, epoch=epoch@entry=0) at src/dtx/dtx_rpc.c:813
#21 0x00002aeb8b950a2c in dtx_batched_commit_one (arg=0x2aedbd6f0d80) at src/dtx/dtx_common.c:518
#22 0x00002aeb7ff7f4aa in ABTD_ythread_func_wrapper () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/argobots/lib/libabt.so.1
#23 0x00002aeb7ff7f651 in make_fcontext () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/argobots/lib/libabt.so.1
#24 0x0000000000000000 in ?? ()

Further corefile analysis along with associated pmdk/libpmemobj source code review indicates that the abort()/FATAL() comes from the error-return handling from the malloc() call in Last_errormsg_get() :

 static inline struct errormsg *
Last_errormsg_get(void)
{
        Last_errormsg_key_alloc();

        struct errormsg *errormsg = os_tls_get(Last_errormsg_key);
        if (errormsg == NULL) {
                errormsg = malloc(sizeof(struct errormsg));
                if (errormsg == NULL)
                        FATAL("!malloc");
                /* make sure it contains empty string initially */
                errormsg->msg[0] = '\0';
                int ret = os_tls_set(Last_errormsg_key, errormsg);
                if (ret)
                        FATAL("!os_tls_set");
        }
        return errormsg;
}
"src/core/out.c" [readonly] line 69 of 598 --11%-- col 3-10

and more source code analysis also indicates that this behaviour could be avoided if these allocation/initialisation stuff could be done upstream during out_init(). What do you think ??

2nd case/SEGV encountered with a stack trace like following :

#0  0x00002b6a9e82c42f in ucs_debug_backtrace_next (bckt=0x0, line=line@entry=0x2b950edea8c0) at debug/debug.c:495
#1  0x00002b6a9e82c6a7 in ucs_debug_print_backtrace (stream=0x2b6a98b3a1c0 <_IO_2_1_stderr_>, strip=strip@entry=2) at debug/debug.c:656
#2  0x00002b6a9e82e945 in ucs_handle_error (message=message@entry=0x2b6a9e8f0d27 "tkill(2) or tgkill(2)") at debug/debug.c:1081
#3  0x00002b6a9e82ed0c in ucs_debug_handle_error_signal (signo=signo@entry=11, cause=0x2b6a9e8f0d27 "tkill(2) or tgkill(2)", fmt=fmt@entry=0x2b6a9e8f0f8b " at address %p") at debug/debug.c:1033
#4  0x00002b6a9e82ef7b in ucs_error_signal_handler (signo=11, info=0x2b950edead30, context=<optimized out>) at debug/debug.c:1055
#5  <signal handler called>
#6  0x00002b6a9810266c in palloc_defer_free_create () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#7  0x00002b6a981090a1 in pmemobj_tx_xfree () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/pmdk/lib/libpmemobj.so.1
#8  0x00002b6a9698fbae in pmem_tx_free (umm=0x2b6d155ae2b8, umoff=72350928) at src/common/mem.c:92
#9  0x00002b6a971960e2 in dtx_rec_release (cont=cont@entry=0x2b6d156efe20, dae=0x2b6d174b4d60, abort=abort@entry=true) at src/vos/vos_dtx.c:665
#10 0x00002b6a9719d9ea in vos_dtx_abort (coh=..., dti=0x2b6d17f87cf0, epoch=epoch@entry=946528706794618887) at src/vos/vos_dtx.c:2204
#11 0x00002b6aa3654aea in dtx_abort (cont=cont@entry=0x2b6d156efa50, dte=dte@entry=0x2b6d17f87cf0, epoch=946528706794618887) at src/dtx/dtx_rpc.c:896
#12 0x00002b6aa3664382 in dtx_leader_end (dlh=0x2b6d17f87cf0, coh=<optimized out>, result=<optimized out>) at src/dtx/dtx_common.c:1290
#13 0x00002b6aa3d9d114 in ds_obj_rw_handler (rpc=0x2b6d17db1db0) at src/object/srv_obj.c:2732
#14 0x00002b6a96ed1f68 in crt_handle_rpc (arg=0x2b6d17db1db0) at src/cart/crt_rpc.c:1654
#15 0x00002b6a97c8f4aa in ABTD_ythread_func_wrapper () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/argobots/lib/libabt.so.1
#16 0x00002b6a97c8f651 in make_fcontext () from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../prereq/release/argobots/lib/libabt.so.1
#17 0x00002b950ecec000 in ?? ()
#18 0x0000000000100000 in ?? ()
#19 0x00002b6a96ed1f20 in ?? () at src/cart/crt_rpc.c:663 from /work2/08126/dbohninx/frontera/BUILDS/daos-11940_pr10837/20221116/daos/install/bin/../lib64/libcart.so.4
#20 0x00002b6d17db1db0 in ?? ()
#21 0x00002b950edebfe0 in ?? ()
#22 0x00002b950edebfe0 in ?? ()
#23 0x000055e1455968d0 in ?? ()
#24 0x000055e14503f5f0 in spi_key_cmp (htable=<optimized out>, rlink=<optimized out>, key=<optimized out>, len=<optimized out>) at src/engine/sched.c:289
#25 0x0000000000000000 in ?? ()

Further corefile analysis along with associated pmdk/libpmemobj source code review indicates that the SEGV has occurred due to a 0xffffffffffffff80 invalid action pointer being returned by tx_action_add() and to be passed as 3rd parameter to palloc_defer_free() in pmemobj_tx_xfree(), and looking more into the associated code (in core/out.c, libpmemobj/palloc.c, common/mem.c, libpmemobj/tx.c, common/vec.h), this should be caused by the fact tx->actions->buffer value is NULL due to a previous Realloc() error in vec_reserve().
For this issue, I believe this could be handled by testing the VEC_INC_BACK() negative return value in tx_action_add() and thus have this latter to return NULL in this case instead of an invalid pointer. What do you think ?

I would be happy to push a PR for both issues if you agree with my analysis, just let me know.

Additional information about Priority and Help Requested:

Are you willing to submit a pull request with a proposed change? Yes

Requested priority: Medium

@bfaccini bfaccini added the Type: Bug A previously unknown bug in PMDK label Jan 5, 2023
@pbalcer
Copy link
Member

pbalcer commented Jan 10, 2023

  1. This structure is allocated on the first call for the thread. I don't think it's possible to preallocate it in out_init unless you specifically know the number of threads that will ever call this function. Maybe we should have a statically allocated special const string that we return from the Last_errormsg_get in case of an error?
  2. Yes, checking the return value of VEC_INC_BACK in the tx_action_add should solve the problem.

Unfortunately, there are also other places where we use VEC_PUSH_BACK without proper error-checking. Most of them are relatively easy to fix, but there's one where there's a potential error in the tx_commit path (where we cannot fail, and crashing is likely preferable).

operation_add_user_buffer(tx->lane->external, userbuf);
VEC_PUSH_BACK(&ctx->next, buffer_offset);

I think the fix here is to preallocate the next vector in tx_construct_user_buffer, somewhere around here:
if (VEC_PUSH_BACK(&tx->redo_userbufs, userbuf) != 0)

@bfaccini
Copy link
Contributor Author

  1. This structure is allocated on the first call for the thread. I don't think it's possible to preallocate it in out_init unless you specifically know the number of threads that will ever call this function. Maybe we should have a statically allocated special const string that we return from the Last_errormsg_get in case of an error?

Ah yes, I forgot to check if out_init() was also executed in the context of each thread !!... I will try to find an other place in the early steps of each thread.

  1. Yes, checking the return value of VEC_INC_BACK in the tx_action_add should solve the problem.

So, should I at least fix this one this way ?

Unfortunately, there are also other places where we use VEC_PUSH_BACK without proper error-checking. Most of them are relatively easy to fix, but there's one where there's a potential error in the tx_commit path (where we cannot fail, and crashing is likely preferable).

I will have a look to these other "simple" cases and to fix them the same way.

operation_add_user_buffer(tx->lane->external, userbuf);

VEC_PUSH_BACK(&ctx->next, buffer_offset);

I think the fix here is to preallocate the next vector in tx_construct_user_buffer, somewhere around here:

if (VEC_PUSH_BACK(&tx->redo_userbufs, userbuf) != 0)

Sorry, I don't get this. Is this related to the very special case you stated in the "tx_commit path" ??

@pbalcer
Copy link
Member

pbalcer commented Jan 12, 2023

  1. I'm not aware of any portable and reliable way of hooking into thread construction. I'd say it would be easier to just return NULL in out_get_errormsg and handle that in out_error (by printing some static string saying that there's no memory to properly format error strings).
  2. Yes.
  3. Yes, that's my proposal for how to fix the potential out-of-memory error that might happen in tx_commit.

@bfaccini
Copy link
Contributor Author

Sorry to be late on this...

  1. I'm not aware of any portable and reliable way of hooking into thread construction. I'd say it would be easier to just return NULL in out_get_errormsg and handle that in out_error (by printing some static string saying that there's no memory to properly format error strings).

Ok, after looking more how our DAOS code uses PMDK code, I understand that threads creation is not handled in PMDK, so I better have to follow your idea ;-)
See PR-5536

  1. Yes.

And including the other occurrences you had pointed already...
This is in PR-5537

  1. Yes, that's my proposal for how to fix the potential out-of-memory error that might happen in tx_commit.

Hmm, I have tried to do something in this direction also in PR-5537 ...

@grom72 grom72 added this to the 1.14 milestone May 15, 2023
@grom72 grom72 added the DAOS DAOS-related label May 25, 2023
@janekmi janekmi modified the milestones: 2.0.0, 2.0.1 Jul 21, 2023
@janekmi janekmi modified the milestones: 2.0.1, 2.0.2 Nov 23, 2023
@janekmi janekmi modified the milestones: 2.1.0, 2.x Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DAOS DAOS-related Priority: 3 medium Type: Bug A previously unknown bug in PMDK
Projects
None yet
Development

No branches or pull requests

4 participants