CORTX-31844: Client crashes on too many failures (Seagate#1838) (Seagate#1856)


Problem: as described in issue Seagate#1838, there is a case in which
the failure domain tree built for a pool version has 4 levels
(root, M0_CONF_PVER_LVL_ENCLS, M0_CONF_PVER_LVL_CTRLS,
M0_CONF_PVER_LVL_DRIVES) and the minimum numbers of children at the
top 3 levels are 3, 1 and 0.

Although symm_tree_attr_get() ends by calling tolerance_check() to
validate the failure settings, tolerance_check() only checks the
top 2 levels and ignores the 3rd level (M0_CONF_PVER_LVL_CTRLS).

After the above checks, m0_fd__tile_init() is called. It calls
pool_width_calc(), which asserts that the minimum number of children
at each of the top 3 levels (root, M0_CONF_PVER_LVL_ENCLS,
M0_CONF_PVER_LVL_CTRLS) is not 0. Because the number at
M0_CONF_PVER_LVL_CTRLS is 0, the assertion fails and the client panics.

Solution: to avoid the panic, add a check in symm_tree_attr_get()
that the minimum number of children at each level is greater than 0;
otherwise -EINVAL is returned.
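
The failing case (minimum child counts 3, 1, 0) and the kind of check described above can be illustrated with a minimal standalone sketch. It is not the Motr code: sketch_children_validate() and sketch_pool_width() are hypothetical names, and computing the pool width as a product of per-level child counts is an assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the new check: return -EINVAL as soon as
 * any level reports a minimum child count of 0, otherwise return 0.
 */
static int sketch_children_validate(const uint32_t *children_nr,
				    uint32_t depth)
{
	uint32_t i;

	for (i = 0; i < depth; ++i) {
		if (children_nr[i] == 0)
			return -EINVAL;
	}
	return 0;
}

/*
 * Hypothetical width calculation (assumed to be a product of the
 * per-level counts). Like the function in the commit message, it
 * asserts that no level has 0 children, so it must only be reached
 * after validation succeeds.
 */
static uint64_t sketch_pool_width(const uint32_t *children_nr,
				  uint32_t depth)
{
	uint64_t width = 1;
	uint32_t i;

	for (i = 0; i < depth; ++i) {
		assert(children_nr[i] != 0);
		width *= children_nr[i];
	}
	return width;
}
```

With counts {3, 1, 0} the validation step returns -EINVAL, so the caller can fail cleanly instead of reaching the assertion.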

* conf: check pvs_tolerance is greater than 0 before decreasing it

Signed-off-by: Sining Wu <sining.wu@seagate.com>
siningwuseagate authored and Mehul Joshi committed Jul 13, 2022
1 parent 5b68f63 commit b6b9208
Showing 2 changed files with 6 additions and 1 deletion.
5 changes: 4 additions & 1 deletion conf/pvers.c
@@ -702,8 +702,11 @@ static int conf_pver_tolerance_adjust(struct m0_conf_pver *pver)

 	do {
 		rc = m0_fd_tolerance_check(pver, &level);
-		if (rc == -EINVAL)
+		if (rc == -EINVAL &&
+		    pver->pv_u.subtree.pvs_tolerance[level] > 0)
 			M0_CNT_DEC(pver->pv_u.subtree.pvs_tolerance[level]);
+		else
+			break;
 	} while (rc == -EINVAL);
 	return rc;
 }
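
The guarded loop in this hunk can be exercised in isolation with a small sketch. Everything here is hypothetical: sketch_tolerance_check() stands in for m0_fd_tolerance_check(), and its rule (a level is valid only if its tolerance is strictly less than its child count) is an assumption chosen so that a level can fail even when its tolerance is already 0.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical checker: report the first level whose tolerance is not
 * strictly less than its child count. The real checker's rule may differ.
 */
static int sketch_tolerance_check(const uint32_t *tol,
				  const uint32_t *children_nr,
				  uint32_t nr, uint32_t *failed_level)
{
	uint32_t i;

	for (i = 0; i < nr; ++i) {
		if (tol[i] >= children_nr[i]) {
			*failed_level = i;
			return -EINVAL;
		}
	}
	return 0;
}

/*
 * Lower the tolerance of the failing level until the check passes.
 * If the failing level's tolerance is already 0, stop and return
 * -EINVAL instead of decrementing an unsigned zero.
 */
static int sketch_tolerance_adjust(uint32_t *tol,
				   const uint32_t *children_nr,
				   uint32_t nr)
{
	uint32_t level = 0;
	int      rc;

	do {
		rc = sketch_tolerance_check(tol, children_nr, nr, &level);
		if (rc == -EINVAL && tol[level] > 0)
			--tol[level];
		else
			break;
	} while (rc == -EINVAL);
	return rc;
}
```

In this sketch, a level with 0 children and tolerance 0 can never be fixed by lowering the tolerance, so the loop exits with -EINVAL rather than attempting the decrement.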
2 changes: 2 additions & 0 deletions fd/fd.c
@@ -340,6 +340,8 @@ static int symm_tree_attr_get(const struct m0_conf_pver *pv, uint32_t *depth,
 		if (rc < 0)
 			return M0_RC(rc);
 		children_nr[i] = level_info.psi_nr_objs;
+		if (children_nr[i] == 0)
+			return M0_ERR(-EINVAL);
 	}
 	/*
 	 * Total number of leaf nodes can be calculated by reducing elements of