Reduce and handle EAGAIN errors on AIO label reads
At least FreeBSD has a limit of 256 simultaneous AIO requests per
process.  Attempts to issue more result in EAGAIN errors.  Since we
issue 4 requests per disk/partition from 2 x CPUs threads, it is
quite easy to reach that limit on large systems, which results in
random pool import failures.  It annoyed me for quite a while on
a system with 64 CPUs and 70+ partitioned disks.
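
As a rough illustration of the arithmetic above, the sketch below estimates the peak number of outstanding AIO requests before the patch.  The function name and the device count used in the example are hypothetical; only the factors of 4 requests per device and 2 threads per CPU come from the description.

```c
#include <assert.h>

/* Label reads issued per disk/partition, per the commit message. */
#define	AIO_PER_DEVICE	4

/*
 * Worst-case number of AIO requests in flight at once: every busy worker
 * thread has one device's worth of label reads outstanding, and no more
 * threads can be busy than there are devices to scan.
 */
long
peak_aio_requests(long ncpus, long ndevices)
{
	assert(ncpus > 0 && ndevices >= 0);

	long threads = 2 * ncpus;	/* pool size before this patch */
	long busy = threads < ndevices ? threads : ndevices;

	return (busy * AIO_PER_DEVICE);
}
```

With 64 CPUs and an assumed 140 disks and partitions to scan, peak_aio_requests(64, 140) gives 128 busy threads and 512 requests, twice the 256-request limit.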

On the one hand, this patch limits the number of threads to avoid
the error; on the other, it should softly fall back to sync reads in
case of error.  It takes into account _SC_AIO_MAX as a system-wide
AIO limit and _SC_AIO_LISTIO_MAX as the closest value to a
per-process limit.  The latter is not exactly right, but it is the
best I found.
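
The thread cap described above can be sketched in isolation as follows.  This is a minimal standalone version, not the patch itself: cap_threads() and import_thread_count() are hypothetical names, LABELS_PER_DEVICE stands in for VDEV_LABELS, and the guard against a failing CPU-count query is an addition of this sketch.

```c
#include <assert.h>
#include <unistd.h>

/* Stands in for VDEV_LABELS: 4 label reads per disk/partition. */
#define	LABELS_PER_DEVICE	4

/*
 * Lower the thread count so that threads * LABELS_PER_DEVICE fits in
 * aio_limit.  A limit below LABELS_PER_DEVICE (including the -1 that
 * sysconf() returns for an indeterminate or unsupported limit) is ignored.
 */
long
cap_threads(long threads, long aio_limit)
{
	assert(threads > 0);

	if (aio_limit >= LABELS_PER_DEVICE) {
		long fit = aio_limit / LABELS_PER_DEVICE;
		if (fit < threads)
			threads = fit;
	}
	return (threads);
}

/* Apply both limits named in the commit message, where available. */
long
import_thread_count(void)
{
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	/* The fallback to 1 thread is an assumption of this sketch. */
	long threads = ncpus > 0 ? 2 * ncpus : 1;

#ifdef _SC_AIO_LISTIO_MAX
	threads = cap_threads(threads, sysconf(_SC_AIO_LISTIO_MAX));
#endif
#ifdef _SC_AIO_MAX
	threads = cap_threads(threads, sysconf(_SC_AIO_MAX));
#endif
	return (threads);
}
```

If a limit of 256 were reported on the 64-CPU system, cap_threads(128, 256) would return 64, so at most 64 x 4 = 256 requests would be outstanding at once.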

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
amotin committed Sep 19, 2024
1 parent 4d469ac commit b60db7b
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion lib/libzutil/zutil_import.c
@@ -1071,6 +1071,7 @@ zpool_read_label(int fd, nvlist_t **config, int *num_labels)
 					 * Try the slow method.
 					 */
 					zfs_fallthrough;
+				case EAGAIN:
 				case EOPNOTSUPP:
 				case ENOSYS:
 					do_slow = B_TRUE;
@@ -1464,7 +1465,21 @@ zpool_find_import_impl(libpc_handle_t *hdl, importargs_t *iarg,
 	 * validating labels, a large number of threads can be used due to
 	 * minimal contention.
 	 */
-	t = tpool_create(1, 2 * sysconf(_SC_NPROCESSORS_ONLN), 0, NULL);
+	long threads = 2 * sysconf(_SC_NPROCESSORS_ONLN);
+#ifdef HAVE_AIO_H
+	long am;
+#ifdef _SC_AIO_LISTIO_MAX
+	am = sysconf(_SC_AIO_LISTIO_MAX);
+	if (am >= VDEV_LABELS)
+		threads = MIN(threads, am / VDEV_LABELS);
+#endif
+#ifdef _SC_AIO_MAX
+	am = sysconf(_SC_AIO_MAX);
+	if (am >= VDEV_LABELS)
+		threads = MIN(threads, am / VDEV_LABELS);
+#endif
+#endif
+	t = tpool_create(1, threads, 0, NULL);
 	for (slice = avl_first(cache); slice;
 	    (slice = avl_walk(cache, slice, AVL_AFTER)))
 		(void) tpool_dispatch(t, zpool_open_func, slice);
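
The first hunk adds EAGAIN to the status codes that select the slow path.  The sketch below shows that decision on its own, assuming one status code is collected per label read; both function names are hypothetical and the surrounding cleanup logic of zpool_read_label() is not reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* True for status codes that the switch in the hunk maps to do_slow. */
bool
status_wants_sync_retry(int status)
{
	switch (status) {
	case EAGAIN:		/* AIO request limit reached; new in this patch */
	case EOPNOTSUPP:
	case ENOSYS:
		return (true);
	default:
		return (false);
	}
}

/* Decide on the slow path if any of the n per-request codes asks for it. */
bool
need_slow_path(const int *status, int n)
{
	assert(status != NULL && n >= 0);

	for (int i = 0; i < n; i++) {
		if (status_wants_sync_retry(status[i]))
			return (true);
	}
	return (false);
}
```

Under this sketch, a single EAGAIN among the four label reads is enough to retry with synchronous reads, while an unrelated error such as EIO is not.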