From e5fcb7be1b67ade7ea7441c7536c181cfbfd8456 Mon Sep 17 00:00:00 2001 From: Serapheim Dimitropoulos Date: Tue, 1 Nov 2022 17:15:05 -0700 Subject: [PATCH] Expose zfs_vdev_open_timeout_ms as a tunable Some of our customers have been occasionally hitting zfs import failures in Linux because udevd doesn't create the by-id symbolic links in time for zpool import to use them. The main issue is that the systemd-udev-settle.service that zfs-import-cache.service and other services depend on is racy. There is also an openzfs issue filed (see https://github.com/openzfs/zfs/issues/10891) outlining the problem and potential solutions. With the proper solutions being significant in terms of complexity and the priority of the issue being low for the time being, this patch exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are experiencing this issue often can increase it as a workaround. Signed-off-by: Serapheim Dimitropoulos --- man/man4/zfs.4 | 7 +++++++ module/os/linux/zfs/vdev_disk.c | 5 ++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/man/man4/zfs.4 b/man/man4/zfs.4 index 2bfa3601fbcd..5b53b1310e49 100644 --- a/man/man4/zfs.4 +++ b/man/man4/zfs.4 @@ -1249,6 +1249,13 @@ Ideally, this will be at least the sum of each queue's .Sy max_active . .No See Sx ZFS I/O SCHEDULER . . +.It Sy zfs_vdev_open_timeout_ms Ns = Ns Sy 1000 Pq uint +Timeout value to wait before determining a device is missing +during import. +This is helpful for transient missing paths due +to links being briefly removed and recreated in response to +udev events. +. .It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq uint Maximum sequential resilver I/O operations active to each device. .No See Sx ZFS I/O SCHEDULER . diff --git a/module/os/linux/zfs/vdev_disk.c b/module/os/linux/zfs/vdev_disk.c index 11ed3ea6f4e0..84d191abb90b 100644 --- a/module/os/linux/zfs/vdev_disk.c +++ b/module/os/linux/zfs/vdev_disk.c @@ -56,7 +56,7 @@ static void *zfs_vdev_holder = VDEV_HOLDER; * device is missing. The missing path may be transient since the links * can be briefly removed and recreated in response to udev events. */ -static unsigned zfs_vdev_open_timeout_ms = 1000; +static uint_t zfs_vdev_open_timeout_ms = 1000; /* * Size of the "reserved" partition, in blocks. @@ -1042,3 +1042,6 @@ param_set_max_auto_ashift(const char *buf, zfs_kernel_param_t *kp) return (0); } + +ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, open_timeout_ms, UINT, ZMOD_RW, + "Timeout before determining that a device is missing");