-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zed: Misc multipath autoreplace fixes #13023
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
tonyhutter
force-pushed
the
autoreplace-mpath
branch
2 times, most recently
from
January 28, 2022 01:38
e696eb7
to
a5da51b
Compare
tonyhutter
force-pushed
the
autoreplace-mpath
branch
3 times, most recently
from
February 10, 2022 20:24
19ab2be
to
145a3d9
Compare
behlendorf
reviewed
Feb 10, 2022
tonyhutter
force-pushed
the
autoreplace-mpath
branch
3 times, most recently
from
February 15, 2022 18:36
161848d
to
76f3ad7
Compare
@behlendorf this has been updated and is now passing all the bots. Would you mind taking another look when you get a chance |
behlendorf
reviewed
Feb 16, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, just a few comments.
tonyhutter
force-pushed
the
autoreplace-mpath
branch
from
February 23, 2022 22:22
76f3ad7
to
d15ca2a
Compare
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths wern't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Signed-off-by: Tony Hutter <hutter2@llnl.gov>
tonyhutter
force-pushed
the
autoreplace-mpath
branch
from
February 24, 2022 00:09
d15ca2a
to
f5984d0
Compare
behlendorf
approved these changes
Feb 24, 2022
behlendorf
added
Status: Accepted
Ready to integrate (reviewed, tested)
and removed
Status: Code Review Needed
Ready for review and testing
labels
Feb 24, 2022
tonyhutter
added a commit
to tonyhutter/zfs
that referenced
this pull request
Mar 21, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
nicman23
pushed a commit
to nicman23/zfs
that referenced
this pull request
Aug 22, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
nicman23
pushed a commit
to nicman23/zfs
that referenced
this pull request
Aug 22, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Aug 30, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
lundman
pushed a commit
to openzfsonwindows/openzfs
that referenced
this pull request
Sep 1, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
andrewc12
pushed a commit
to andrewc12/openzfs
that referenced
this pull request
Sep 23, 2022
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths weren't created at the time the udev event came in to zed, but this was never verified. This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk, rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk. Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#13023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation and Context
Fix multipath autoreplace not working in some cases
Description
We recently had a case where our operators replaced a bad multipathed disk, only to see it fail to autoreplace. The zed logs showed that the multipath replacement disk did not pass the 'is_dm' test in zfs_process_add() even though it should have. is_dm is set if there exists a sysfs entry for to the underlying /dev/sd* paths for the multipath disk. It's possible this path didn't exist due to a race condition where the sysfs paths wern't created at the time the udev event came in to zed.
This patch updates the check to look for udev properties that indicate if the new autoreplace disk is an empty multipath disk,
rather than looking for the underlying sysfs entries. It also adds in additional logging, and fixes a bug where zed allowed
you to use an already zfs-formatted disk from another pool as a multipath auto-replacement disk.
Furthermore, while testing this patch, I also ran across a case where a force-faulted disk did not have a ZPOOL_CONFIG_PHYS_PATH entry in its config. This prevented it from being autoreplaced. I added additional logic to derive the PHYS_PATH from the PATH if the PATH was a /dev/disk/by-vdev/ path. For example, if PATH was /dev/disk/by-vdev/L28, then PHYS_PATH would be L28. This is safe since by-vdev paths represent physical locations and do not change between boots.
How Has This Been Tested?
Tested by force faulting a disk and verifying autorplace worked. Also verified that you can't do a multipath autoreplace with a zfs-formatted disk, or with a disk that has existing partitions.
Types of changes
Checklist:
Signed-off-by
.