
Test case: add_nested_replacing_spare #7247

Closed · behlendorf opened this issue Feb 27, 2018 · 1 comment · Fixed by #7342
Labels: Component: Test Suite

behlendorf (Contributor) commented Feb 27, 2018

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  17.10
Linux Kernel
Architecture          x86_64
ZFS Version           zfs-0.7.0-334-g7088545
SPL Version           spl-0.7.0-29-g378c6ed

Describe the problem you're observing

A rare, intermittent failure of the add_nested_replacing_spare test case.

Describe how to reproduce the problem

Rarely reproduced by buildbot during automated testing.

Include any warnings/errors/backtraces from the system logs

http://build.zfsonlinux.org/builders/Ubuntu%2017.10%20x86_64%20%28TEST%29/builds/347

Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zpool_add/add_nested_replacing_spare (run as root) [01:37] [FAIL]
20:48:57.15 ASSERTION: 'zpool add' works with nested replacing/spare vdevs
20:48:57.16 SUCCESS: mkdir /var/tmp/zed
20:48:57.17 SUCCESS: touch /var/tmp/zed/vdev_id.conf
20:48:57.17 SUCCESS: ln -s /var/tmp/zed/vdev_id.conf /etc/zfs/vdev_id.conf
20:48:57.18 SUCCESS: cp /etc/zfs/zed.d/zed.rc /var/tmp/zed
20:48:57.19 SUCCESS: cp /etc/zfs/zed.d/zed-functions.sh /var/tmp/zed
20:48:57.19 SUCCESS: sed -i /\#ZED_DEBUG_LOG=.*/d /var/tmp/zed/zed.rc
20:48:57.20 SUCCESS: umask 0022
20:48:57.21 SUCCESS: cp /usr/lib/x86_64-linux-gnu/zfs/zed.d/all-syslog.sh /var/tmp/zed
20:48:57.21 SUCCESS: cp /usr/lib/x86_64-linux-gnu/zfs/zed.d/all-debug.sh /var/tmp/zed
20:48:57.22 SUCCESS: umask 0000
20:48:57.23 NOTE: Starting ZED
20:48:57.24 SUCCESS: truncate -s 0 /var/tmp/zed/zed.debug.log
20:48:57.24 SUCCESS: eval zed -vF -d /var/tmp/zed -p /var/tmp/zed/zed.pid -s /var/tmp/zed/state 2>/var/tmp/zed/zed.log &
20:48:58.57 SUCCESS: zpool create testpool mirror /mnt/fault-dev /mnt/safe-dev1 /mnt/safe-dev2 /mnt/safe-dev3
20:48:58.71 SUCCESS: zpool add testpool spare /mnt/spare-dev1
20:48:58.72 Added handler 1 with the following properties:
20:48:58.72   pool: testpool
20:48:58.72   vdev: 51cb22be12bd2f68
20:48:58.73 SUCCESS: zinject -d /mnt/fault-dev -e nxio -T all -f 100 testpool
20:48:58.83 SUCCESS: zpool scrub testpool
20:48:58.87 SUCCESS: wait_vdev_state testpool /mnt/fault-dev UNAVAIL 60
20:49:30.37 SUCCESS: wait_vdev_state testpool /mnt/spare-dev1 ONLINE 60
20:49:30.40 SUCCESS: wait_hotspare_state testpool /mnt/spare-dev1 INUSE
20:49:30.42 SUCCESS: check_state testpool  DEGRADED
20:49:30.65 SUCCESS: zpool replace testpool /mnt/fault-dev /mnt/replace-dev
20:49:30.71 SUCCESS: wait_vdev_state testpool /mnt/replace-dev ONLINE 60
20:49:31.03 SUCCESS: zpool add testpool spare /mnt/spare-dev2
20:49:31.07 SUCCESS: wait_hotspare_state testpool /mnt/spare-dev2 AVAIL
20:49:31.35 SUCCESS: zpool add -f testpool /mnt/add-dev
20:49:31.38 SUCCESS: wait_vdev_state testpool /mnt/add-dev ONLINE 60
20:49:31.39 removed all registered handlers
20:49:31.40 SUCCESS: zinject -c all
20:49:31.66 SUCCESS: zpool destroy -f testpool
20:49:31.68 SUCCESS: rm -f /mnt/fault-dev /mnt/safe-dev1 /mnt/safe-dev2 /mnt/safe-dev3 /mnt/replace-dev /mnt/add-dev /mnt/spare-dev1 /mnt/spare-dev2
20:49:32.00 SUCCESS: zpool create testpool raidz1 /mnt/fault-dev /mnt/safe-dev1 /mnt/safe-dev2 /mnt/safe-dev3
20:49:32.16 SUCCESS: zpool add testpool spare /mnt/spare-dev1
20:49:32.17 Added handler 2 with the following properties:
20:49:32.17   pool: testpool
20:49:32.17   vdev: f209dc2f3fe83d73
20:49:32.18 SUCCESS: zinject -d /mnt/fault-dev -e nxio -T all -f 100 testpool
20:49:32.28 SUCCESS: zpool scrub testpool
20:49:32.31 SUCCESS: wait_vdev_state testpool /mnt/fault-dev UNAVAIL 60
20:50:33.21 ERROR: wait_vdev_state testpool /mnt/spare-dev1 ONLINE 60 exited 1
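
The failing assertion is the test suite's wait_vdev_state helper timing out: the hot spare never reached ONLINE within 60 seconds after the injected fault. For context, a minimal sketch of the polling pattern such a helper uses follows; the function name, the awk-based parsing of 'zpool status', and the column layout are assumptions for illustration, not the test suite's actual implementation (which lives in libtest.shlib):

# Hypothetical sketch of a wait_vdev_state-style helper: poll the pool
# status until the named vdev reports the expected state, or time out.
wait_vdev_state_sketch() # pool vdev expected-state timeout-seconds
{
        typeset pool=$1 vdev=$2 state=$3 cur
        typeset -i i=0 timeout=${4:-60}
        while (( i < timeout )); do
                # For file-backed vdevs 'zpool status' prints the full path
                # in the first column and the state in the second.
                cur=$(zpool status "$pool" | awk -v d="$vdev" '$1 == d {print $2}')
                [[ "$cur" == "$state" ]] && return 0
                sleep 1
                (( i += 1 ))
        done
        return 1  # caller logs "ERROR: ... exited 1", as in the log above
}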
behlendorf added the Component: Test Suite label on Feb 27, 2018
loli10K (Contributor) commented Mar 1, 2018

This can, unfortunately, fail in another way: http://build.zfsonlinux.org/builders/CentOS%207%20x86_64%20Mainline%20%28TEST%29/builds/5725/steps/shell_8/logs/log

Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zpool_add/add_nested_replacing_spare (run as root) [00:00] [FAIL]
21:52:46.89 ASSERTION: 'zpool add' works with nested replacing/spare vdevs
21:52:46.90 SUCCESS: mkdir /var/tmp/zed
21:52:46.90 SUCCESS: touch /var/tmp/zed/vdev_id.conf
21:52:46.91 SUCCESS: ln -s /var/tmp/zed/vdev_id.conf /etc/zfs/vdev_id.conf
21:52:46.92 SUCCESS: cp /etc/zfs/zed.d/zed.rc /var/tmp/zed
21:52:46.92 SUCCESS: cp /etc/zfs/zed.d/zed-functions.sh /var/tmp/zed
21:52:46.93 SUCCESS: sed -i /\#ZED_DEBUG_LOG=.*/d /var/tmp/zed/zed.rc
21:52:46.94 NOTE: Starting ZED
21:52:46.95 SUCCESS: truncate -s 0 /var/tmp/zed/zed.debug.log
21:52:46.95 SUCCESS: eval zed -vF -d /var/tmp/zed -p /var/tmp/zed/zed.pid -P /var/tmp/constrained_path.T0sJ -s /var/tmp/zed/state 2>/var/tmp/zed/zed.log &
21:52:47.12 SUCCESS: zpool create testpool mirror /mnt/fault-dev /mnt/safe-dev1 /mnt/safe-dev2 /mnt/safe-dev3
21:52:47.20 SUCCESS: zpool add testpool spare /mnt/spare-dev1
21:52:47.21 Added handler 273 with the following properties:
21:52:47.21   pool: testpool
21:52:47.21   vdev: 3e41072b2fd21d1a
21:52:47.21 SUCCESS: zinject -d /mnt/fault-dev -e nxio -T all -f 100 testpool
21:52:47.32 cannot scrub testpool: currently resilvering
21:52:47.32 ERROR: zpool scrub testpool exited 1

It's not yet clear to me if/how zinject is responsible for starting a resilver.
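
Whatever starts it, the second failure mode is a scrub colliding with an already-running resilver: 'zpool scrub' returns an error ("currently resilvering") when a resilver is in progress. A hedged diagnostic sketch of how one could observe the collision just before the scrub is issued; the "resilver in progress" string match against the scan line of 'zpool status' is an assumption for illustration, not part of the test suite:

# Check the pool's scan status immediately before scrubbing; if a
# resilver is already running, the scrub below would fail exactly as
# in the log above.
if zpool status testpool | grep -q "resilver in progress"; then
        echo "resilver already running; 'zpool scrub testpool' would fail"
fi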

ofaaland mentioned this issue Mar 9, 2018
loli10K self-assigned this Mar 22, 2018
behlendorf pushed a commit that referenced this issue Apr 4, 2018
Use 'zpool reopen' instead of 'zpool scrub' to kick in the spare device:
this is required to avoid spurious failures caused by a race condition
in event processing by the ZFS Event Daemon (ZED):

P1 (zpool scrub)                            P2 (zed)
---
zfs_ioc_pool_scan()
 -> dsl_scan()
  -> vdev_reopen()
   -> vdev_set_state(VDEV_STATE_CANT_OPEN)
                                            zfs_ioc_vdev_attach()
                                             -> spa_vdev_attach()
                                              -> dsl_resilver_restart()
  -> dsl_sync_task()
   -> dsl_scan_setup_check()
   <- dsl_scan_setup_check(): EBUSY

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7247 
Closes #7342
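
In the test script itself, the fix amounts to swapping the command used to trigger fault detection. A hedged before/after sketch follows; log_must, $TESTPOOL, and $FAULT_DEV follow the test suite's naming conventions but are assumptions here, and the exact diff in #7342 may differ:

# Before (racy): the scrub reopens the vdevs and then tries to set up a
# scan; if ZED attaches the hot spare in between, dsl_scan_setup_check()
# returns EBUSY and the test logs "cannot scrub ...: currently resilvering".
log_must zpool scrub $TESTPOOL

# After (per the commit above): reopening the devices is enough for the
# injected nxio fault to be detected and the spare to kick in, without
# competing with the resilver restarted by ZED's spare attach.
log_must zpool reopen $TESTPOOL
log_must wait_vdev_state $TESTPOOL $FAULT_DEV "UNAVAIL" 60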