zdb: show BRT statistics and dump its contents #15541

Closed
wants to merge 2 commits into from

Member

@robn robn commented Nov 18, 2023

Motivation and Context

Mostly, it's good to have ways to look inside every ZFS subsystem to see how it's doing, and BRT was lacking there. In particular, though, being able to dump the contents may at least have some utility in recovering from some of the recent cloning-adjacent bugs.

Description

Adds a -T switch to zdb that acts rather like -D, except for the BRT.

  • -T: show some top-level stats
  • -TT: also show per-vdev stats
  • -TTT: also dump the table

I've lifted the internal structures out of brt.c into brt_impl.h so that zdb can get at them. I don't use most of it, but it seems better to move them all and is consistent with other _impl.h files.

How Has This Been Tested?

Make a file, and clone it a bit:

zpool create tank loop0 loop1 loop2 loop3 loop4 loop5

dd if=/dev/random of=/tank/f bs=1M count=4 status=progress
zpool sync

for c in $(seq 32) ; do clonefile -t -c /tank/f /tank/c$c ; done
zpool sync
# zdb -T tank
BRT: used 4M; saved 128M; ratio 33.00x
# zdb -TT tank
BRT: used 4M; saved 128M; ratio 33.00x
BRT: vdev 0: refcnt 8; used 1M; saved 32M
BRT: vdev 1: refcnt 8; used 1M; saved 32M
BRT: vdev 2: refcnt 8; used 1M; saved 32M
BRT: vdev 3: empty
BRT: vdev 4: empty
BRT: vdev 5: refcnt 8; used 1M; saved 32M
# zdb -TTT tank
BRT: used 4M; saved 128M; ratio 33.00x
BRT: vdev 0: refcnt 8; used 1M; saved 32M
BRT: vdev 1: refcnt 8; used 1M; saved 32M
BRT: vdev 2: refcnt 8; used 1M; saved 32M
BRT: vdev 3: empty
BRT: vdev 4: empty
BRT: vdev 5: refcnt 8; used 1M; saved 32M

DVA              REFCNT
0:ae200          32
0:ee200          32
0:2e200          32
0:6e200          32
...
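
The ratio in the output above is consistent with (used + saved) / used: 32 clones of a 4M file save 128M, and (4M + 128M) / 4M gives 33.00x. A minimal Python sketch of that arithmetic follows. It assumes this is how the ratio is derived, and the function name `bclone_ratio` is illustrative, not part of zdb.

```python
def bclone_ratio(used_bytes: int, saved_bytes: int) -> float:
    """Ratio of referenced space to physical space actually used.

    Assumption: ratio = (used + saved) / used, which matches the sample
    output above; an empty BRT is treated as 1.00x.
    """
    if used_bytes == 0:
        return 1.0
    return (used_bytes + saved_bytes) / used_bytes


MiB = 1024 * 1024

# Pool-wide figures from the sample: used 4M, saved 128M.
print(f"{bclone_ratio(4 * MiB, 128 * MiB):.2f}x")

# Per-vdev figures from the sample: used 1M, saved 32M.
print(f"{bclone_ratio(1 * MiB, 32 * MiB):.2f}x")
```

Both calls give the same ratio, since each populated vdev holds an equal share of the cloned blocks in this test.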

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

So that zdb (and others!) can get at the BRT on-disk structures.

Signed-off-by: Rob Norris <robn@despairlabs.com>
Contributor

@oromenahar oromenahar left a comment

Not super happy with -T as the option, but most other letters are already in use. But nice work. Maybe this also helps to track down the double write problem.

@0x5c

0x5c commented Nov 18, 2023

Is there a way for a user to associate the output of this with individual files?

@robn
Member Author

robn commented Nov 18, 2023

@oromenahar Not super happy with -T as option, but the most other letters are in use already

Yeah, I didn't love it either, but it seemed the best of what was left. We'll have to do something about that soon (I've some ideas and plans) but it's bigger than this PR of course.

@robn
Member Author

robn commented Nov 19, 2023

@0x5c Is there a way for a user to associate the output of this with individual files?

You'd need to do something like this: dump the BRT and extract the DVAs to a file, then use zdb to walk the filesystem and dump all the L0 blocks, extract the DVA from each, and look it up in the dump. If it's there, then that block is cloned and you can print the name of the file or something.

zdb -b is already doing something close to this; it does a full walk of the pool, counting blocks, and even does the BRT work so it can count clones. So there's an opportunity for more tooling right there if someone feels so inclined (unlikely to be me, at least not before next weekend). Otherwise, good old grep and friends will work, just slower.
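
The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: BRT dump rows are taken to look like the `DVA REFCNT` table shown earlier (`<vdev>:<hex offset> <refcount>`), and block listing lines are assumed to carry an `L0` level marker and a DVA of the form `<vdev>:<hex offset>:<hex asize>`. The names `parse_brt_dump` and `cloned_blocks` are hypothetical helpers, not part of zdb.

```python
import re

# Assumed BRT dump row: "<vdev>:<offset-hex>  <refcount>", as in the sample table.
BRT_ROW = re.compile(r"^\s*(\d+):([0-9a-f]+)\s+(\d+)\s*$")

# Assumed DVA form in block listings: "<vdev>:<offset-hex>:<asize-hex>".
BP_DVA = re.compile(r"\b(\d+):([0-9a-f]+):[0-9a-f]+\b")


def parse_brt_dump(text):
    """Map (vdev, offset) -> refcount from the table part of a BRT dump.

    Summary lines, the column header and "..." do not match and are skipped.
    """
    table = {}
    for line in text.splitlines():
        m = BRT_ROW.match(line)
        if m:
            table[(int(m.group(1)), int(m.group(2), 16))] = int(m.group(3))
    return table


def cloned_blocks(block_lines, brt):
    """Yield (line, refcount) for L0 block lines whose DVA appears in the BRT."""
    for line in block_lines:
        # Only level-0 (data) blocks are of interest here.
        if " L0 " not in f" {line} ":
            continue
        for m in BP_DVA.finditer(line):
            key = (int(m.group(1)), int(m.group(2), 16))
            if key in brt:
                yield line, brt[key]
                break
```

A caller would feed `parse_brt_dump` the saved BRT dump, then pass each object's block listing through `cloned_blocks` and report the file name for any object that yields a hit. Keeping the table in a dict makes each lookup constant time, which matters when walking every L0 block in a pool.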

Same idea as the dedup stats, but for block cloning.

Signed-off-by: Rob Norris <robn@despairlabs.com>
@0x5c

0x5c commented Nov 21, 2023

I successfully built zdb with this PR, and managed to get filenames from the dump using robn's instructions.
I also made a script to automate the work https://github.com/0x5c/zfs-bclonecheck

@robn
Member Author

robn commented Nov 21, 2023

@0x5c that's neat, nice work!

@FL140

FL140 commented Nov 22, 2023

@robn Two days ago I cloned a pool with feature@block_cloning enabled to a new disk using zfs send ... | zfs receive ..., not knowing of the existing block cloning bug. After comparing the data as an additional safety step the original pool was destroyed, so I can no longer get the properties there. The new pool shows

zpool get all XYZ.pool|grep bclone
XYZ.pool  bcloneused                     0                              -
XYZ.pool  bclonesaved                    0                              -
XYZ.pool  bcloneratio                    1.00x

Q1: Do I understand it right, that the send|receive has no impact on cloned data and the bclone* values are the same as on the original pool?
Q2: If so, does that mean there was for sure no corruption on that pool?
Thanks!

@oromenahar
Contributor

oromenahar commented Nov 22, 2023

@FL140 send and receive don't currently support replaying block cloning.
The data will be written several times. For example, if you have a 42G file cloned once, you need 84G on the receive destination. The clone, the original file, and the destination don't know that this is a clone.
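
As a rough illustration of the space arithmetic in this comment, here is a small Python sketch. It assumes every clone arrives on the destination as an independent full copy, and the function name `receive_space_needed` is made up for this example.

```python
def receive_space_needed(file_size: int, clone_count: int) -> int:
    """Bytes needed on the destination for a file plus its clones.

    Assumption: block clones are not replayed on receive, so the original
    and each clone are written out in full.
    """
    return file_size * (1 + clone_count)


GiB = 1024 ** 3

# The example from the comment: a 42G file cloned once.
print(receive_space_needed(42 * GiB, 1) // GiB, "GiB")
```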

@FL140

FL140 commented Nov 22, 2023

The clone and the original file and the destination don't know that this is a clone.

That is not good news, as detection of any kind can then only happen on the original source file system, but thanks for the clarification!

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Nov 27, 2023
behlendorf pushed a commit that referenced this pull request Nov 27, 2023
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Nov 28, 2023
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#15541
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Nov 28, 2023
So that zdb (and others!) can get at the BRT on-disk structures.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#15541
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Nov 28, 2023
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#15541
behlendorf pushed a commit that referenced this pull request Nov 28, 2023
So that zdb (and others!) can get at the BRT on-disk structures.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541
behlendorf pushed a commit that referenced this pull request Nov 28, 2023
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
So that zdb (and others!) can get at the BRT on-disk structures.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#15541
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#15541
Labels
Status: Accepted Ready to integrate (reviewed, tested)
6 participants