Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--is-reflink false positive: rm_util_link_type can miss holes before extents #611

Closed
cebtenzzre opened this issue Feb 22, 2023 · 0 comments · May be fixed by #531
Closed

--is-reflink false positive: rm_util_link_type can miss holes before extents #611

cebtenzzre opened this issue Feb 22, 2023 · 0 comments · May be fixed by #531
Labels

Comments

@cebtenzzre
Copy link
Contributor

cebtenzzre commented Feb 22, 2023

Description

rm_offset_get_from_fd quietly skips over holes until it finds an extent, so unless there are other differences between the files rmlint will not notice them.

Steps to reproduce

Build rmlint from the master branch with _RM_OFFSET_DEBUG=1 and fixes for #527, #528, and #529 applied.
Then run these commands:

$ dd if=/dev/urandom of=foo bs=200K oflag=sync count=2
$ truncate -s 400K bar
$ xfs_io -f -c 'copy_range -l 196K -d 4K foo' -c 'fsync' bar
$ xfs_io -f -c 'copy_range -l 200K -s 200K -d 200K foo' -c 'fsync' bar
$ filefrag -vb1 foo bar
Filesystem type is: 9123683e
File size of foo is 409600 (409600 blocks of 1 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..  204799: 1387178913792..1387179118591: 204800:             shared
   1:   204800..  409599: 1399459041280..1399459246079: 204800: 1387179118592: last,shared,eof
foo: 2 extents found
File size of bar is 409600 (409600 blocks of 1 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:     4096..  204799: 1387178913792..1387179114495: 200704:       4096: shared
   1:   204800..  409599: 1399459041280..1399459246079: 204800: 1387179114496: last,shared,eof
bar: 2 extents found
$ cmp foo bar
foo bar differ: byte 1, line 1
$ rmlint --is-reflink -vv foo bar
DEBUG: Testing if /home/cebtenzzre/bar is clone of /home/cebtenzzre/foo
DEBUG: Checking link type for /home/cebtenzzre/bar vs /home/cebtenzzre/foo
DEBUG: rm_offset_get_fiemap: fd=5, n_extents=1, file_offset=0
DEBUG: rm_offset_get_fiemap: fd=5, n_extents=1, file_offset=204800
DEBUG: rm_offset_get_fiemap: fd=6, n_extents=1, file_offset=0
DEBUG: rm_offset_get_fiemap: fd=6, n_extents=1, file_offset=204800
DEBUG: Offsets match at fd1=5, fd2=6, logical=0, physical=1387178913792
DEBUG: rm_offset_get_fiemap: fd=5, n_extents=1, file_offset=204800
DEBUG: rm_offset_get_fiemap: fd=6, n_extents=1, file_offset=204800
DEBUG: Offsets match at fd1=5, fd2=6, logical=204800, physical=1399459041280
DEBUG: Files are clones (share same data)
DEBUG: Offsets match
$ echo $?
0

Actual result

--is-reflink exits with status 0 (Files are reflinked) despite cmp clearly showing that the files have different content. Note that according to filefrag the physical offsets match and the extents logically end at the same place. Since those are the only things rmlint checks for, it thinks the files are reflinked.

Expected result

I would expect rmlint to report that the files are not reflinked - they have different content, after all.

$ rmlint --is-reflink -vv foo bar
DEBUG: Testing if /home/cebtenzzre/bar is clone of /home/cebtenzzre/foo
DEBUG: Checking link type for /home/cebtenzzre/bar vs /home/cebtenzzre/foo
DEBUG: rm_offset_get_fiemap: fd=5, n_extents=1, file_offset=0
DEBUG: rm_offset_get_fiemap: fd=5, n_extents=1, file_offset=204800
DEBUG: rm_offset_get_fiemap: fd=6, n_extents=1, file_offset=0
DEBUG: rm_offset_get_fiemap: fd=6, n_extents=1, file_offset=204800
DEBUG: Logical offsets differ after 0 bytes: 4096<> 0
DEBUG: Offsets differ
$ echo $?
1
cebtenzzre added a commit to cebtenzzre/rmlint that referenced this issue Feb 24, 2023
Holes are handled by ignoring them when they are after an extent and
skipping them when they are before one. Since we were only comparing the
current file position, a file with a hole could match a file without one
as long as the physical offsets match and the extents logically end at
the same place. This led to --is-reflink false-positives.

Fix this by comparing the logical offset of each extent.

Fixes sahib#611
@cebtenzzre cebtenzzre added the bug label Feb 24, 2023
@cebtenzzre cebtenzzre linked a pull request Feb 24, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant