btrfs-progs: convert: fix inline extent size for symbol link #884
Closed
[BUG]
For btrfs converted from ext* or reiserfs, the inline data extent of a symbolic link is always one byte larger than the inode size:

    item 10 key (267 INODE_ITEM 0) itemoff 15543 itemsize 160
        generation 1 transid 1 size 4 nbytes 5
        block group 0 mode 120777 links 1 uid 0 gid 0 rdev 0
        sequence 0 flags 0x0(none)
    item 11 key (267 INODE_REF 256) itemoff 15529 itemsize 14
        index 3 namelen 4 name: path
    item 12 key (267 EXTENT_DATA 0) itemoff 15503 itemsize 26
        generation 4 type 0 (inline)
        inline extent data size 5 ram_bytes 5 compression 0 (none)

[CAUSE]
The symbolic link creation path of each source fs creates the inline data extent with a trailing NUL ('\0').

This is different from what the btrfs kernel module does:

    item 4 key (257 INODE_ITEM 0) itemoff 15883 itemsize 160
        generation 9 transid 9 size 4 nbytes 4
        block group 0 mode 120777 links 1 uid 0 gid 0 rdev 0
        sequence 0 flags 0x0(none)
    item 5 key (257 INODE_REF 256) itemoff 15869 itemsize 14
        index 2 namelen 4 name: path
    item 6 key (257 EXTENT_DATA 0) itemoff 15844 itemsize 25
        generation 9 type 0 (inline)
        inline extent data size 4 ram_bytes 4 compression 0 (none)

[FIX]
Thankfully this is not a big deal: the kernel reads the content properly and uses the inode size to determine the link target.

Just align the btrfs-progs convert behavior with the kernel's.

Signed-off-by: Qu Wenruo <wqu@suse.com>
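The off-by-one in the two dumps above can be reproduced with a small sketch. The helper names and the `SKETCH_INLINE_HEADER_LEN` constant are hypothetical and are not taken from btrfs-progs. The value 21 is inferred from the dumps themselves (itemsize 26 with 5 data bytes, itemsize 25 with 4 data bytes), so treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption inferred from the dumps: itemsize minus inline data size is 21. */
#define SKETCH_INLINE_HEADER_LEN 21

/* Hypothetical: bytes stored by the old convert path (trailing NUL included). */
static size_t sketch_convert_inline_len(const char *target)
{
	assert(target != NULL);
	return strlen(target) + 1;
}

/* Hypothetical: bytes stored by the kernel (equal to the inode size). */
static size_t sketch_kernel_inline_len(const char *target)
{
	assert(target != NULL);
	return strlen(target);
}

/* Hypothetical: EXTENT_DATA item size for a given inline data length. */
static size_t sketch_inline_item_size(size_t data_len)
{
	return SKETCH_INLINE_HEADER_LEN + data_len;
}
```

For the 4-byte target `path`, the convert-style helper yields 5 data bytes (item size 26) and the kernel-style helper yields 4 (item size 25), matching the `nbytes` and `itemsize` values in the dumps.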
Closing because this is not a simple, harmless fix: the +1 error is already causing failures in the btrfs/012 test case. This needs to be a multi-patch series that not only fixes the +1 error but also adds new sanity checks for it, along with a new test case.
kdave pushed a commit that referenced this pull request on Sep 12, 2024
[BUG]
Test case btrfs/012 sometimes fails randomly, with a failure to read a symlink:

    QA output created by 012
    Checking converted btrfs against the original one:
    -OK
    +readlink: Structure needs cleaning
    Checking saved ext2 image against the original one:
    OK

Furthermore, this triggers a kernel error message:

    BTRFS critical (device dm-2): regular/prealloc extent found for non-regular inode 133081

[CAUSE]
For that specific inode 133081, the tree dump looks like this:

    item 127 key (133081 INODE_ITEM 0) itemoff 40984 itemsize 160
        generation 1 transid 1 size 4095 nbytes 4096
        block group 0 mode 120777 links 1 uid 0 gid 0 rdev 0
        sequence 0 flags 0x0(none)
    item 128 key (133081 INODE_REF 133080) itemoff 40972 itemsize 12
        index 2 namelen 2 name: l3
    item 129 key (133081 EXTENT_DATA 0) itemoff 40919 itemsize 53
        generation 4 type 1 (regular)
        extent data disk byte 2147483648 nr 38080512
        extent data offset 37974016 nr 4096 ram 38080512
        extent compression 0 (none)

Note that the symlink inode size is 4095, the maximum size (PATH_MAX minus the terminating NUL).

But nbytes is 4096, exactly matching the sector size of the btrfs. This results in the creation of a regular extent, but btrfs does not accept a symlink with a regular/preallocated extent, so the kernel rejects the read and the readlink call fails.

The root cause is in the convert code, where for symlinks we always create a data extent of size + 1, causing the above problem.

I guess the original code was meant to handle the terminating NUL, but btrfs never needs to store the terminating NUL for inline extents or file names.

Thus this pitfall in btrfs-convert leads to the above invalid data extent and fails the test case.

[FIX]
- Fix the ext2 and reiserfs symbolic link creation code
  Remove the terminating NUL.

- Add extra checks for the size of a symbolic link
  Btrfs has extra limits on the size of a symbolic link, as btrfs must store symbolic link targets as inline extents.

  This means that for btrfs with a 4K nodesize, the size limit is smaller than the usual PATH_MAX - 1 (only around 4000 bytes instead of 4095).

  So for certain nodesizes, some filesystems cannot be converted to btrfs (this should be rare, because the default nodesize is already 16K).

- Split the symbolic link and inline data extent size checks
  For symbolic links the real limit is PATH_MAX - 1 (removing the terminating NUL), but for inline data extents the limit is sectorsize - 1, which can be different from 4096 - 1 (e.g. 64K sector size).

Pull-request: #884
Signed-off-by: Qu Wenruo <wqu@suse.com>
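The split between the two limits described in the commit message can be sketched as follows. This is an illustration only, not the code from the commit: the function names are hypothetical, `SKETCH_PATH_MAX` hard-codes the Linux value of PATH_MAX, and the nodesize-dependent inline limit is passed in by the caller rather than computed, since the message only gives it approximately ("around 4000 bytes" for a 4K nodesize).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Linux PATH_MAX, hard-coded so the sketch does not depend on platform headers. */
#define SKETCH_PATH_MAX 4096

/*
 * Hypothetical check for a symlink target (length without the NUL):
 * it must fit in PATH_MAX - 1 and in the inline limit that the chosen
 * nodesize allows (supplied by the caller, an assumption of this sketch).
 */
static bool sketch_symlink_size_ok(uint64_t target_len,
				   uint64_t max_inline_for_nodesize)
{
	if (target_len > SKETCH_PATH_MAX - 1)
		return false;
	return target_len <= max_inline_for_nodesize;
}

/* Hypothetical check for generic inline data: at most sectorsize - 1 bytes. */
static bool sketch_inline_size_ok(uint64_t data_len, uint32_t sectorsize)
{
	assert(sectorsize > 0);
	return data_len <= (uint64_t)sectorsize - 1;
}
```

With a 4096-byte sector size, a 4095-byte target passes the inline check, while 4095 + 1 bytes (the old convert behavior) does not. That is the case that ended up as a regular extent for inode 133081.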
kdave pushed a commit that referenced this pull request on Sep 17, 2024, with the same commit message as above.