Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linux/zvol_os.c: Fix max_discard_sectors limit for 6.8+ kernel #16462

Closed
wants to merge 2 commits into from

Conversation

ixhamza
Copy link
Member

@ixhamza ixhamza commented Aug 20, 2024

Motivation and Context

In kernels 6.8 and later, the zvol block device is allocated with qlimits passed during initialization. However, the zvol driver does not set max_hw_discard_sectors, which is necessary to properly initialize max_discard_sectors. This causes the zvol_misc_trim test to fail on 6.8+ kernels when invoking the blkdiscard command. Setting max_hw_discard_sectors in the HAVE_BLK_ALLOC_DISK_2ARG case resolve the issue.

Description

How Has This Been Tested?

  • Locally tested blkdiscard command on 6.8 kernels and onwards.
  • Buildbot will test the zvol_misc_trim test on Fedora using the 6.10 kernel.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn
Copy link
Member

robn commented Aug 20, 2024

Would it not be better then to create and set zql_max_hw_discard_sectors, so the kernel gets what it expects?

@@ -1266,6 +1266,8 @@ zvol_alloc_non_blk_mq(struct zvol_state_os *zso, zvol_queue_limits_t *limits)
zso->zvo_queue = zso->zvo_disk->queue;

#ifndef HAVE_BLKDEV_QUEUE_LIMITS_FEATURES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand why I didn't just call zvol_queue_limits_apply(limits, zso->zvo_queue) here, since that was always the point of convert/apply: everything in one two places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why set it in two places if the limits should already be set during initialization?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that you would call _convert if you had a creation function that could take limits, and regardless, calling _apply afterward in all cases would get everything left over.

(see the comment: I assert in the last paragraph that its "called on all versions").

Oh yeah, this is all in fact redundant. zvol_alloc_non_blk_mq() calls zvol_queue_limits_apply() at the very end for all cases, so this extra stuff in the middle can go.

diff --git a/module/os/linux/zfs/zvol_os.c b/module/os/linux/zfs/zvol_os.c
index e04f64e23..122fda3ea 100644
--- module/os/linux/zfs/zvol_os.c
+++ module/os/linux/zfs/zvol_os.c
@@ -1251,7 +1251,6 @@ zvol_alloc_non_blk_mq(struct zvol_state_os *zso, zvol_queue_limits_t *limits)
 
 	zso->zvo_disk->minors = ZVOL_MINORS;
 	zso->zvo_queue = zso->zvo_disk->queue;
-	zvol_queue_limits_apply(limits, zso->zvo_queue);
 #elif defined(HAVE_BLK_ALLOC_DISK_2ARG)
 	struct queue_limits qlimits;
 	zvol_queue_limits_convert(limits, &qlimits);
@@ -1264,11 +1263,6 @@ zvol_alloc_non_blk_mq(struct zvol_state_os *zso, zvol_queue_limits_t *limits)
 	zso->zvo_disk = disk;
 	zso->zvo_disk->minors = ZVOL_MINORS;
 	zso->zvo_queue = zso->zvo_disk->queue;
-
-#ifndef HAVE_BLKDEV_QUEUE_LIMITS_FEATURES
-	blk_queue_set_write_cache(zso->zvo_queue, B_TRUE);
-#endif
-
 #else
 	zso->zvo_queue = blk_alloc_queue(NUMA_NO_NODE);
 	if (zso->zvo_queue == NULL)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robn - Supporting different kernel versions has indeed made things quite complex. Hats off to you for managing it so efficiently. I will clean up as per your suggestion along with fixing the original issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry! I was just muttering out loud about my mistake; not actually suggesting you fix it!

(but if you want to anyway, I will be delighted and enthusiastically approve the PR 🙌)

@ixhamza
Copy link
Member Author

ixhamza commented Aug 20, 2024

Would it not be better then to create and set zql_max_hw_discard_sectors, so the kernel gets what it expects?

@robn - That's a good point, and I was thinking the same, but I refrained from it since max_hw_discard_sectors was introduced in version 4.3. However, while testing on the 6.11 kernel, I noticed that HAVE_BLKDEV_QUEUE_LIMITS_FEATURES is not actually defined. So, I agree that it needs to be updated anyway.

@ixhamza ixhamza marked this pull request as draft August 20, 2024 10:36
In kernels 6.8 and later, the zvol block device is allocated with
qlimits passed during initialization. However, the zvol driver does not
set `max_hw_discard_sectors`, which is necessary to properly
initialize `max_discard_sectors`. This causes the `zvol_misc_trim` test
to fail on 6.8+ kernels when invoking the `blkdiscard` command. Setting
`max_hw_discard_sectors` in the `HAVE_BLK_ALLOC_DISK_2ARG` case resolve
the issue.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Rob Noris suggested that we could clean up redundant limits for the case
of non-blk mq scenario.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
@ixhamza ixhamza marked this pull request as ready for review August 20, 2024 18:51
@ixhamza
Copy link
Member Author

ixhamza commented Aug 20, 2024

@behlendorf - both Fedora tests are passing now, other test failures look unrelated.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks guys.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 20, 2024
@@ -1213,6 +1213,7 @@ zvol_queue_limits_convert(zvol_queue_limits_t *limits,
qlimits->io_opt = limits->zql_io_opt;
qlimits->physical_block_size = limits->zql_physical_block_size;
qlimits->max_discard_sectors = limits->zql_max_discard_sectors;
qlimits->max_hw_discard_sectors = limits->zql_max_discard_sectors;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's not what I meant but I like it better, nice!

Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

behlendorf pushed a commit that referenced this pull request Aug 21, 2024
Rob Noris suggested that we could clean up redundant limits for the case
of non-blk mq scenario.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #16462
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
In kernels 6.8 and later, the zvol block device is allocated with
qlimits passed during initialization. However, the zvol driver does not
set `max_hw_discard_sectors`, which is necessary to properly
initialize `max_discard_sectors`. This causes the `zvol_misc_trim` test
to fail on 6.8+ kernels when invoking the `blkdiscard` command. Setting
`max_hw_discard_sectors` in the `HAVE_BLK_ALLOC_DISK_2ARG` case resolve
the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#16462
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
Rob Noris suggested that we could clean up redundant limits for the case
of non-blk mq scenario.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#16462
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants