-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicit block device plugging when submitting multiple BIOs. #5181
Conversation
Forgot to mention: /sys/block/*/queue/max_sectors_kb needs to be large enough to enable large IO requests. |
I don't think EL (2.6.36) kernels have the explicit block plug/unplug functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a criticism of this patch, but when looking at how this integrated with the bio generation, I see that __vdev_disk_physio() is calling vdev_disk_dio_alloc(bio_count = 16), but there isn't a clear reason for that magic number "16". Is that the maximum number of bios based on 16MB blocksize and 1MB bio size, in which case it should be biocount = (SPA_MAXBLOCKSIZE / (PAGE_SIZE * BIO_MAX_PAGES)), or is there some other reason for this value?
It looks like the queue plugging interface was added in commit v2.6.38-rc6-12-g73c1010, which puts it just after EL6 if one goes by the kernel version. However, it wouldn't be surprising if RHEL backported this patch to their kernel, but that doesn't seem to be the case: zfs-0.7.0/module/zfs/vdev_disk.c: In function '__vdev_disk_physio': |
Will add a configure test. |
@adilger I think the bio_count can be set to zio size divided by max bio size, or in the case of abd simply set to the # of frags divided by BIO_MAX_PAGES. Or we'd even consider the drive's max_sectors_kb. It can be a lot smarter than a simple 16. |
Without plugging, the default 'noop' scheduler will not merge the BIOs which are part of a large ZIO. Signed-off-by: Isaac Huang <he.huang@intel.com>
0fa2c70
to
de80391
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention: /sys/block/*/queue/max_sectors_kb needs to be large enough to enable large IO requests.
It would be very helpful to add this comment to commit message. In this case I should be able to do that during the merge so don't worry about refreshing the patch.
@adilger thanks for pointing out this interface! As for the bio_count = 16
that's still a remnant from the original implementation which never go cleaned up. It was selected to be large enough that for the common expected case but small enough to be an easy allocation. Now that we have large blocks it would be great is somebody revisited this and cleaned it up (in a different PR).
Without plugging, the default 'noop' scheduler will not merge the BIOs which are part of a large ZIO. Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Isaac Huang <he.huang@intel.com> Closes openzfs#5181
Without plugging, the default 'noop' scheduler will not merge the BIOs which are part of a large ZIO. Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Isaac Huang <he.huang@intel.com> Closes openzfs#5181 Requires-builders: style
Without plugging, the default 'noop' scheduler will not merge the BIOs which are part of a large ZIO. Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Isaac Huang <he.huang@intel.com> Closes #5181
Without plugging, the default 'noop' scheduler will not merge the BIOs which are part of a large ZIO. The Linux BIO currently is limited to 1M, so this would only affect ZFS blocks larger than 1M.
The graph below shows average write request size during copying an 1G ubuntu image to a single drive zpool with 16M recordsize:
The requests reached at about 1M. The next graph shows the same thing with this patch applied:
The requests reached at about 5M.