Closed
Labels
Type: Defect (incorrect behavior, e.g. crash, hang)
Description
System information
Type | Version/Name |
---|---|
Distribution Name | |
Distribution Version | |
Kernel Version | |
Architecture | |
OpenZFS Version | 69b65dd (master) |
Describe the problem you're observing
A user noticed that blk-mq ZVOL writes were always asynchronous and never went to the SLOG, even after a flush. This did not happen when using the BIO ZVOL write codepath (which is the default). Further investigation showed that the ZVOL blk-mq layer needs to call `blk_queue_write_cache()` to tell the kernel that the ZVOL has a "volatile write cache" and can support I/O operations like `REQ_FUA`:
Implementation details for request_fn based block drivers
--------------------------------------------------------------
For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:
    blk_queue_write_cache(sdkp->disk->queue, true, false);
and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:
    blk_queue_write_cache(sdkp->disk->queue, true, true);
and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
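The three cases in the quoted documentation can be modelled in a few lines of userspace C. This is only a sketch for reasoning about which flags survive the block layer; it is not kernel or OpenZFS code. The `MODEL_*` constants, `struct model_queue`, `struct model_dispatch`, and `model_submit_write()` are hypothetical names invented for the illustration, and the model covers only writes that carry a payload.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical flag values for this model only. The kernel's real
 * REQ_PREFLUSH and REQ_FUA constants are defined elsewhere and differ.
 */
enum {
	MODEL_REQ_PREFLUSH = 1u << 0,
	MODEL_REQ_FUA = 1u << 1,
};

/* What a driver declared via blk_queue_write_cache(q, wc, fua). */
struct model_queue {
	bool wc;	/* device has a volatile write cache */
	bool fua;	/* device handles the FUA bit natively */
};

/* What the driver ends up seeing for one submitted write. */
struct model_dispatch {
	bool flush_before;	/* empty flush issued ahead of the write */
	unsigned write_flags;	/* flags still set on the write itself */
	bool flush_after;	/* empty flush issued after the write */
};

static struct model_dispatch
model_submit_write(struct model_queue q, unsigned flags)
{
	struct model_dispatch d = { false, 0, false };

	/* The model only understands the two flags above. */
	assert((flags & ~(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA)) == 0);

	/* No volatile cache declared: both bits are stripped. */
	if (!q.wc)
		return (d);

	/* PREFLUSH with a payload becomes an empty flush, then the write. */
	if (flags & MODEL_REQ_PREFLUSH)
		d.flush_before = true;

	/* FUA is passed through if supported, else emulated by a flush. */
	if (flags & MODEL_REQ_FUA) {
		if (q.fua)
			d.write_flags |= MODEL_REQ_FUA;
		else
			d.flush_after = true;
	}
	return (d);
}
```

In this model, a queue that never declared a write cache (`wc = false`) hands the driver a plain write with no flush and no FUA bit, which matches the symptom reported here: the driver has nothing to tell it that the write should be synchronous.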
We may also need to properly support `REQ_PREFLUSH` in the blk-mq codepath.
Describe how to reproduce the problem
Set the zfs module parameter `zvol_use_blk_mq=1`, import a pool, and attempt a synchronous write to a zvol.
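One way to issue the synchronous write from userspace is a small C helper like the one below. It is a minimal sketch, not the reporter's actual test: `sync_write_once()` and `SYNC_WRITE_LEN` are hypothetical names, and the path passed in (for example a zvol device node) is whatever target you choose. It overwrites the first 4 KiB of that target, so only point it at a scratch volume.

```c
#define	_POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define	SYNC_WRITE_LEN 4096

/*
 * Write SYNC_WRITE_LEN bytes at offset 0 of an existing file or device
 * opened with O_SYNC, then fsync() it so a flush is requested as well.
 * Returns 0 on success and -1 on any failure.
 */
static int
sync_write_once(const char *path)
{
	char buf[SYNC_WRITE_LEN];
	ssize_t n;
	int fd, rc = -1;

	assert(path != NULL);
	memset(buf, 0xA5, sizeof (buf));

	/* No O_CREAT: the target must already exist. */
	fd = open(path, O_WRONLY | O_SYNC);
	if (fd < 0)
		return (-1);

	n = write(fd, buf, sizeof (buf));
	if (n == (ssize_t)sizeof (buf) && fsync(fd) == 0)
		rc = 0;

	if (close(fd) != 0)
		rc = -1;
	return (rc);
}
```

Running the helper against a zvol while watching per-vdev activity (for example with `zpool iostat -v`) should show whether the write reaches the log device. Which request flags the kernel actually generates for this write depends on the kernel version and I/O path, so treat it as a starting point.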
Include any warning/errors/backtraces from the system logs