
ZVOL: need to set blk_queue_write_cache() for blk-mq #17698

@tonyhutter

System information

Type Version/Name
Distribution Name
Distribution Version
Kernel Version
Architecture
OpenZFS Version 69b65dd (master)

Describe the problem you're observing

A user noticed that blk-mq ZVOL writes were always async and were not going to the SLOG, even after a flush. This did not happen with the BIO ZVOL write codepath (which is the default). Further investigation showed that the ZVOL blk-mq layer needs to call blk_queue_write_cache() to tell the kernel that the ZVOL has a "volatile write cache" and can support IO operations like REQ_FUA:

Implementation details for request_fn based block drivers
--------------------------------------------------------------

For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.  For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:

	blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer.  For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:

	blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn.  If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.

https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
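
To make the rules in the quoted documentation easier to follow, here is a small userspace model of them. It is only a sketch: the names (model_queue, model_plan, model_submit_write, MODEL_PREFLUSH, MODEL_FUA) and the flag values are hypothetical and are not kernel or OpenZFS identifiers. The two booleans in model_queue stand in for the two boolean arguments of blk_queue_write_cache().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits for this model only; NOT the kernel's real values. */
enum {
	MODEL_PREFLUSH = 1u << 0,
	MODEL_FUA      = 1u << 1,
};

/* Stand-in for what a driver advertised via blk_queue_write_cache(). */
struct model_queue {
	bool wc;   /* volatile write cache advertised */
	bool fua;  /* native FUA support advertised */
};

/* What the driver ends up seeing for one submitted request. */
struct model_plan {
	bool flush_before;   /* empty flush issued ahead of the write */
	bool write_issued;   /* the payload write itself */
	bool write_has_fua;  /* FUA bit passed through on the write */
	bool flush_after;    /* empty flush issued after the write */
};

static struct model_plan
model_submit_write(struct model_queue q, unsigned int flags, bool has_payload)
{
	struct model_plan p = { false, false, false, false };

	p.write_issued = has_payload;

	/*
	 * No write cache advertised: empty flushes complete before reaching
	 * the driver, and PREFLUSH/FUA are stripped from payload requests.
	 * (This model also ignores q.fua in that case.)
	 */
	if (!q.wc)
		return (p);

	/* PREFLUSH becomes an empty flush ahead of any payload. */
	if (flags & MODEL_PREFLUSH)
		p.flush_before = true;

	/* FUA is passed through if native, else emulated by a later flush. */
	if (has_payload && (flags & MODEL_FUA)) {
		if (q.fua)
			p.write_has_fua = true;
		else
			p.flush_after = true;
	}
	return (p);
}

/* Usage: a queue that advertises a write cache but no native FUA. */
static void
model_example(void)
{
	struct model_queue q = { true, false };
	struct model_plan p = model_submit_write(q, MODEL_FUA, true);

	assert(p.write_issued && p.flush_after && !p.write_has_fua);
}
```

In this model, a queue that never advertised a write cache (wc = false) delivers every write to the driver with no flush and no FUA, which matches the symptom described above: the driver has no way to tell that the write was meant to be synchronous.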

We may also need to properly support REQ_PREFLUSH in the blk-mq codepath.

Describe how to reproduce the problem

Set the zfs module parameter zvol_use_blk_mq=1, import a pool, and attempt a synchronous write to a zvol.
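
One possible way to issue the synchronous write is a small C helper like the one below. This is a sketch, not the reproducer used in the original report: the function name sync_write is hypothetical, and the device path is whatever your zvol is exposed as. It opens the target with O_DSYNC and also calls fsync(), both of which are expected to reach the block layer as flush and/or FUA requests.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to the start of path synchronously.
 * Returns 0 on success or a negative errno value on failure.
 */
static int
sync_write(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	size_t left = len;
	int fd, err;

	assert(path != NULL && buf != NULL);

	/* O_DSYNC: each write() returns only after the data is stable. */
	fd = open(path, O_WRONLY | O_DSYNC);
	if (fd < 0)
		return (-errno);

	while (left > 0) {
		ssize_t n = write(fd, p, left);
		if (n < 0) {
			if (errno == EINTR)
				continue;  /* interrupted, retry */
			err = errno;
			close(fd);
			return (-err);
		}
		p += n;
		left -= (size_t)n;
	}

	/* Explicit flush as well, so the flush path is also exercised. */
	if (fsync(fd) < 0) {
		err = errno;
		close(fd);
		return (-err);
	}
	if (close(fd) < 0)
		return (-errno);
	return (0);
}
```

Calling sync_write() on the zvol's device node with a small buffer, while watching the pool's log device activity, should show whether the write went to the SLOG.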

Include any warning/errors/backtraces from the system logs

Labels

Type: Defect: Incorrect behavior (e.g. crash, hang)