This is the userspace daemon part(ublksrv) of the ublk framework, the other
part is ublk driver
[#userspace]_ which supports multiple queue.
The two parts communicate by io_uring's IORING_OP_URING_CMD with one per-queue shared cmd buffer for storing io command, and the buffer is read only for ublksrv, each io command can be indexed by io request tag directly, and the command is written by ublk driver, and read by ublksrv after getting notification from ublk driver.
For example, when one READ io request is submitted to ublk block driver, ublk driver stores the io command into cmd buffer first, then completes one IORING_OP_URING_CMD for notifying ublksrv, and the URING_CMD is issued to ublk driver beforehand by ublksrv for getting notification of any new io request, and each URING_CMD is associated with one io request by tag, so depth for URING_CMD is same with queue depth of ublk block device.
After ublksrv gets the io command, it translates and handles the ublk io request, such as, for the ublk-loop target, ublksrv translates the request into same request on another file or disk, like the kernel loop block driver. In ublksrv's implementation, the io is still handled by io_uring, and share same ring with IORING_OP_URING_CMD command. When the target io request is done, the same IORING_OP_URING_CMD is issued to ublk driver for both committing io request result and getting future notification of new io request.
So far, the ublk driver needs to copy io request pages into userspace buffer (pages) first for write before notifying the request to ublksrv, and copy userspace buffer(pages) to the io request pages after ublksrv handles READ. Also looks linux-mm can't support zero copy for this case yet. [2]
More ublk targets will be added with this framework in future even though only ublk-loop and ublk-null are implemented now.
libublksrv is also generated, and it helps to integrate ublk into existed project. One example of demo_null is provided for how to make a ublk device over libublksrv.
autoreconf -i
./configure #pkg-config and libtool is usually needed
make
note: './configure' requires liburing 2.2 package installed, if liburing 2.2
isn't available in your distribution, please configure via the following
command, or refer to build_with_liburing_src
[6]
PKG_CONFIG_PATH=${LIBURING_DIR} \
./configure \
CFLAGS="-I${LIBURING_DIR}/src/include" \
CXXFLAGS="-I${LIBURING_DIR}/src/include" \
LDFLAGS="-L${LIBURING_DIR}/src"
and LIBURING_DIR points to directory of liburing source code, and liburing needs to be built before running above commands. Also IORING_SETUP_SQE128 has to be supported in the liburing source.
c++20 is required for building ublk utility, but libublksrv and demo_null.c & demo_event.c can be built independently:
build libublksrv
make -C lib/
build demo_null && demo_event
make -C lib/ make demo_null demo_event
- ublk help
- ublk add -t null
- ublk add -t loop -f /dev/vdb
or
- ublk add -t loop -f 1.img
- ublk add -t qcow2 -f test.qcow2
note: qcow2 support is experimental, see details in qcow2 status [4] and readme [5]
- ublk del -n 0 #remove /dev/ublkb0
- ublk del -a #remove all ublk devices
- ublk list
- ublk list -v #with all device info dumped
Typical use case is container [7] in which user can manage its own devices not exposed to other containers.
At default, controlling ublk device needs privileged user, since /dev/ublk-control is permitted for administrator only, and this is called privileged mode.
For unprivilege mode, /dev/ublk-control needs to be allowed for all users, so the following udev rule need to be added:
KERNEL=="ublk-control", MODE="0666", OPTIONS+="static_node=ublk-control"
Also when new ublk device is added, we need ublk to change device ownership to the device's real owner, so the following rules are needed:
KERNEL=="ublkc*",RUN+="ublk_chown.sh %k" KERNEL=="ublkb*",RUN+="ublk_chown.sh %k"
ublk_chown.sh
can be found under utils/
too.
utils/ublk_dev.rules
includes the above rules.
With the above two administrator changes, unprivileged user can create/delete/list/use ublk device, also anyone which isn't permitted can't access and control this ublk devices(ublkc*/ublkb*)
Unprivileged user can pass '--unprevileged' to 'ublk add' for creating unprivileged ublk device, then the created ublk device is only available for the owner and administrator.
install the following udev rules in host machine:
ACTION=="add",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'add' '%M' '%m'" ACTION=="remove",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'remove' '%M' '%m'"
ublk_chown_docker.sh
can be found under utils/
.
- run one container and install ublk & its dependency packages
docker run \
--name fedora \
--hostname=ublk-docker.example.com \
--device=/dev/ublk-control \
--device-cgroup-rule='a *:* rmw' \
--tmpfs /tmp \
--tmpfs /run \
--volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
-ti \
fedora:38
#run the following commands inside the above container
dnf install -y git libtool automake autoconf g++ liburing-devel
git clone https://github.com/ming1/ubdsrv.git
cd ubdsrv
autoreconf -i&& ./configure&& make -j 4&& make install
- add/delete ublk device inside container by unprivileged user
docker exec -u 1001:1001 -ti fedora /bin/bash
#run the following commands inside the above container
bash-5.2$ ublk add -t null --unprivileged
dev id 0: nr_hw_queues 1 queue_depth 128 block size 512 dev_capacity 524288000
max rq size 524288 daemon pid 178 flags 0x62 state LIVE
ublkc: 237:0 ublkb: 259:1 owner: 1001:1001
queue 0: tid 179 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )
target {"dev_size":268435456000,"name":"null","type":0}
bash-5.2$ ls -l /dev/ublk*
crw-rw-rw-. 1 root root 10, 123 May 1 04:35 /dev/ublk-control
brwx------. 1 1001 1001 259, 1 May 1 04:36 /dev/ublkb0
crwx------. 1 1001 1001 237, 0 May 1 04:36 /dev/ublkc0
bash-5.2$ ublk del -n 0
bash-5.2$ ls -l /dev/ublk*
crw-rw-rw-. 1 root root 10, 123 May 1 04:35 /dev/ublk-control
- example of ublk in docker:
tests/debug/ublk_docker
make test T=all
make test T=null
make test T=loop
make test T=generic
make test T=generic/001
make test T=null/001
make test T=loop/001 ...
make test T=generic:loop/001:null
ublksrv is running as one daemon process, so most of debug messages won't be shown in terminal. If any issue is observed, please collect log via command of "journalctl | grep ublksrvd"
./configure --enable-debug
can build a debug version of ublk which
dumps lots of runtime debug messages, and can't be used in production
environment, should be for debug purpose only. For debug version of
ublksrv, 'ublk add --debug_mask=0x{MASK}' can control which kind of
debug log dumped, see UBLK_DBG_*
defined in include/ublksrv_utils.h
for each kind of debug log.
API is documented in include/ublksrv.h, and doxygen doc can be generated by running 'make doxygen_doc', the generated html docs are in doc/html.
Any kind of contribution is welcome!
Development is done over github.
Move libublksrv out of ublksrv project, and make it as one standalone repo and name it as libublk.
It is planned to do it when ublk driver UAPI changes(feature addition) is slow down.
nlohmann(include/nlohmann/json.hpp) is from [3], which is covered by MIT license.
The library functions (all code in lib/ directory and include/ublksrv.h) are covered by dual licensed LGPL and MIT, see COPYING.LGPL and LICENSE.
qcow2 target code is covered by GPL-2.0, see COPYING.
All other source code are covered by dual licensed GPL and MIT, see COPYING and LICENSE.
[1] | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/block/ublk_drv.c?h=v6.0 |
[2] | https://lore.kernel.org/all/20220318095531.15479-1-xiaoguang.wang@linux.alibaba.com/ |
[3] | https://github.com/nlohmann/json |
[4] | https://github.com/ming1/ubdsrv/blob/master/qcow2/STATUS.rst |
[5] | https://github.com/ming1/ubdsrv/blob/master/qcow2/README.rst |
[6] | https://github.com/ming1/ubdsrv/blob/master/build_with_liburing_src |
[7] | https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/ |