This is an example program that demonstrates how to use io_uring to drive Linux fuse. The only thing it does is forward a single file to the fuse mount point.
Using io_uring reduces the number of system calls the fuse program makes which should speed it up.
Fuse is sometimes used to export a single file to be used as volume, as well. E.g. vdfuse, s3backer, UrBackup (vhd/vhdz mounting). Recent improvements in loop (async direct-io), fuse and Linux memory management (PR_SET_IO_FLUSHER
) have made this really performant.
With large iodepth it gets good results (backing file on tmpfs):
fio ~/fio/ssd-test.fio
seq-read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1024
rand-read: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1024
seq-write: (g=2): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1024
rand-write: (g=3): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1024
fio-3.21
Starting 4 processes
Jobs: 1 (f=1): [_(3),w(1)][100.0%][w=384MiB/s][w=98.3k IOPS][eta 00m:00s]
seq-read: (groupid=0, jobs=1): err= 0: pid=178089: Sat Nov 7 16:39:44 2020
read: IOPS=165k, BW=643MiB/s (674MB/s)(3500MiB/5443msec)
slat (nsec): min=1000, max=9680.6k, avg=4407.18, stdev=18980.57
clat (usec): min=318, max=33373, avg=6190.66, stdev=3064.92
lat (usec): min=320, max=33379, avg=6195.14, stdev=3066.45
clat percentiles (usec):
| 1.00th=[ 1778], 5.00th=[ 2573], 10.00th=[ 2999], 20.00th=[ 3621],
| 30.00th=[ 4228], 40.00th=[ 4817], 50.00th=[ 5473], 60.00th=[ 6259],
| 70.00th=[ 7308], 80.00th=[ 8586], 90.00th=[10421], 95.00th=[11994],
| 99.00th=[14222], 99.50th=[15401], 99.90th=[26608], 99.95th=[28967],
| 99.99th=[32637]
bw ( KiB/s): min=614544, max=745196, per=100.00%, avg=662270.43, stdev=49887.45, samples=7
iops : min=153636, max=186299, avg=165567.43, stdev=12471.72, samples=7
lat (usec) : 500=0.01%, 750=0.02%, 1000=0.06%
lat (msec) : 2=1.54%, 4=24.67%, 10=61.52%, 20=11.89%, 50=0.30%
cpu : usr=18.83%, sys=43.37%, ctx=84144, majf=0, minf=1036
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=896000,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1024
rand-read: (groupid=1, jobs=1): err= 0: pid=178096: Sat Nov 7 16:39:44 2020
read: IOPS=91.0k, BW=356MiB/s (373MB/s)(3500MiB/9845msec)
slat (nsec): min=1100, max=16238k, avg=9354.69, stdev=93603.08
clat (usec): min=189, max=47250, avg=11192.69, stdev=2888.77
lat (usec): min=192, max=47252, avg=11202.13, stdev=2890.28
clat percentiles (usec):
| 1.00th=[ 6521], 5.00th=[ 7635], 10.00th=[ 8225], 20.00th=[ 8979],
| 30.00th=[ 9634], 40.00th=[10290], 50.00th=[10814], 60.00th=[11469],
| 70.00th=[12125], 80.00th=[13042], 90.00th=[14353], 95.00th=[15664],
| 99.00th=[20317], 99.50th=[22938], 99.90th=[34866], 99.95th=[42730],
| 99.99th=[44303]
bw ( KiB/s): min=333610, max=383504, per=100.00%, avg=365092.00, stdev=16877.24, samples=13
iops : min=83402, max=95876, avg=91272.69, stdev=4219.28, samples=13
lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.06%, 4=0.16%, 10=34.80%, 20=63.89%, 50=1.06%
cpu : usr=11.62%, sys=27.98%, ctx=128214, majf=0, minf=1037
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=896000,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1024
seq-write: (groupid=2, jobs=1): err= 0: pid=178107: Sat Nov 7 16:39:44 2020
write: IOPS=131k, BW=513MiB/s (538MB/s)(3500MiB/6818msec); 0 zone resets
slat (nsec): min=1100, max=4008.6k, avg=5821.89, stdev=18205.65
clat (usec): min=239, max=21808, avg=7749.25, stdev=3409.43
lat (usec): min=243, max=21809, avg=7755.16, stdev=3411.55
clat percentiles (usec):
| 1.00th=[ 2638], 5.00th=[ 3163], 10.00th=[ 3654], 20.00th=[ 4555],
| 30.00th=[ 5407], 40.00th=[ 6390], 50.00th=[ 7373], 60.00th=[ 8356],
| 70.00th=[ 9372], 80.00th=[10683], 90.00th=[12256], 95.00th=[14091],
| 99.00th=[17171], 99.50th=[17957], 99.90th=[19792], 99.95th=[20317],
| 99.99th=[21103]
bw ( KiB/s): min=479727, max=612688, per=99.28%, avg=521884.56, stdev=40381.06, samples=9
iops : min=119931, max=153172, avg=130470.78, stdev=10095.48, samples=9
lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.16%, 4=13.54%, 10=61.47%, 20=24.72%, 50=0.08%
cpu : usr=17.10%, sys=39.93%, ctx=112618, majf=0, minf=14
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=0,896000,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1024
rand-write: (groupid=3, jobs=1): err= 0: pid=178115: Sat Nov 7 16:39:44 2020
write: IOPS=103k, BW=403MiB/s (422MB/s)(3500MiB/8695msec); 0 zone resets
slat (nsec): min=1200, max=16351k, avg=8084.77, stdev=67434.88
clat (usec): min=1126, max=38086, avg=9877.97, stdev=2952.08
lat (usec): min=1128, max=38088, avg=9886.14, stdev=2953.72
clat percentiles (usec):
| 1.00th=[ 6325], 5.00th=[ 7373], 10.00th=[ 7701], 20.00th=[ 8094],
| 30.00th=[ 8455], 40.00th=[ 8848], 50.00th=[ 9110], 60.00th=[ 9503],
| 70.00th=[10159], 80.00th=[11076], 90.00th=[12518], 95.00th=[14615],
| 99.00th=[23725], 99.50th=[26870], 99.90th=[32637], 99.95th=[35390],
| 99.99th=[36963]
bw ( KiB/s): min=340472, max=447488, per=99.79%, avg=411309.00, stdev=31900.28, samples=17
iops : min=85118, max=111872, avg=102827.24, stdev=7975.05, samples=17
lat (msec) : 2=0.04%, 4=0.28%, 10=67.14%, 20=30.73%, 50=1.82%
cpu : usr=13.52%, sys=30.96%, ctx=74227, majf=0, minf=12
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=0,896000,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1024
Run status group 0 (all jobs):
READ: bw=643MiB/s (674MB/s), 643MiB/s-643MiB/s (674MB/s-674MB/s), io=3500MiB (3670MB), run=5443-5443msec
Run status group 1 (all jobs):
READ: bw=356MiB/s (373MB/s), 356MiB/s-356MiB/s (373MB/s-373MB/s), io=3500MiB (3670MB), run=9845-9845msec
Run status group 2 (all jobs):
WRITE: bw=513MiB/s (538MB/s), 513MiB/s-513MiB/s (538MB/s-538MB/s), io=3500MiB (3670MB), run=6818-6818msec
Run status group 3 (all jobs):
WRITE: bw=403MiB/s (422MB/s), 403MiB/s-403MiB/s (422MB/s-422MB/s), io=3500MiB (3670MB), run=8695-8695msec
Disk stats (read/write):
loop0: ios=1363439/1466332, merge=428561/321891, ticks=1868225/1741830, in_queue=3610054, util=94.74%
Need gcc >= 10 for C++ coroutines. Depends on (recent) liburing-dev.
autoreconf --install
./configure
make
Needs a recent Linux kernel (>=5.9) for io_uring functionality and recent losetup
.
FMNT=/media/test
BMNT=/media/bench
mkdir -p "$FMNT"
mkdir -p "$BMNT"
./fuseuring /tmp/backing_file.img "$FMNT" $((500*1024*1024)) 200 5000 1 &
LODEV=$(losetup --find --show "$FMNT/volume" --direct-io=on)
mkfs.ext4 $LODEV || true
mount $LODEV "$BMNT"
losetup -d $LODEV
Or see bench.sh
.