
Switch live rootfs from squashfs to EROFS #1852

Open
jlebon opened this issue Dec 17, 2024 · 21 comments

@jlebon

jlebon commented Dec 17, 2024

Currently, we ship a rootfs.img CPIO (both as a separate artifact and as part of the live ISO) which contains the rootfs as a squashfs image.

Let's switch it over to use EROFS instead. Since EROFS is already in use by composefs, this reduces the number of read-only filesystem image formats we have to care about.

This work would happen in osbuild since that's where our live ISO is now being built.

@AdamWill

AdamWill commented Dec 19, 2024

@jlebon asked me to drop a link to @Conan-Kudo's work on Fedora Kiwi-built lives here, so see https://pagure.io/fedora-kiwi-descriptions/pull-request/105 . We also already have an erofs image in the main compose, one of the FEX images for Asahi: https://pagure.io/fedora-kiwi-descriptions/blob/rawhide/f/teams/asahi.xml#_6 .

@Conan-Kudo

Note that erofs is nowhere near as capable or performant as squashfs for compression right now, and erofs image builds take much longer than squashfs builds to reach comparable storage sizes (almost 3x the time).

@hsiangkao

hsiangkao commented Dec 20, 2024

Note that erofs is nowhere near as capable or performant as squashfs for compression right now, and erofs image builds take much longer than squashfs builds to reach comparable storage sizes (almost 3x the time).

By the way, could you test -zlzma,6 -C1048576 -Eall-fragments (note: without -Ededupe) to compare with squashfs xz -b 1m (since that seems to be the current configuration in kiwi)?
This combination is already multi-threaded, and except for BCJ, it's already comparable to squashfs.

I wonder about the image sizes and build speed of this combination.
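For concreteness, the two invocations being compared would look roughly like this (a sketch only; rootfs.erofs, rootfs.squashfs, and image-root/ are placeholder paths, not the actual kiwi configuration):

```shell
# EROFS: LZMA level 6, 1 MiB physical clusters, pack all file tails into
# the fragment inode (deliberately without -Ededupe); this combination is
# multi-threaded in current erofs-utils.
mkfs.erofs -zlzma,6 -C1048576 -Eall-fragments rootfs.erofs image-root/

# Squashfs baseline for comparison: xz compression with a 1 MiB block size.
mksquashfs image-root/ rootfs.squashfs -comp xz -b 1M
```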

@Conan-Kudo

I've just added a commit to test it in my pull request. Let's see how it goes.

@hsiangkao

I've just added a commit to test it in my pull request. Let's see how it goes.

I know that -zlzma,6 -C131072 -Eall-fragments,dedupe has been tried, but I'd guess -C1048576 -Eall-fragments may produce results similar to -C131072 -Eall-fragments,dedupe, and it's already multi-threaded now.

@Conan-Kudo

Unfortunately it looks like this one is taking way too long.

@hsiangkao

Unfortunately it looks like this one is taking way too long.

It seems it finished?

@Conan-Kudo

No, it timed out: https://artifacts.dev.testing-farm.io/277af8a0-bf57-4499-98c1-e90531d0b43d/

@hsiangkao

hsiangkao commented Dec 20, 2024

No, it timed out: https://artifacts.dev.testing-farm.io/277af8a0-bf57-4499-98c1-e90531d0b43d/

OK, I didn't know how to parse the raw result. By the way, what's the current squashfs build time? Do you have a log I could check too?
Also, I'm not sure if I should bother you with more combinations, but I still wonder whether
-zlzma,level=6,dictsize=524288 -C524288 -Eall-fragments could finish in time, and what the image size would be. By default, the dictionary size is 8 * the -C value, which can slow things down (squashfs uses its block size as the dictionary size, so the dict size matches the block size set).
Also, is there a way for me to test locally?

@Conan-Kudo

Conan-Kudo commented Dec 20, 2024

From a recent job with the current settings: https://artifacts.dev.testing-farm.io/18600fee-4f88-4c4e-940b-97c98960f752/

It took 15 minutes based on this log.

To test it locally, you can do so on any Fedora 41+ system:

$ sudo dnf install kiwi-cli kiwi-systemdeps git
$ git clone --branch reapply-erofs-live https://pagure.io/fedora-kiwi-descriptions.git
$ cd fedora-kiwi-descriptions
$ sudo ./kiwi-build --output-dir=$PWD/tmpoutput --image-type=iso --image-profile=KDE-Desktop-Live --image-release=0 --debug

If you want to test with squashfs, just switch to the rawhide branch.

@hsiangkao

but I still wonder if -zlzma,level=6,dictsize=524288 -C524288 -Eall-fragments could finish in time and the image sizes.

I'm afraid -C1048576 will still time out, so trying -C524288 might be better; I'd also like to know the impact of the dictionary size.

From a recent job with the current settings: https://artifacts.dev.testing-farm.io/18600fee-4f88-4c4e-940b-97c98960f752/

It took 15 minutes based on this log.

Ok.

To test it locally, you can do so on any Fedora 41+ system:

$ sudo dnf install kiwi-cli kiwi-systemdeps git
$ git clone --branch reapply-erofs-live https://pagure.io/fedora-kiwi-descriptions.git
$ cd fedora-kiwi-descriptions
$ sudo ./kiwi-build --output-dir=$PWD/tmpoutput --image-type=iso --image-profile=KDE-Desktop-Live --image-release=0 --debug

If you want to test with squashfs, just switch to the rawhide branch.

Let me try, thanks.

@hsiangkao

To test it locally, you can do so on any Fedora 41+ system:

$ sudo dnf install kiwi-cli kiwi-systemdeps git
$ git clone --branch reapply-erofs-live https://pagure.io/fedora-kiwi-descriptions.git
$ cd fedora-kiwi-descriptions
$ sudo ./kiwi-build --output-dir=$PWD/tmpoutput --image-type=iso --image-profile=KDE-Desktop-Live --image-release=0 --debug

If you want to test with squashfs, just switch to the rawhide branch.

Let me try, thanks.

btw, can it work in a container (e.g. Docker) or a VM?

@Conan-Kudo

Conan-Kudo commented Dec 20, 2024

VM yes, Docker-style container environment no.

@hsiangkao

hsiangkao commented Dec 20, 2024

From a recent job with the current settings: https://artifacts.dev.testing-farm.io/18600fee-4f88-4c4e-940b-97c98960f752/

It took 15 minutes based on this log.

Another question: how many CPUs did this job use? I couldn't find any hint in the log.

[ DEBUG   ]: 13:47:43 | Looking for mkfs.erofs in /root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
[ DEBUG   ]: 13:47:43 | EXEC: [mkfs.erofs -Eall-fragments -C524288 -z lzma,level=6,dictsize=524288 /var/tmp/kiwi_bthrnl2n /root/fedora-kiwi-descriptions/tmpoutput-build/build/image-root/]
[ DEBUG   ]: 14:34:42 | Creating directory /root/fedora-kiwi-descriptions/tmpoutput-build/live-media.gcmhugj0/LiveOS

I've tried this configuration on a virtual cloud server with an Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (32 cores), and the result is

2.9G(3097276416) Dec 20 14:35 Fedora.x86_64-Rawhide.iso

I'm trying -C1048576 -z lzma,level=6,dictsize=1048576 now, but it shouldn't be significantly smaller.

It seems the main bottleneck is that, although the main pass of -Eall-fragments is multi-threaded, the preprocessing step that moves fragments into the special inode is still single-threaded for now, and it takes nearly 20 minutes in my test environment.

I don't think the build performance will improve with the latest mkfs soon; I have to work out fully multi-threaded fragments and dedupe first and try again.

@Conan-Kudo

Well, it took over an hour and a half on my Framework 16, which has an AMD Ryzen 9 7940HS (16 cores).

@hsiangkao

hsiangkao commented Dec 21, 2024

Well, it took over an hour and a half on my Framework 16, which has an AMD Ryzen 9 7940HS (16 cores).

I've fixed a bug that could cause slow image building when there is a lot of incompressible data, and the time dropped to 20 minutes in my test environment (Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz, 32 cores):

-Eall-fragments -C524288 -z lzma,level=6,dictsize=524288 :

[ INFO    ]: 06:45:32 | Packing system into dracut live ISO type: dmsquash
[ DEBUG   ]: 06:45:32 | Looking for mkfs.erofs in /root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
[ DEBUG   ]: 06:45:32 | EXEC: [mkfs.erofs -Eall-fragments -C524288 -z lzma,level=6,dictsize=524288 /var/tmp/kiwi_daobb3zm /root/fedora-kiwi-descriptions/tmpoutput-bui
ld/build/image-root/]
[ DEBUG   ]: 07:04:40 | Creating directory /root/fedora-kiwi-descriptions/tmpoutput-build/live-media.jcb_9tq9/LiveOS
[ INFO    ]: 07:04:44 | Creating live ISO image

The result of image size is

2.9G(3096842240) Dec 21 07:05 Fedora.x86_64-Rawhide.iso

The mkfs version can be checked out as:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b experimental

(patch: https://lore.kernel.org/r/20241220143859.643175-2-hsiangkao@linux.alibaba.com)

It needs ./configure --enable-multithreading.

If you're interested, you could try it out too. There are still some tricks to reduce the time further, even without working on multi-threaded fragments and dedupe. I will address this over the weekend.
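Assuming the usual autotools flow for erofs-utils, the full build steps would be roughly (a sketch, not verified against that exact branch):

```shell
# Build the experimental erofs-utils with multithreading enabled.
git clone git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b experimental
cd erofs-utils
./autogen.sh                          # generate the configure script
./configure --enable-multithreading   # as noted above
make                                  # produces mkfs/mkfs.erofs
```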

@hsiangkao

@Conan-Kudo how do I check that the produced data is correct?
tmpoutput-build/build/image-root should contain the same data as LiveOS/squashfs.img?

@Conan-Kudo

Yes.

@hsiangkao

Yes.

But I found one file (usr/lib/sysimage/rpm/rpmdb.sqlite-shm) that differs from tmpoutput-build/build/image-root; the other files are the same:

b54a9455555aaa64b81c40eaaee9d805  mnt/usr/lib/sysimage/rpm/rpmdb.sqlite-shm < --- erofs one
b7c14ec6110fa820ca6b65f5aec85911  tmpoutput-build/build/image-root/usr/lib/sysimage/rpm/rpmdb.sqlite-shm

and the timestamp of rpmdb.sqlite-shm matches that of the ISO (Dec 21 07:05 Fedora.x86_64-Rawhide.iso), which seems impossible unless the file was updated after the image was created.

-rw-r--r--. 1 root root 120836096 Dec 21 06:44 rpmdb.sqlite
-rw-r--r--. 1 root root     32768 Dec 21 07:05 rpmdb.sqlite-shm
-rw-r--r--. 1 root root         0 Dec 21 06:44 rpmdb.sqlite-wal

Is this file expected to change?

@Conan-Kudo

The -shm and -wal files are cache files that change after each access of the rpmdb. Since in-tree rpm commands are used after the erofs image is created, those files wind up changing.
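Given that, a reasonable comparison simply excludes those cache files. A minimal sketch, using throwaway directories to stand in for the mounted erofs image and image-root (all paths here are placeholders):

```shell
tmp=$(mktemp -d)
# Stand-ins for the mounted erofs image and the source tree.
mkdir -p "$tmp/mnt/usr/lib/sysimage/rpm" "$tmp/image-root/usr/lib/sysimage/rpm"
echo "same content" > "$tmp/mnt/etc-os-release"
echo "same content" > "$tmp/image-root/etc-os-release"
# The rpmdb cache files legitimately differ after the image is built.
echo "cache A" > "$tmp/mnt/usr/lib/sysimage/rpm/rpmdb.sqlite-shm"
echo "cache B" > "$tmp/image-root/usr/lib/sysimage/rpm/rpmdb.sqlite-shm"
# Compare the trees while ignoring the volatile rpmdb cache files.
if diff -r --exclude='rpmdb.sqlite-shm' --exclude='rpmdb.sqlite-wal' \
    "$tmp/mnt" "$tmp/image-root"; then
  result="trees match"
else
  result="trees differ"
fi
echo "$result"
rm -rf "$tmp"
```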

@Conan-Kudo

To state simply, it's nothing to worry about. 😄
