machine/linux: Switch to virtiofs by default #22920
Conversation
See https://gist.github.com/cgwalters/3d22459bae05a89d86b9f0d501d82feb for my notes debugging a 9p issue that spawned this.
One architectural issue we have now is that the qemu process from …
Force-pushed from 7977f06 to 0fe1655
Practically speaking is there any reason to support both 9p and virtiofs at the same time? Supporting both just increases maintenance burden. I guess the only downside is that we depend on virtiofsd.
We should just move to virtiofsd and drop 9p.
It'd definitely simplify this patch to just hard-switch. Basically I was just trying to be conservative; on the flip side, doing both wasn't really hard either. I am seeing bits about FreeBSD mentioned, which is not supported by virtiofsd (it uses lots of Linux-specific namespacing bits, openat2, etc.). Digging around slightly I see https://wiki.freebsd.org/VirtFS, which looks like a not-yet-upstream implementation, although that may be mostly relevant when acting as a guest, not a host. It looks like... @dfr may be a contact here?
There will be support for 9p in FreeBSD 15.0 - the implementation is derived from the VirtFS code you reference. It is likely that I will also merge that to FreeBSD 14.2 when that is released later this year. Last week at BSDCan, I heard about a FreeBSD implementation of virtiofs which I believe is compatible with the Linux virtiofs, but I don't know the status of that and can't predict when it will make it to a FreeBSD release.
We don't support FreeBSD guests, so the question is: can virtiofsd run on a FreeBSD host? If not, hard-forcing virtiofs with qemu will break the FreeBSD (host) support for podman machine, but tbh I have no idea if there is anyone using this in the first place.
I have not tried to run virtiofsd on a FreeBSD host, and currently 'podman machine' would need some work to support the native hypervisor, and that is far down my priority list. The only FreeBSD-related thing which is close is #19939, but it doesn't seem likely that will land any time soon, and that is related to guest, not host. It is clear that virtiofs is the way forward and that 9p will be phased out at some point. Personally I would prefer that happen later rather than sooner, given that 9p works well on FreeBSD and will be part of an upcoming release, whereas virtiofs seems further off. I don't know how to estimate the negative impact of maintaining both in the meantime, though.
Does that mean podman machine doesn't work at all today on FreeBSD, and the support that exists (sorry, I am just discovering this stuff now) is about running containers directly on the FreeBSD host? Does that actually work? Ahh, I see https://www.freshports.org/sysutils/ocijail/
To be clear this is only about "podman machine" which is currently obscure even on Linux (that said, we're forcing its usage as part of some bootc workflows). And I suspect podman on FreeBSD is already obscure, so we're talking about an obscure corner of an obscure corner... Here's my proposal: I'll structure things as two patches:
Then anyone who cares could try to argue for a revert of the second patch later, or at least use it as a starting point.
SGTM
That works for me. My only use of 'podman machine' is a plain vanilla Linux guest on a macOS host. For anything else, I have FreeBSD hosts and VMs as well as Linux VMs.
Force-pushed from 0fe1655 to 0d64ebf
OK I cleaned this up a bit more, and split it into the promised two separate patches, one to add virtiofs in parallel, and one to switch to it by default and drop 9p. In digging through this, a few other notes:
Force-pushed from 0d64ebf to 0a4dea2
Force-pushed from 0a4dea2 to d9a8bb4
If I am using podman machine on Linux now and you switch to virtiofs, what happens to my machine? The update process should allow you to update a client and NOT have to recreate the machine. I don't see how you can remove 9p without that consideration, and I also don't think migration is worth it. As for switching to virtiofs as a default: we usually do not add a function and then immediately make it the default; we usually have at least one release of bake time. I could be convinced otherwise.
The semantics, I believe, are easy to describe: when podman is updated to a binary that includes this PR, existing machines stay unchanged until you stop/restart them.
There's not much of a "migration" per the above; it's just a stop/restart, because the mounts are done dynamically over SSH.
Well, that's a debate between you, Dan, and Paul, right? I originally did the PR making it opt-in, and it's easy to just drop the top commit here. I could also do a small variant where we make virtiofs the default with an env fallback to 9p. But the argument for a default switch IMO is:
Well, I did say I could be convinced, but it is a departure from the norm, and that is notable. How this applies to RHEL should probably be discussed in further detail, because there are subtleties around podman machine support there.
Force-pushed from d9a8bb4 to 7c2bc59
Updates:
Since almost no one ever changes the default, it is better to default to virtiofs, falling back to 9p.
Force-pushed from 7c2bc59 to f28f9e7
@@ -87,7 +87,7 @@ func GenerateSystemDFilesForVirtiofsMounts(mounts []machine.VirtIoFs) ([]ignitio
 	mountUnit.Add("Mount", "What", "%s")
 	mountUnit.Add("Mount", "Where", "%s")
 	mountUnit.Add("Mount", "Type", "virtiofs")
-	mountUnit.Add("Mount", "Options", "context=\"system_u:object_r:nfs_t:s0\"")
+	mountUnit.Add("Mount", "Options", fmt.Sprintf("context=\"%s\"", machine.NFSSELinuxContext))
Nit: use `%q`
Sure, will do; just waiting to see if anyone else has other comments before doing another update and round of CI.
LGTM
@@ -410,7 +410,12 @@ case "$TEST_FLAVOR" in
         install_test_configs
         ;;
     machine-linux)
-        showrun dnf install -y podman-gvproxy*
+        showrun dnf install -y podman-gvproxy* virtiofsd
+        # Bootstrap this link if it isn't yet in the package; xref
I was thinking about this a bit more, and one thing that will probably show up is static package analyzers will warn about the broken symlink (presumably we won't take a hard dep on virtiofsd).
I don't know how annoying it will be though.
Yeah, I do wonder if we should add a new podman-machine package that just contains the necessary deps as Requires, and could add the symlink there. That way users don't get broken every time we add a new dep; they can just install the one mostly-empty package to pull in the further packages.
Yeah, makes sense.
Also note that I still consider @baude to have the final say on this.
Moving to virtiofsd, while running it inside a user namespace (inside a …). This is important for …
Btw, regarding FreeBSD: we plan to support virtiofsd on it in the near future; @stefano-garzarella is doing all the required modifications on rust-vmm and qemu.
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: baude, cgwalters
Thanks for the lgtm, Brent! Two things I'd re-emphasize here:
With the merge of containers#22920, we now require virtiofsd on Linux for machine mounts. Signed-off-by: Brent Baude <bbaude@redhat.com>
Can we queue this for a 5.1 release? Getting more reports, e.g. osbuild/bootc-image-builder#540 |
This isn't the kind of stuff that should go into a patch release given it requires packaging changes IMO. |
Is that planned to be released to Fedora 40, for example?
Yes |
All minor releases are released to supported Fedora releases.
machine/linux: Use memory-backend-memfd by default
This is prep for using virtiofsd; it has no real
impact otherwise.
Signed-off-by: Colin Walters walters@verbum.org
machine: Also initialize virtiofs mounts
I'm hitting a bug with 9p when trying to transfer large files.
In RHEL at least 9p isn't supported because it's known to have a
lot of design flaws; virtiofsd is the supported and recommended
way to share files between a host and guest.
This patch somewhat hackily sets up parallel virtiofsd mounts
alongside the 9p mounts (just on the host side, but one
can then easily mount in the guest).
This is allowing me to do some side-by-side comparisons
with the exact same machine.
I think what we can do in the short term is make
this a hidden option to enable (both "alongside" and also
fully replacing the 9p code).
Signed-off-by: Colin Walters walters@verbum.org