podman run: add ability to store files written inside a container in host's tmpfs. #17599

Closed
socketpair opened this issue Feb 21, 2023 · 18 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

@socketpair

socketpair commented Feb 21, 2023

Feature request description

The software I run in a container scatters files everywhere. Everything works, but slowly, because these changes are stored on the same filesystem (disk) as the image layers. I need podman to create the final overlayfs with workdir and upperdir on tmpfs. I don't need any files to be kept after the container exits.

@socketpair socketpair added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 21, 2023
@socketpair socketpair changed the title podman run: add ability to store file written inside container in host's tmpfs. podman run: add ability to store files written inside a container in host's tmpfs. Feb 21, 2023
@rhatdan
Member

rhatdan commented Feb 21, 2023

@giuseppe @vrothberg WDYT?

Using tmpfs for this could be risky, since its space is limited. An option to specify where to store the upper (container) directory might be useful.

@socketpair
Author

socketpair commented Feb 22, 2023

Yes, I know. I have plenty of memory on the host. The software I run writes many files to arbitrary places inside the container (it actually installs packages) and then runs a process that reads these files. I cannot preinstall these packages before podman run. I need everything to run as fast as possible, and tmpfs is the only option. Right now, I have to run with --privileged in order to mount overlayfs+tmpfs inside the container over root directories such as /usr, /bin, /etc, and so on.

@giuseppe
Member

Could you check whether --transient-store is what you are looking for?

$ podman --transient-store run ....

@socketpair
Author

@giuseppe
inside container:

# mount
overlay on / type overlay (rw,relatime,
lowerdir=
/var/lib/containers/storage/overlay/l/TVA2BXPFKGVD54KOM7JNLTNIK6:
/var/lib/containers/storage/overlay/l/ABGGV3KRCWNXXQFARGPMDGM22Y:
/var/lib/containers/storage/overlay/l/R72KYFYKHAQMOLTBI5WEEJGOYE,
upperdir=/var/lib/containers/storage/overlay/d5967391751856893e2618aa54f7223ec557b91a3029796bc99f8f38e543a189/diff,
 workdir=/var/lib/containers/storage/overlay/d5967391751856893e2618aa54f7223ec557b91a3029796bc99f8f38e543a189/work,
metacopy=on,volatile)

I added newlines for readability.
As you can see, upperdir and workdir are on the HDD. Note that podman is run as the root user.
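
For anyone repeating this check, here is a minimal sketch that pulls only the upperdir= and workdir= entries out of an overlay option string, so it is easier to see where the writable layer lives. The function name overlay_rw_dirs is made up for this illustration, and the sketch assumes the paths themselves contain no commas.

```shell
#!/bin/sh
# Print the upperdir= and workdir= values found in an overlay option string.
# The option string is read from standard input. Overlay options are
# comma-separated, so each one is put on its own line before filtering.
# Assumption: the directory paths do not contain commas.
overlay_rw_dirs() {
    tr ',' '\n' | grep -E '^(upperdir|workdir)='
}

# Inside a container, the options for / could be taken from mountinfo:
#   grep ' - overlay ' /proc/self/mountinfo | overlay_rw_dirs
```

If both printed paths start with the tmpfs-backed directory, writes inside the container should not reach the disk-backed store.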

@socketpair
Author

podman --graphroot might help, but it seems this command-line option has disappeared.

$ podman --version
podman version 4.4.1

@giuseppe
Member

Thanks for verifying that. Another option could be to pull the images into your store on /var/lib and then use a different store to create these containers. You need to configure the store in /var/lib as an additional store for that to work.

# podman pull fedora
# podman --root /tmp/root --storage-opt AdditionalImageStore=/var/lib/containers/storage run --rm fedora grep "overlay overlay" /proc/self/mountinfo
2636 2598 0:98 / / rw,relatime - overlay overlay rw,context="system_u:object_r:container_file_t:s0:c433,c999",lowerdir=/var/lib/containers/storage/overlay/l/R4BMMC7ZTAKM7US6IT53BTBBCY,upperdir=/tmp/root/overlay/97a79d68c590dbeab36cf9988e661fc86d82fea36a3f6dd7beaa4ba423532105/diff,workdir=/tmp/root/overlay/97a79d68c590dbeab36cf9988e661fc86d82fea36a3f6dd7beaa4ba423532105/work,volatile
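
The command above can be wrapped in a small helper so the two storage locations are not retyped each time. This is only a sketch: the function name tmpfs_run_cmd is hypothetical, and it prints the command line for inspection instead of running podman. The flags are the ones used in the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the podman command line that reads images from
# an existing store while creating the container's writable layer elsewhere.
#   $1 = graph root for the containers (for example a directory on tmpfs)
#   $2 = existing image store, used as an additional (read-only) store
#   remaining arguments = image and command for "podman run"
# The output is meant for reading; arguments containing spaces lose their
# quoting when printed this way.
tmpfs_run_cmd() {
    root=$1
    image_store=$2
    shift 2
    printf '%s\n' "podman --root $root --storage-opt AdditionalImageStore=$image_store run --rm $*"
}

tmpfs_run_cmd /tmp/root /var/lib/containers/storage fedora true
```

Printing rather than executing keeps the sketch safe to try on a machine without podman; the printed line can be pasted into a root shell once it looks right.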

@rhatdan
Member

rhatdan commented Feb 22, 2023

I like that idea. Perhaps a small blog post on it.

@socketpair
Author

Unbelievable! It works!

So, I suggest adding a standard option that does this, because it's really non-trivial. For example, I still don't understand why it stops working if I remove AdditionalImageStore=.

It would be very useful for podman to mount a tmpfs for this case (in a separate mount namespace?) and to accept tmpfs options such as size=.

@giuseppe
Member

If you drop the AdditionalImageStore= option, it won't try to use the existing store to read images.

I wouldn't make it too easy for users to end up in this setup. We can document it or, better, as @rhatdan suggested, have a blog post, but users must be sure of what they are doing. Using tmpfs for the upper layer is an unusual configuration: you can easily run out of memory, and the entire container storage is lost on a power cycle.

One thing I'd like to mention is that if you specify --rm for your container, the overlay is mounted with the volatile option, as seems to be the case for you, and that already speeds up I/O in some cases, since any sync/fsync syscall is inhibited. If you play around with the writeback timeouts, you can then force Linux to keep dirty pages in memory longer, getting performance similar to using tmpfs as the backing store.
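
The comment does not say which writeback settings are meant. As an assumption, the sketch below looks at the standard Linux vm.dirty_* sysctls, which control how long and how many dirty pages may stay in memory. The function name show_writeback_settings is made up for this illustration, and it only reads the values.

```shell
#!/bin/sh
# Print the current values of the kernel writeback settings.
#   $1 = directory holding the settings (defaults to /proc/sys/vm)
# Settings that are missing or unreadable are silently skipped.
show_writeback_settings() {
    dir=${1:-/proc/sys/vm}
    for name in dirty_expire_centisecs dirty_writeback_centisecs \
                dirty_background_ratio dirty_ratio; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

show_writeback_settings

# Changing a value requires root and affects the whole host, for example:
#   sysctl -w vm.dirty_expire_centisecs=<value>
```

Because these settings are host-wide, raising them changes writeback behaviour for every process on the machine, not only for the container.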

@socketpair
Author

socketpair commented Feb 22, 2023

Hmm. What mechanism is used to inhibit the fsyncs?

@giuseppe
Member

The volatile mount option of the overlay file system.

@socketpair
Author

@giuseppe

  1. Do you know how to achieve the same (or something similar) with containerd?
  2. Why not add a new command-line option or config parameter to podman that specifies where to create the runtime layer? By default it would be in the main storage.

@giuseppe
Member

  1. No idea.
  2. We are investigating that as part of containers/storage, although it is not tracked upstream. One risk compared to the solution I've proposed is that the layers would still be referenced from the main storage even though they could have been dropped from tmpfs, causing all sorts of corruption in the storage.

Since there is a working solution for the problem, I am closing the issue.

@socketpair
Author

socketpair commented Feb 24, 2023

@giuseppe
Adding --transient-store yields:

Error: database libpod root directory (staticdir) "/var/lib/containers/storage/libpod" does not match our libpod root directory (staticdir) "/tmp/layer/libpod": database configuration mismatch

Is it a bug? Do I need a transient store in this case?

#16371

The end result of this is that all metadata that is stored related to containers and volumes (but not images) end up on tmpfs.

Does --rm implicitly mean --transient-store? If it does not, I think it should.

Also, I guess --transient-store mounts a tmpfs for storing data. I would like an option to use this directory as storage for the top layer, as I asked earlier. It would make things much easier for people who want the fastest containers possible.

--transient-store, --transient-store=even-layer? Or something like that.

Now I run podman this way:

storage_dir=$(sudo podman image inspect "$image:$BRANCH" | jq -r '.[0].GraphDriver.Data.UpperDir' | sed -E 's|/overlay/[a-f0-9]+/diff$||')

sudo unshare --mount -- sh -c 'mount none -t tmpfs /tmp && mkdir /tmp/layer && exec "$@"' -- podman run --root /tmp/layer --storage-opt AdditionalImageStore=$storage_dir "$image:$BRANCH" ...

It's not straightforward, but it works.

@rhatdan
Member

rhatdan commented Feb 24, 2023

I think you need to reset your system to change to a transient store.
podman system reset

@socketpair
Author

@rhatdan I can, but these scripts are used by many developers in my company. Should they all reset? Why is a reset needed?

@rhatdan
Member

rhatdan commented Mar 1, 2023

You want many users to use tmpfs for the store. Is this for all containers, or only for a few, with traditional storage used for the others?

@socketpair
Author

socketpair commented Mar 2, 2023

Reasonable question. Unfortunately, only for SOME containers. If I needed it for all of them, I would just change podman's main configuration to store layers in tmpfs.

The main idea is to keep everything I get using podman image pull on permanent storage (to survive reboots), and everything generated by short-lived stateless containers in tmpfs, because IOPS are expensive compared to memory in my case.
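
One way the "only some containers" split might be expressed is a separate storage.conf that is selected per invocation through the CONTAINERS_STORAGE_CONF environment variable. The sketch below has not been tested against this setup. The function name write_tmpfs_storage_conf is hypothetical, the paths are the ones used earlier in this thread, and the runroot value assumes podman is run as root.

```shell
#!/bin/sh
# Write a storage.conf that keeps containers under $1 and reads images from
# the existing store $2. The result is written to the file named by $3.
# Assumption: podman runs as root, so the default root runroot is kept.
write_tmpfs_storage_conf() {
    cat > "$3" <<EOF
[storage]
driver = "overlay"
graphroot = "$1"
runroot = "/run/containers/storage"

[storage.options]
additionalimagestores = ["$2"]
EOF
}

conf=$(mktemp)
write_tmpfs_storage_conf /tmp/layer /var/lib/containers/storage "$conf"
cat "$conf"

# Only commands started with this variable would use the alternative store:
#   CONTAINERS_STORAGE_CONF=$conf podman run --rm "$image" ...
```

Commands started without the variable keep using the default configuration, so pulled images stay on permanent storage.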

@rhatdan

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Aug 31, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 31, 2023