tmp storage does not get cleaned up when large helm repository index file fails to process #1451

Closed
xeruf opened this issue Apr 18, 2024 · 9 comments · Fixed by #1457
Labels: area/helm, area/storage, bug

Comments

@xeruf

xeruf commented Apr 18, 2024

I am fetching the truecharts helmrepository once an hour: https://open.greenhost.net/xeruf/stackspout/-/blob/main/infrastructure/sources/truecharts.yaml?ref_type=heads

but it is fetched more often, and for some reason all old copies are kept in tmp:
image

After a few hours, this pod occupies a few GB and it already went up to 60GB! This is crashing my whole cluster, and I am clueless what the heck to do to fix this.

@stefanprodan
Member

We clean up tmp at the end of each reconciliation. If the files are still there, then something is blocking the controller from deleting them.

@stefanprodan
Member

Also, when reporting an issue you need to state which version of Flux you are running; this may be an old buggy version that we no longer support. Please post the output of flux check.

@xeruf
Author

xeruf commented Apr 18, 2024

❯ flux check
► checking prerequisites
✗ flux 2.1.2 <2.2.3 (new version is available, please upgrade)
✔ Kubernetes 1.28.2+k3s1 >=1.25.0-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.36.2
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.1.1
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.1.2
► checking crds
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ all checks passed

@stefanprodan
Member

If this were an issue in Flux 2.1, then tons of users would have reported it back in 2023. I think the tmp disk denies cleanup from the host. Try mounting an NFS disk for tmp and see if the issue persists there.

@stefanprodan
Member

Another test you could do is to set tmp to RAM and check whether tmp gets cleared. Here is an example of how to mount a RAM disk: https://fluxcd.io/flux/installation/configuration/vertical-scaling/#enable-in-memory-kustomize-builds

@souleb
Member

souleb commented Apr 19, 2024

Do you see any errors in the source-controller logs about cleaning up the temporary index files?

@xeruf
Author

xeruf commented Apr 20, 2024

ah, the raising of limits in https://open.greenhost.net/xeruf/stackspout/-/blob/6e645c6abfe378f3ccbcce7f167da9e5133e46c8/overrides/source-controller-patch.yaml did not work:
image

so upon failure of processing, it leaves the temporary file in place

@stefanprodan
Member

stefanprodan commented Apr 20, 2024

That patch looks wrong to me: a Kustomize config file has no name/namespace, and you cannot apply such a thing with Flux. See here how you can configure Flux at bootstrap time: https://fluxcd.io/flux/installation/configuration/boostrap-customization/

@xeruf
Author

xeruf commented Apr 20, 2024

Thanks for the hint!
Either way, if it fails to process a repo index file, it should not keep the temporary copies indefinitely.
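
For illustration, the behavior asked for here (temporary files removed even when processing fails) can be sketched in Go with a deferred cleanup. This is a minimal sketch under stated assumptions, not source-controller's actual code and not the fix that eventually closed this issue: `processIndex`, `simulate`, `errIndexTooLarge`, the `index-*.yaml` file pattern, and the size limit are all hypothetical names and values chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errIndexTooLarge is a hypothetical error standing in for any failure
// that occurs after the index has already been written to disk.
var errIndexTooLarge = errors.New("index file exceeds size limit")

// processIndex writes data to a temporary file in dir and then validates it.
// The deferred function runs on every return path, so the temporary file is
// removed whether validation succeeds or fails.
func processIndex(dir string, data []byte, maxSize int64) (err error) {
	f, err := os.CreateTemp(dir, "index-*.yaml")
	if err != nil {
		return fmt.Errorf("create temp file: %w", err)
	}
	defer func() {
		f.Close() // close before removing; the close error is ignored in this sketch
		if rmErr := os.Remove(f.Name()); rmErr != nil {
			// Surface the cleanup failure instead of silently dropping it.
			err = errors.Join(err, fmt.Errorf("remove temp file: %w", rmErr))
		}
	}()

	if _, err = f.Write(data); err != nil {
		return fmt.Errorf("write temp file: %w", err)
	}
	if int64(len(data)) > maxSize {
		// Early return on failure: without the defer above, the file would leak.
		return errIndexTooLarge
	}
	return nil
}

// simulate runs processIndex in a scratch directory and reports how many
// files were left behind, together with the processing error.
func simulate(size int, maxSize int64) (leftover int, procErr error) {
	dir, err := os.MkdirTemp("", "tmp-demo-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	procErr = processIndex(dir, make([]byte, size), maxSize)
	entries, err := os.ReadDir(dir)
	if err != nil {
		panic(err)
	}
	return len(entries), procErr
}

func main() {
	// A 2 KiB "index" against a 1 KiB limit forces the failure path.
	leftover, err := simulate(2048, 1024)
	fmt.Printf("error: %v, leftover files: %d\n", err, leftover)
}
```

The point of the design is that cleanup is registered immediately after the file is created, before any step that can fail. Returning the removal error also means a blocked delete would show up in the logs, which is what was asked about earlier in the thread.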

@stefanprodan changed the title from "Unreasonable storage use for large helmrepository" to "tmp storage does not get cleaned up when large helm repository index file fails to process" on Apr 20, 2024
@stefanprodan added the bug, area/helm, and area/storage labels on Apr 20, 2024