[tiller-proxy] Tiller-proxy pod gets killed because of memory usage #1052
Do you have the logs of the pod? I'm guessing we're running out of memory as tiller-proxy tries to load the tarball for linkerd in-memory. Do the chart-repo syncs have any resource limits, as they are also loading tarballs in-memory?
The logs look something like:
And then the pod gets killed, so I guess yes, reading the tarball may cause this. I haven't seen this issue in the chart-repo though. Where does the chart-repo read the tarball?
Hmm, looking through the code, that doesn't quite match up. We seem to download the repo index (https://github.com/kubeapps/kubeapps/blob/cf8c93f77324674369a039fe44e31f9cc3e55f4a/pkg/chart/chart.go#L287), but never get around to downloading the chart (https://github.com/kubeapps/kubeapps/blob/cf8c93f77324674369a039fe44e31f9cc3e55f4a/pkg/chart/chart.go#L298). The index is definitely not that large, but maybe there is some memory leak somewhere that increases tiller-proxy memory use over time?
The reason this hasn't surfaced in chart-repo is that we are not setting any resource limits there: https://github.com/kubeapps/kubeapps/blob/master/cmd/apprepository-controller/controller.go#L425.
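For context, a minimal sketch of what attaching limits to the sync container could look like with client-go's corev1 types; the function name and the request/limit values here are illustrative assumptions, not what the controller actually sets:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// syncJobResources builds requests/limits that could be attached to the
// apprepo-sync container spec so a runaway index parse is bounded instead
// of growing without limit on the node.
func syncJobResources() corev1.ResourceRequirements {
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("25m"),
			corev1.ResourceMemory: resource.MustParse("32Mi"),
		},
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("250m"),
			corev1.ResourceMemory: resource.MustParse("256Mi"),
		},
	}
}

func main() {
	fmt.Printf("%+v\n", syncJobResources())
}
```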
The size of the stable/linkerd tarball is 3.7K, so I would be surprised if this is specific to that chart. Were you able to reproduce this multiple times with linkerd?
Yes, I was able to reproduce that several times with linkerd. [Edit] I was able to reproduce it with other (simpler) charts, so it seems that the problem is how we read the index.yaml file. [Edit 2] I can confirm that the issue happens when unmarshaling the index.yaml file.
So I tracked down the issue to this line: https://github.com/kubeapps/kubeapps/blob/master/pkg/chart/chart.go#L150. Apparently unmarshaling the index.yaml there consumes a lot of memory. We should increase the memory limit in that case.
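For anyone who wants to confirm where the allocations actually happen rather than just bump the limit, a heap profile is the usual tool. A minimal sketch using the standard net/http/pprof handler; the side port and the idea of a debug listener are assumptions, not something tiller-proxy necessarily exposes:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)

func main() {
	// Expose the profiling endpoints on a side port. While reproducing the
	// install, `go tool pprof http://localhost:6060/debug/pprof/heap` shows
	// which call site (e.g. the YAML unmarshal) is holding the memory.
	log.Println(http.ListenAndServe("localhost:6060", nil))
}
```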
The issue only appears for me when installing a chart a second time, so I cached the result of parsing the index.yaml file. We should not cache the content of every chart (because that would be worse from a memory usage point of view), so we would need to increase the memory limit anyway. After increasing the limit to 256Mi I am not able to reproduce the issue anymore.
The yaml library (ghodss/yaml) seems to do some duplicate work: it unmarshals the YAML bytes into a generic object, marshals that object to JSON, and then unmarshals the JSON into the target object again.

```go
func Unmarshal(y []byte, o interface{}) error {
	vo := reflect.ValueOf(o)
	j, err := yamlToJSON(y, &vo)
	if err != nil {
		return fmt.Errorf("error converting YAML to JSON: %v", err)
	}

	err = json.Unmarshal(j, o)
	if err != nil {
		return fmt.Errorf("error unmarshaling JSON: %v", err)
	}

	return nil
}

func yamlToJSON(y []byte, jsonTarget *reflect.Value) ([]byte, error) {
	// Convert the YAML to an object.
	var yamlObj interface{}
	err := yaml.Unmarshal(y, &yamlObj)
	if err != nil {
		return nil, err
	}

	jsonObj, err := convertToJSONableObject(yamlObj, jsonTarget)
	if err != nil {
		return nil, err
	}

	// Convert this object to JSON and return the data.
	return json.Marshal(jsonObj)
}
```
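To put rough numbers on that double conversion, one could compare the two libraries' allocations on the same payload. A sketch under a few assumptions: the repository index has been downloaded beforehand to a placeholder file `index.yaml`, and it is unmarshaled into a generic map rather than helm's IndexFile type to keep the example self-contained:

```go
package main

import (
	"fmt"
	"os"
	"testing"

	ghodss "github.com/ghodss/yaml"
	yamlv2 "gopkg.in/yaml.v2"
)

func main() {
	// index.yaml is a placeholder path to a repository index fetched beforehand.
	data, err := os.ReadFile("index.yaml")
	if err != nil {
		panic(err)
	}

	// report runs fn as a benchmark and prints per-op allocation figures.
	report := func(name string, fn func()) {
		r := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				fn()
			}
		})
		fmt.Printf("%-18s %d B/op, %d allocs/op\n", name, r.AllocedBytesPerOp(), r.AllocsPerOp())
	}

	// ghodss/yaml: YAML -> generic object -> JSON -> target object.
	report("ghodss/yaml:", func() {
		var out map[string]interface{}
		_ = ghodss.Unmarshal(data, &out)
	})

	// gopkg.in/yaml.v2: YAML -> target object directly.
	report("gopkg.in/yaml.v2:", func() {
		var out map[string]interface{}
		_ = yamlv2.Unmarshal(data, &out)
	})
}
```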
Hi @lingsamuel, it seems that you are running quite an old version of Kubeapps that is no longer supported (1.11.3-scratch-r0). My recommendation is to upgrade to a newer version to get the latest fixes.
Hi @andresmgot, I would say the related logic has not changed. I upgraded Kubeapps to the latest version (2.3.1) but the problem still exists.
Hi @lingsamuel, I am not able to reproduce that with the latest version. The repository YAML is read in the apprepo-sync job though. I have run a test as well with the bitnami repository (https://charts.bitnami.com/bitnami) but still no problem:
The Bitnami index is 7.3M, but it only contains 91 entries and 9057 versions. It's not a kubeops problem, it's the yaml library. Here is a test repo: lingsamuel/helm-index-unmarshal-test; its index is generated with 350 entries and 17500 versions (8.8M). After the unmarshal, the total alloc is 271 and GC ran 6 times. That means the memory peak could occasionally be very high.
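For anyone who wants to reproduce without the external repo, a small generator for a synthetic index of comparable shape could look like the sketch below. This is an assumption-laden approximation, not the generator used for lingsamuel/helm-index-unmarshal-test: the entry and version counts are parameters, and the field set just mirrors the index examples quoted later in this thread.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// writeSyntheticIndex emits an index.yaml-like file with `entries` charts and
// `versions` versions per chart, roughly mimicking a large repository index.
func writeSyntheticIndex(path string, entries, versions int) error {
	var b strings.Builder
	b.WriteString("apiVersion: v1\nentries:\n")
	for e := 0; e < entries; e++ {
		name := fmt.Sprintf("chart-%d", e)
		fmt.Fprintf(&b, "  %s:\n", name)
		for v := 0; v < versions; v++ {
			fmt.Fprintf(&b, "  - apiVersion: v1\n")
			fmt.Fprintf(&b, "    appVersion: 1.0.%d\n", v)
			fmt.Fprintf(&b, "    description: synthetic test chart\n")
			fmt.Fprintf(&b, "    name: %s\n", name)
			fmt.Fprintf(&b, "    version: 1.0.%d\n", v)
			fmt.Fprintf(&b, "    created: \"2021-04-15T14:45:24Z\"\n")
			fmt.Fprintf(&b, "    digest: %064d\n", v) // fake digest, zero-padded
			fmt.Fprintf(&b, "    urls:\n    - charts/%s-1.0.%d.tgz\n", name, v)
		}
	}
	return os.WriteFile(path, []byte(b.String()), 0o644)
}

func main() {
	// 350 entries x 50 versions = 17500 versions, in the ballpark of the test repo.
	if err := writeSyntheticIndex("index.yaml", 350, 50); err != nil {
		panic(err)
	}
}
```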
Thanks for the investigation @lingsamuel, it's indeed useful. Can you verify if the alternative (gopkg.in/yaml.v2) solves the issue?
I tried this package. It uses about half the memory in my case. But unfortunately, for reasons I don't know, the unmarshalled result is not correct.
Can you send a PR with your progress changing the library? We can assist you from there to check if we can address that issue.
With composite (embedded) structs, the original library's (gopkg.in/yaml.v2) behavior differs from the ghodss version. That means it only supports YAML like this:

```yaml
- metadata: # Note this
    apiVersion: v1
    appVersion: 1.0.0
    description: test
    name: test
    version: 1.0.0
  created: "2021-04-15T14:45:24.707638057+08:00"
  digest: aa
  urls:
  - charts/test-1.0.0.tgz
```

instead of this:

```yaml
- apiVersion: v1
  appVersion: 1.0.0
  description: test
  name: test
  version: 1.0.0
  created: "2021-04-15T14:45:24.707638057+08:00"
  digest: aa
  urls:
  - charts/test-1.0.0.tgz
```

But the "inline" tag mentioned in the issue above doesn't exist in the helm lib (by the way, an embedded pointer struct needs a workaround: go-yaml/yaml#356).
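To make the difference concrete, here is a self-contained sketch with simplified stand-ins for the helm types (not the real chart.Metadata / repo.ChartVersion): ghodss/yaml follows encoding/json rules, so an untagged embedded struct is flattened automatically, whereas gopkg.in/yaml.v2 only flattens it when the `yaml:",inline"` tag is present. A value (non-pointer) embedding is used below because yaml.v2's ",inline" does not accept an embedded pointer, which is the go-yaml/yaml#356 workaround mentioned above.

```go
package main

import (
	"fmt"

	ghodss "github.com/ghodss/yaml"
	yamlv2 "gopkg.in/yaml.v2"
)

// Metadata is a simplified stand-in for helm's chart metadata.
type Metadata struct {
	Name    string `json:"name" yaml:"name"`
	Version string `json:"version" yaml:"version"`
}

// EntryGhodss: ghodss/yaml uses encoding/json semantics, so the untagged
// embedded struct's fields are promoted (flattened) automatically.
type EntryGhodss struct {
	Metadata
	Digest string `json:"digest"`
}

// EntryV2: gopkg.in/yaml.v2 needs the ",inline" tag to flatten the embedded
// struct; without it, the fields are expected under a nested "metadata:" key.
type EntryV2 struct {
	Metadata `yaml:",inline"`
	Digest   string `yaml:"digest"`
}

const flat = `
name: test
version: 1.0.0
digest: aa
`

func main() {
	var g EntryGhodss
	if err := ghodss.Unmarshal([]byte(flat), &g); err != nil {
		panic(err)
	}
	fmt.Printf("ghodss/yaml:      %+v\n", g) // name/version populated via promoted fields

	var v EntryV2
	if err := yamlv2.Unmarshal([]byte(flat), &v); err != nil {
		panic(err)
	}
	fmt.Printf("gopkg.in/yaml.v2: %+v\n", v) // populated only because of the ",inline" tag
}
```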
Tiller-proxy reaches its memory limit when installing some apps (for example stable/linkerd). The pod gets killed:
We should evaluate whether to increase the maximum memory that the pod can use or whether we are misusing memory in the service.