What happened:
Over time, my HTTPS server which hosts the FleetAutoscaler webhook goes OOM. This is caused by thousands of never-dying sockets on the server. This does NOT happen when I call the endpoint with cURL or a browser; it only happens when Agones calls it.
/app $ lsof -p $PID | grep socket
...
1 /app/zeus-rest socket:[289294771]
1 /app/zeus-rest socket:[289294783]
1 /app/zeus-rest socket:[289292336]
1 /app/zeus-rest socket:[289291653]
1 /app/zeus-rest socket:[289291654]
1 /app/zeus-rest socket:[289293769]
1 /app/zeus-rest socket:[289294780]
...
/app $ lsof -p $PID | grep socket | wc -l
6397
/app $ lsof -p $PID | grep socket | wc -l
6403
/app $ lsof -p $PID | grep socket | wc -l
6418
What you expected to happen:
I expect that when the FleetAutoscaler webhook is called by Agones, it either reuses the TLS connection or disconnects it. Keeping a connection alive and then opening a new one for every call leaks sockets.
How to reproduce it (as minimally and precisely as possible):
Create a TLS FleetAutoscaler webhook endpoint with keep-alive turned on and no idle timeout configured on the server, and watch the sockets multiply.
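Something along these lines should be enough to reproduce it. This is only a sketch: the certificate paths, port, and handler path are placeholders, and the actual FleetAutoscaleReview handling is elided.

package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	// Placeholder handler: a real webhook would decode the FleetAutoscaleReview
	// request and return a populated response; here we just drain and reply.
	mux.HandleFunc("/scale", func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{}`))
	})

	srv := &http.Server{
		Addr:    ":8443",
		Handler: mux,
		// Keep-alives are enabled by default and no idle timeout is set, which
		// is the configuration that leaks sockets when the caller keeps opening
		// new connections without reusing or closing the old ones.
	}
	// tls.crt / tls.key paths are hypothetical.
	log.Fatal(srv.ListenAndServeTLS("/certs/tls.crt", "/certs/tls.key"))
}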
Anything else we need to know?:
I suspect that this could be fixed by adding the following to pkg/fleetautoscalers/fleetautoscalers.go:
var client = http.Client{
    Timeout: 15 * time.Second,
+++ Transport: &http.Transport{
+++     DisableKeepAlives: true,
+++ },
}
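For what it's worth, DisableKeepAlives makes the client send Connection: close and use each connection for only a single request, so every webhook call tears its connection down instead of leaving it idle on the server; the cost is an extra TLS handshake per autoscaler sync.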
I fixed it by disabling KeepAlive on the server side. But it took me several hours to figure out the problem because I could not reproduce it with any clients of my own.
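In case it helps anyone else, the server-side workaround amounts to something like this (again only a sketch, reusing the placeholder names from the reproduction snippet above; SetKeepAlivesEnabled on http.Server is the relevant knob):

srv := &http.Server{Addr: ":8443", Handler: mux}
// With keep-alives disabled the server closes each connection after writing
// the response, so idle sockets cannot accumulate even if the caller never
// reuses or closes them.
srv.SetKeepAlivesEnabled(false)
log.Fatal(srv.ListenAndServeTLS("/certs/tls.crt", "/certs/tls.key"))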
Environment:
- Agones version: 1.16
- Kubernetes version (use kubectl version): 1.21
- Cloud provider or hardware configuration: EKS and Minikube
- Install method (yaml/helm): helm
- Troubleshooting guide log(s):
- Others: