
ConnectionManager peer HighWater configuration not honored (# peers spike up => OOM on VPS) #4718

Closed
AndreaCensi opened this issue Feb 19, 2018 · 8 comments
Labels
kind/bug (A bug in existing code, including security flaws) · topic/connection-manager (Issues related to Swarm.ConnMgr, the connection manager)

Comments

@AndreaCensi

Version information:

go-ipfs version: 0.4.13-
Repo version: 6
System version: amd64/linux
Golang version: go1.9.2

Type:

Bug

Description:

Context: on my system, ipfs's memory usage grows until the daemon is OOM-killed. (I run ipfs on a VPS (Linode) with 1 vCPU and 1GB of RAM.)

I have tried all the suggestions I found by searching for similar issues, and stumbled upon the suggestion of limiting the number of peers.

However, it seems that the configuration switches for the connection manager are not honored.

I use this configuration:

"Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "20s",
      "HighWater": 20,
      "LowWater": 10,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "DisableRelay": false,
    "EnableRelayHop": false
  }

I would expect the number of peers to be bounded by 20 (or 20 plus a small margin), but the peer count averages ~60 after a couple of minutes and spikes up to ~200 some time later. I suspect it gets even higher, but at that point the instance becomes unresponsive and I can no longer access it. When I log in later, I find that ipfs was OOM-killed.
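
For reference, the same limits can be set from the CLI and the live peer count watched while the daemon runs. This is only a quick sketch; it assumes the config keys above and that the daemon is restarted after the changes so they take effect:

    ipfs config --json Swarm.ConnMgr.LowWater 10
    ipfs config --json Swarm.ConnMgr.HighWater 20
    ipfs config Swarm.ConnMgr.GracePeriod 20s
    ipfs config Swarm.ConnMgr.Type basic
    # restart the daemon, then watch the connected peer count every 10 seconds
    watch -n 10 'ipfs swarm peers | wc -l'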

@Stebalien added the kind/bug and topic/connection-manager labels on Feb 19, 2018
@Stebalien
Member

Your expectations are pretty much correct. We close open connections at most once every 10 seconds, and only close connections that have been open for longer than the grace period (20s), but you shouldn't be spiking up to ~200 peers in a couple of minutes (unless they're all connecting within 20 seconds...).

Regardless, this is definitely a bug.

@AndreaCensi
Author

A couple more observations:

I did manage to see the "endgame": after 8 hours it was at 1.2GB memory usage (RAM and swap), but there were only 97 peers. So it seems the memory increases, but not in proportion to the number of peers (a leak?).

I also tried running the server with --routing=dhtclient and the issue remains.
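
A rough way to check how (or whether) memory tracks the peer count over time, assuming GNU ps/date on the Linux VPS and a daemon process named ipfs:

    # log a UTC timestamp, the connected peer count, and the daemon's RSS once a minute
    while true; do
      printf '%s peers=%s rss_kb=%s\n' "$(date -u +%FT%TZ)" \
        "$(ipfs swarm peers | wc -l)" \
        "$(ps -o rss= -C ipfs | head -n 1)" >> ipfs-mem.log
      sleep 60
    done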

@Stebalien
Member

So, it seems that the memory increases, but it is not proportional to the number of peers

The current release has an issue where it:

  1. Never forgets information about any peer.
  2. Remembers and gossips tons of ephemeral addresses (addresses with ephemeral ports).

This should be fixed in the next release (it has been fixed in master) but we're trying to iron out a few bugs first.
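
For anyone digging into where the per-peer memory goes, the daemon exposes Go pprof endpoints on the API address. A sketch, assuming the default API at 127.0.0.1:5001 and that the matching ipfs binary is on your PATH (adjust the path if it isn't):

    # capture a heap profile and a goroutine dump from the running daemon
    curl -s -o ipfs.heap 'http://127.0.0.1:5001/debug/pprof/heap'
    curl -s -o ipfs.goroutines 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'
    # summarize the largest in-use allocations
    go tool pprof -top "$(which ipfs)" ipfs.heap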

@AndreaCensi
Author

@Stebalien

I did a source install from master (0.4.14-dev).

The memory usage decreased, though it is unclear at this point whether it still grows indefinitely.

However, I still see too many peers connect (oscillating between 100 and 115 at the moment).
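
For anyone else who wants to test against master before the release, this is roughly how I built from source, assuming a working Go toolchain and that the repo's make targets are unchanged:

    git clone https://github.com/ipfs/go-ipfs
    cd go-ipfs
    make install    # builds ipfs and installs it into $GOPATH/bin
    ipfs version    # should now report a 0.4.14-dev build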

@AndreaCensi
Author

After a couple of days running 0.4.14-dev, I can report the following:

  • the number of peers stabilizes around 35 (good!)
  • the memory usage still grows over time; it is currently at 1GB. It grows slowly, so I expect it to last another day or so until it gets OOM-killed (see the check below).
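
When it does get killed, the kernel log should confirm it was the OOM killer. A quick check after logging back in (assuming dmesg or journald access):

    # look for OOM-killer activity in the kernel log
    dmesg | grep -iE 'out of memory|killed process'
    # or, on systemd machines
    journalctl -k | grep -iE 'out of memory|killed process'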

@MichaelMure
Contributor

This issue might be solved now. I regularly saw memory usage growing unbounded over a few days until OOM, but not anymore.
[screenshot: memory usage over time]

@Stebalien
Member

Late to the party...

@MichaelMure

The issue here is that we don't have any "MaxConns" hard limit. You were probably noticing a different memory leak.

@Stebalien
Member

Actually, reading through this issue, it appears that it is really about other per-peer memory leaks.

(sorry for the noise)
