Reduce memory usage #3318

whyrusleeping · 2016-10-18T18:32:53Z

As part of our resource consumption reduction milestone, let's make an effort to get the idle memory usage of an ipfs node down below 100MB.

Things that could help here are:

peerstore written to disk
providers garbage collection smarter
fewer goroutines per peer connection
bitswap wantlists to disk

READ BEFORE COMMENTING

Please make sure to upgrade to the latest version of go-ipfs before chiming in. Memory usage still needs to be reduced but this gets better every release.

chevdor · 2016-11-04T16:57:40Z

Yes please! On a small instance, it looks like this:

Guess who is the big one taking all the room? :)

matthewrobertbell · 2016-11-24T15:22:49Z

A nice test for this would be running IPFS on a 512MB VPS. I tried this on Ubuntu 14.04 with the daemon running in the background: when I ran "ipfs get" on an 83MB file, the daemon was OOM killed. Testing that the get completes successfully would be nice.

I also got this, which I think is already noted as a bug:

15:19:13.466 ERROR flatfs: too many open files, retrying in 600ms flatfs.go:121

chevdor · 2016-11-24T15:24:16Z

@mattseh the chart I included above is a VPS with 512 MB running pretty much only ipfs and nginx with no swap. IPFS takes way too much, enabling swap is then sadly required.

matthewrobertbell · 2016-11-24T15:27:14Z

Agreed, I only tried running it on the tiny VPS to see what the simplest thing was that would kill the daemon.

Kubuxu · 2017-03-09T23:50:28Z

Node js implementation is separate, please report that on js-ipfs repo.

This repo is about go-ipfs implementation.

skorokithakis · 2017-07-16T15:44:15Z

I'm seeing 700 MB RAM usage on my VPS instance as well, it would be great if this could be lowered.

timthelion · 2017-07-30T06:15:14Z

I get OOM killed even on a system with 4 gigs of free memory.

root@hobbs:/var/log# dmesg | egrep -i 'killed process'
[764907.341661] Out of memory in UB 2046: OOM killed process 31956 (ipfs) score 0 vm:6157228kB, rss:3875436kB, swap:0kB
[2499620.020001] Out of memory in UB 2046: OOM killed process 25440 (ipfs) score 0 vm:4503708kB, rss:3905584kB, swap:0kB
root@hobbs:/var/log# ipfs version
ipfs version 0.4.10
root@hobbs:/var/log#

Is there any limitation to how much ipfs uses? How does the ipfs.io gateway stay alive? Do you just restart it every time it dies?

Kubuxu · 2017-07-30T09:59:35Z

Our gateways are mostly stable. As a note we are working on connection closing which should solve most of this issue.

pors · 2017-08-11T07:02:22Z

I have exactly the same issue on a VPS of the same size. I have swapping on and that is what is happening. My VPS provider is complaining :)

@Kubuxu is there an ETA? I'm happy to help test an early version.

dokterbob · 2017-09-13T07:58:41Z

The only way I have been able to more-or-less stably run ipfs in production was inside a memory-constrained container (systemd cgroup), restarting it every time it crashed for not having 'enough' memory. This was about half a year ago.

Perhaps this should be considered higher priority than some newer features, as it does, fundamentally, affect the stability and performance of IPFS in a very bad way.

timthelion · 2017-09-13T08:01:25Z

I agree, this and the overall performance of downloading (which is not great) are the only things I care about from the IPFS project. Sure, there are new features that would be nice, but I can implement those easily myself by using IPLD. It's no show stopper to be missing a new feature when IPLD is as powerful as it is. But this IS a bit of a show stopper.


kpcyrd · 2017-09-13T14:29:44Z

I'd love to see some improvements as well, I'm currently running high memory instances for ipfs. :)

My memory usage is around 1G and 2G.

pors · 2017-09-14T18:38:29Z

Hey @dokterbob, we meet again :)

skorokithakis · 2017-09-14T18:40:33Z

Don't I get any love, @pors?

pors · 2017-09-14T18:53:01Z

@skorokithakis huh? Scary shit :)

whyrusleeping · 2017-09-20T01:06:03Z

Hey everyone, we identified and resolved a pretty gnarly memory leak in the dht code. The fix was merged into master here: #4251

If you're having issues with memory usage, please try out the latest master (and the soon-to-be-tagged 0.4.11-rc2) and let us know how things go.

skorokithakis · 2017-09-20T01:07:52Z

Can we get a build uploaded somewhere, for us plebs? Also, how confident are you that this doesn't contain any show-stopping bugs? We're really really wanting to put a less leaky version to production, but we obviously don't love crashes either :/

whyrusleeping · 2017-09-20T01:10:46Z

Yeah, builds will be uploaded once I cut the next release candidate. We are quite confident there are no show-stopping bugs (otherwise we wouldn't have merged it), but to err on the safe side it's best to wait for the final release of 0.4.11.

whyrusleeping · 2017-09-20T06:31:31Z

Once DNS finishes propagating, the 0.4.11-rc2 builds will be here: https://dist.ipfs.io/go-ipfs/v0.4.11-rc2

The non-DNS URL is: https://ipfs.io/ipfs/QmXYxv8gK4SE3n1imq1YAyMGVoUDiCPgaSynMqNQXbAEzm/go-ipfs/v0.4.11-rc2

dokterbob · 2017-09-20T08:20:39Z

@pors Nice to run into you again! Still would like to have a proper look at hackpad. How may I contact you? IRC or something?

@whyrusleeping Thanks for another rc. Let's see how this runs. ^^

pors · 2017-09-20T16:24:47Z

@dokterbob you can email me at mark at pors dot net. And we can change to Dutch :)

kpcyrd · 2017-09-20T16:47:36Z

Can we please keep this on topic?

skorokithakis · 2017-09-20T16:49:56Z

I've been running rc2 all day, and memory usage seems much better than before. It's at 16% now whereas it was at 35% before the upgrade, but, given the nature of leaks, we won't know until after a week or so.

whyrusleeping · 2017-09-20T18:49:53Z

@skorokithakis Thanks! Please let us know if you notice any perf regressions. Fixing this properly meant putting a bit more logic in a synchronous hot path, and we aren't yet sure whether it will be an issue in real-world scenarios.

Calmarius · 2017-10-09T14:08:53Z

More than 4 GB RAM usage here with 0.4.11, according to top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                     
15384 calmari+  20   0 4797260 2,487g    188 S   3,3 86,2  25:54.47 ipfs                                                                                        
   33 root      20   0       0      0      0 S   0,3  0,0   0:10.12 kswapd0                                                                                     
 1114 mysql     20   0 1246948  12356      0 S   0,3  0,4   0:09.18 mysqld                                                                                      
    1 root      20   0  119780   1304    580 S   0,0  0,0   0:01.96 systemd                                                                                     
    2 root      20   0       0      0      0 S   0,0  0,0   0:00.00 kthreadd                                                                                    
    3 root      20   0       0      0      0 S   0,0  0,0   0:00.64 ksoftirqd/0                                                                                 
    5 root       0 -20       0      0      0 S   0,0  0,0   0:00.00 kworker/0:0H                                                                                
    7 root      20   0       0      0      0 S   0,0  0,0   0:05.87 rcu_sched                                                                                   
    8 root      20   0       0      0      0 S   0,0  0,0   0:00.00 rcu_bh                                                                                      
    9 root      rt   0       0      0      0 S   0,0  0,0   0:00.01 migration/0                                                                                 
   10 root      rt   0       0      0      0 S   0,0  0,0   0:00.07 watchdog/0                                                                                  
   11 root      rt   0       0      0      0 S   0,0  0,0   0:00.07 watchdog/1                                                                                  
   12 root      rt   0       0      0      0 S   0,0  0,0   0:00.01 migration/1                                                                                 
   13 root      20   0       0      0      0 S   0,0  0,0   0:01.08 ksoftirqd/1

SSH session became unresponsive, needed to kill the daemon to get my control back.

It should be noted that my node is pinning a few popular JS frameworks, like jQuery and Mathjax. It might be the cause, but I'm not sure.

I cannot run the node all the time this way.

burdakovd · 2018-09-25T19:52:53Z

Here is another datapoint. On a machine with 1.7 GB of RAM and 3G of swap, running only IPFS daemon in server mode and nginx, after 4 days we see 1.6 GB of RAM used and 550 MB of swap space used.

             total       used       free     shared    buffers     cached
Mem:       1730344    1654536      75808         36      15120      54728
-/+ buffers/cache:    1584688     145656
Swap:      3014648     616008    2398640

Version is docker image jbenet/go-ipfs:latest, 0.4.17.

klueq · 2018-09-26T05:32:43Z

Can we have a flag to limit the number of peers and some smart logic to discard bad peers and get good ones? That seems to be the unavoidable reason for high memory footprint.

dokterbob · 2018-09-26T08:37:32Z

@klueq Check out the connection manager. https://github.com/ipfs/go-ipfs/blob/419bfdc20fc68d70ba0ea5dc9d0bed8db16c1c11/docs/config.md#connmgr

klueq · 2018-09-26T17:41:46Z

Cool. Looks like it's already there.

Another suggestion: why not use UDP? From my limited understanding, all those 800 TCP connections with peers are idle 99% of the time, but they have real memory buffers and other overhead on both sides. Instead, we could send a UDP ping from time to time to check whether the peer is online, and if we need to transfer some data reliably, we send another UDP message, the peer acks it, and we create a temporary TCP channel.

Kubuxu · 2018-09-26T17:53:19Z

There is an experimental QUIC transport that uses UDP.
Should roll out wider soon-ish.
https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#quic

whyrusleeping · 2018-09-26T18:01:29Z

The TCP buffers aren't the dominating consumer of memory here. Plus, re-establishing a new connection is very much non-trivial.

Stebalien · 2018-09-26T20:53:46Z

The memory issue comes from our internal buffers/state. We're constantly working on improving this but it'll take time. (semi related: libp2p/go-libp2p#438)

rob-deutsch · 2018-10-04T04:47:36Z

This is something I've been interested in recently (see #5530), and a reality that we have to contend with is that Go is memory hoggish and is getting more so (golang/go#23687).

[note: I edited this post a lot as I got closer to the heart of the issue, apologies!]

Go is very reluctant to give any memory back to the OS unless the OS very much needs it.

I think that something that could be considered for go-ipfs is: What does it mean when we say 'idle memory less than 100MB'?

This is an important question because it determines how much work should be done on how go-ipfs uses memory, not just how much memory go-ipfs uses.

Some examples are:

1. Steady-state memory usage is less than 100MB.
   E.g. loaded binary = 20MB, stacks = 20MB, heap = 60MB.
   But the Go runtime will wait until the heap doubles before running the GC. So maybe...
2. Allocated memory is less than 100MB.
   E.g. loaded binary = 20MB, stacks = 20MB, heap = 30MB, garbage = 30MB.
   But the Go runtime will hold on to some memory after cleaning garbage, to make future allocs quicker...
3. Allocated plus alloc'd cache memory is less than 100MB.
   E.g. loaded binary = 20MB, stacks = 20MB, heap = 24MB, garbage = 24MB, alloc'd cache = 12MB.
   But the Go runtime doesn't truly give any memory back to the OS. It just marks it MADV_FREE, which the OS doesn't act upon straight away...
4. Resident memory is less than 100MB.
   E.g. loaded binary = 20MB, stacks = 20MB, heap = 15MB, garbage = 15MB, alloc'd cache = 10MB, MADV_FREE cache = 20MB.

Or, in table form:

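| Limit on | Binary (MB) | Stacks (MB) | Heap (MB) | Garbage (MB) | Alloc'd cache (MB) | MADV_FREE cache (MB) | Total (MB) |
|---|---|---|---|---|---|---|---|
| Steady-state | 20 | 20 | 60 | `60` | `30` | `60` | 250 |
| Allocated | 20 | 20 | 30 | 30 | `15` | `30` | 145 |
| Allocated+ | 20 | 20 | 24 | 24 | 12 | `24` | 124 |
| Resident | 20 | 20 | 15 | 15 | 10 | 20 | 100 |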

Calmarius · 2018-11-25T16:39:08Z

I'm running IPFS 0.4.17 on my Raspberry Pi (using the low power profile, if I remember correctly). When it starts it has low memory usage, but that usage slowly climbs as the hours pass. Within 1-2 days the OOM killer usually kills it. The number of connections is low: the output of "ipfs swarm peers" doesn't fill the terminal window.

So I think something is leaking there. Or in the Go runtime.

whyrusleeping · 2018-11-27T18:48:32Z

@Calmarius Try 0.4.18. There has been significant work towards reducing memory usage. Likely not fully resolved, but should be noticeably better.

dokterbob · 2018-11-27T20:11:46Z

Maybe put a reference to 0.4.18 in the description so we won't have to tell them about it all the time. (By the way, kudos on the good work; resource usage is indeed much more stable, although not nearly there yet!)

Calmarius · 2018-12-15T19:30:15Z

Yes! Updated to 0.4.18 and it's running without problems for 2 weeks on my rpi. Great progress indeed!

voxsoftware · 2019-05-27T20:24:03Z

0.4.20, using more than 3GB Ram, problem still

lordcirth · 2019-05-27T20:29:54Z

> 0.4.20, using more than 3GB Ram, problem still

Under what load, and after running for how long? Does it cap at 3GB for you, or continue to grow?

voxsoftware · 2019-05-27T20:40:38Z

I have been running it for about 2 days, but I'm really not using it much yet. It continues growing and is now at 3.7GB RAM. This is the go-ipfs package on Ubuntu 18.04. I am on a machine with 9.5 GB RAM, and processes are getting killed because of go-ipfs.

whyrusleeping · 2019-05-27T20:55:50Z

I really wonder what's causing this. mars (our first bootstrapper node, and arguably the most connected ipfs node) peaks at just over 4GB, but every time I check, that much memory is not actually in use; it's just the Go runtime refusing to return the memory to the OS.

@Stebalien can we try running that memory profile dumper thing on some machines? https://gist.github.com/whyrusleeping/b0431561b23a5c1d8b2dfce5526751aa

voxsoftware · 2019-05-27T21:13:30Z

The daemon was just killed. This makes go-ipfs unusable. Does js-ipfs have the same problem?

Stebalien · 2019-05-27T21:20:16Z

@voxsoftware try disabling the DHT by running the daemon with ipfs daemon --routing=dhtclient. Also, I'd consider upgrading to the latest RC (go-ipfs 0.4.21-rc3) or just wait for the release (likely tonight or tomorrow).

voxsoftware · 2019-05-27T22:45:20Z

I started with --routing=dhtclient. After about 1 hour, memory is now at 850MB. I was evaluating ipfs for use in a desktop app, but with this it is still really unusable for that purpose.

skorokithakis · 2019-05-27T23:10:38Z

Same here. I'm thinking of having systemd restart the daemon once a day, which is, unfortunately, the last thing I'm going to try before giving up on IPFS altogether...

lordcirth · 2019-05-27T23:55:08Z

> Same here. I'm thinking of having systemd restart the daemon once a day, which is, unfortunately, the last thing I'm going to try before giving up on IPFS altogether...

I set a memory cap in systemd. The memory pressure keeps its usage down, and if it still goes over, it gets automatically killed and restarted. Works well enough.

voxsoftware · 2019-05-27T23:56:28Z

> > Same here. I'm thinking of having systemd restart the daemon once a day, which is, unfortunately, the last thing I'm going to try before giving up on IPFS altogether...
>
> I set a memory cap in systemd. The memory pressure keeps its usage down, and if it still goes over, it gets automatically killed and restarted. Works well enough.

Can you give an example please? And this works OK for a Linux backend, but how about using it in a desktop app, for example on Windows?

lordcirth · 2019-05-28T00:02:06Z

> > > Same here. I'm thinking of having systemd restart the daemon once a day, which is, unfortunately, the last thing I'm going to try before giving up on IPFS altogether...
> >
> > I set a memory cap in systemd. The memory pressure keeps its usage down, and if it still goes over, it gets automatically killed and restarted. Works well enough.
>
> Can you give an example please? And this works OK for a Linux backend, but how about using it in a desktop app, for example on Windows?

Like so:
https://gist.github.com/lordcirth/378ae7c3a8d2786874d00867098cbad1

As for Windows, dunno. Haven't used it much in a long time.

skorokithakis · 2019-05-28T00:32:41Z

@lordcirth this is extremely helpful, thank you.

Stebalien · 2021-04-22T22:34:42Z

Closing as stale (most of the issues raised here have been addressed, or are recorded in more specific issues).

whyrusleeping added this to the Reduce Resource Consumption milestone Oct 18, 2016

Kubuxu added the status/deferred Conscious decision to pause or backlog label Nov 28, 2016

hsanjuan mentioned this issue Dec 21, 2016

ipfs daemon memory usage grows overtime: killed by OOM after a 10~12 days running #3532

Closed

xloem mentioned this issue Jan 9, 2017

v0.4.0 - Unable to pin large directories ipfs-inactive/support#19

Closed

dokterbob mentioned this issue Sep 13, 2017

Resource Constraints + Limits #1482

Closed

11 tasks

burdakovd mentioned this issue Sep 25, 2018

Restart IPFS daemon once every 24 hours as it is leaking memory burdakovd/dapps.earth#32

Merged

rob-deutsch mentioned this issue Oct 4, 2018

Measuring memory usage #5530

Open

danimesq mentioned this issue Jan 29, 2019

Rebranding: Poodle blurHY/Horizon#20

Closed

Stebalien closed this as completed Apr 22, 2021
