This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

OOM Westmint #2494

Closed
Maharacha opened this issue Apr 28, 2023 · 8 comments · Fixed by paritytech/substrate#14067

Comments

@Maharacha
Contributor

Version: 0.9.400
Chain: Westmint
Parameters: --chain westmint --pruning archive --ws-external --ws-port 9966 --ws-max-connections 1000 --rpc-external --rpc-port 9933 --rpc-cors all --rpc-methods Safe --prometheus-external

This is the memory consumption during a resync of both the relay and para DBs. The machine currently has 64GB of memory available.
image

Log from start to first OOM:
westmint_oom.log

@bkchr
Member

bkchr commented Apr 28, 2023

Did you compile the node yourself? What OS version are you using?

@bkchr
Member

bkchr commented Apr 28, 2023

> Pinned block cache limit reached. Evicting value

This is really weird. It looks like we may be accumulating notifications somewhere, which keeps all of that data in memory until we die.

CC @skunert

@Maharacha
Contributor Author

> Did you compile the node yourself? What OS version are you using?

Using the binary from https://github.com/paritytech/cumulus/releases/download/v0.9.400/polkadot-parachain on:

Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.2 LTS
Release:	22.04
Codename:	jammy

@bkchr
Member

bkchr commented Apr 28, 2023

Ty! Yeah I think the caching issue is the reason you are running out of memory!

@skunert
Contributor

skunert commented Apr 28, 2023

Interesting 🤔

> Ty! Yeah I think the caching issue is the reason you are running out of memory!

At first glance, it does not sound like a caching issue but notification hoarding somewhere as you said in your first answer. The cache has a fixed maximum size, after all.
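To illustrate the distinction being drawn here, below is a toy model of how a size-limited cache can coexist with unbounded memory growth. It is only a sketch under stated assumptions: `Block`, `BoundedCache`, and `simulate` are hypothetical names invented for this example, not Substrate or Cumulus types, and the real notification path may look quite different. The idea is that evicting an entry only drops the cache's own reference, so any notification still sitting unread in a channel keeps its block alive.

```rust
use std::collections::VecDeque;
use std::sync::mpsc;
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for block data; not a Substrate type.
#[allow(dead_code)]
struct Block {
    number: u32,
    payload: Vec<u8>,
}

/// A cache with a fixed maximum size: once full, the oldest entry is evicted.
struct BoundedCache {
    capacity: usize,
    entries: VecDeque<Arc<Block>>,
}

impl BoundedCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: VecDeque::new() }
    }

    fn insert(&mut self, block: Arc<Block>) {
        if self.entries.len() == self.capacity {
            // Eviction only drops the cache's own reference to the block.
            self.entries.pop_front();
        }
        self.entries.push_back(block);
    }
}

/// Imports `count` blocks, caching each one and sending a notification for it.
/// Returns `(cache_len, live_blocks)`, where `live_blocks` counts the blocks
/// that are still held in memory by at least one strong reference.
fn simulate(count: u32, capacity: usize, drain: bool) -> (usize, usize) {
    // Unbounded channel: an assumption standing in for a notification stream.
    let (tx, rx) = mpsc::channel::<Arc<Block>>();
    let mut cache = BoundedCache::new(capacity);
    // Weak handles let us observe liveness without keeping blocks alive.
    let mut tracker: Vec<Weak<Block>> = Vec::new();

    for number in 0..count {
        let block = Arc::new(Block { number, payload: vec![0u8; 1024] });
        tracker.push(Arc::downgrade(&block));
        cache.insert(Arc::clone(&block));
        tx.send(block).expect("receiver is alive");

        if drain {
            // A healthy consumer handles each notification promptly.
            while let Ok(notification) = rx.try_recv() {
                drop(notification);
            }
        }
    }

    let live = tracker.iter().filter(|w| w.strong_count() > 0).count();
    (cache.entries.len(), live)
}
```

With a capacity of 8 and 100 imported blocks, `simulate(100, 8, true)` leaves only the 8 cached blocks alive, while `simulate(100, 8, false)` keeps all 100 alive even though the cache itself never holds more than 8. In this model the cache is not the source of the growth; the undrained notifications are.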

@Maharacha
Contributor Author

Maharacha commented Apr 28, 2023

I enabled -ldebug on both relay and para, if that helps. This is after a restart; it has not hit an OOM so far.
https://drive.google.com/file/d/12mmUhBwxPn-QGdzSvFQN2aOG7jA-ILtN/view?usp=sharing
image

@skunert
Contributor

skunert commented Apr 30, 2023

I have a strong suspicion that this is related to #2495. I will run some more experiments. I also have some ideas on how to fix it.

@Maharacha
Contributor Author

It is in sync and has been working well for a few days now. Constantly on ~11G memory usage. But it was a roller coaster to get there 😄
