
Memory consumption with large Transaction Pool #11503

Open · 2 tasks done
shunsukew opened this issue May 23, 2022 · 9 comments
Labels: J2-unconfirmed (Issue might be valid, but it's not yet known.)

@shunsukew (Contributor) commented May 23, 2022

Is there an existing issue?

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

A Substrate node configured with a large transaction pool limit (e.g. --pool-limit 65536, larger than the default) consumes the machine's whole 32 GB of memory once the pooled transaction count reaches around 20k. Memory usage grows rapidly and hits 100% of the 32 GB.

Are there any potential issues around the transaction pool, such as a memory leak?

Case 1. Transaction Pool 20k (2022-05-22 21:50:00 ~ 2022-05-22 23:00:00 UTC +8)
Transaction pool
[Screenshot 2022-05-22 23:07:17]
Mem
[Screenshot 2022-05-22 23:06:58]
CPU
[Screenshot 2022-05-22 23:06:47]
Once memory usage hits 100%, the machine becomes unreachable.

Case 2. Default Transaction Pool Limit (2022-05-22 23:20:00 ~ 2022-05-22 23:40:00 UTC +8)
Transaction Pool
[Screenshot 2022-05-22 23:40:10]
Mem
[Screenshot 2022-05-22 23:40:01]
CPU
[Screenshot 2022-05-22 23:39:51]

(Machine Spec)
CPU optimized machine (Fast CPU)
16 vCPU
32GB Mem
General Purpose SSD - 16KiB IOPS & throughput 250 MiB/s

Steps to reproduce

Set the pool limit --pool-limit to more than 20k and get 19k+ transactions into the pool. (I did this by running an Astar node and syncing blocks with peers as of 2022/05/23.)
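For illustration only (binary name, chain name, and paths are taken from later comments in this thread; adjust to your own setup), a launch command along these lines raises the pool limit well above the default:

    astar-collator --chain astar --base-path /var/lib/astar --pool-limit 65536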

The github-actions bot added the J2-unconfirmed label (Issue might be valid, but it's not yet known.) on May 23, 2022
@bkchr (Member) commented May 23, 2022

Did you also change --pool-kbytes?

@shunsukew (Contributor, Author) commented May 23, 2022

@bkchr Thank you for the comment.
No, I didn't. That means the default value is used?

--pool-kbytes <COUNT>
            Maximum number of kilobytes of all transactions stored in the pool [default: 20480]
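(For reference, a rough bound implied by these defaults, assuming the kilobyte limit applies to the same transactions the count limit does:

    20480 KiB = 20 MiB of raw transaction data at most
    20 MiB / ~20,000 pooled transactions ≈ 1 KiB per transaction on average

so the pooled transaction bytes themselves are capped far below the 32 GB of memory observed above.)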

@bkchr (Member) commented May 30, 2022

@koute could you maybe look into this?

@koute (Contributor) commented May 31, 2022

> @koute could you maybe look into this?

Sure; I'm on it.

@koute self-assigned this on May 31, 2022
@koute (Contributor) commented May 31, 2022

The issue doesn't seem to reproduce on a normal Kusama node (or maybe it just needs to be synced from scratch; I haven't checked yet). However, I think I've managed to reproduce it on the newest astar-collator: I haven't let it run until memory exhaustion, but the memory does look like it's growing. I'm profiling it to see why.
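(The comment doesn't say which profiler is being used; purely as a generic illustration, one way to record heap growth of a running node on Linux is heaptrack:

    heaptrack ./astar-collator --chain astar --pool-limit 65536   # stop with Ctrl+C once memory has grown
    heaptrack --analyze heaptrack.astar-collator.12345.zst        # actual file name and pid will differ

heaptrack then reports which call stacks are responsible for the growth.)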

@koute (Contributor) commented May 31, 2022

@shunsukew For reference, can you provide the exact command line you've used to launch your node?

@koute (Contributor) commented May 31, 2022

So I think I can see the memory usage increasing, but nowhere near as fast as in the screenshots posted by @shunsukew. I'll leave it running overnight (and if it doesn't reproduce I'll try spamming it with fake transactions). It would be nice if there was a way to reproduce the behaviour from the original issue, as that would make it a lot easier to investigate.

In the meantime I've also noticed that the Astar node uses the system allocator and doesn't use jemalloc like Polkadot does; this is not good, and it might contribute to the problem. (I could check if I knew how to reproduce it exactly.) I've put up a PR enabling jemalloc for your node: AstarNetwork/Astar#653
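(A rough heuristic, not something from this thread: a statically linked jemalloc leaves recognizable strings in the binary, so one way to check which allocator a given build carries is:)

    strings ./astar-collator | grep -iq jemalloc \
        && echo "jemalloc strings present" \
        || echo "no jemalloc strings found (likely the system allocator)"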

@bLd75 commented Jun 1, 2022

Hi @koute thank you very much for the PR!

Below are tests made on a collator node with this simple command (before and after the change, made at ~19:15):
/usr/local/bin/astar-collator --collator --rpc-cors all --name collator --base-path /var/lib/astar --state-cache-size 0 --prometheus-external --pool-limit 65536 --port 30333 --chain astar --parachain-id 2006 --telemetry-url 'wss://telemetry.polkadot.io/submit/ 0'
I think the node has to be fully synced to reproduce.
The previously reported data was from a public node (archive mode).

Metrics on the same time frame:

Transaction queue
[screenshot]

RAM (32 GB total) increases fast but doesn't fill up completely from the start:
[screenshot]

CPU consumption doesn't change much but trends higher:
[screenshot]

Peer count becomes unstable:
[screenshot]

Network traffic increases enormously; the node is sending an incredible amount of data:
[screenshot]

I will test your PR as the next step.

@shunsukew (Contributor, Author) commented Jun 1, 2022

@koute @bLd75
Thank you for the PR and the additional information.

@shunsukew changed the title from "Memory consumption by Transaction Pool" to "Memory consumption with large Transaction Pool" on Jun 3, 2022