[BUG] - dumping ledger state takes humungous amounts of memory #3691

Open
angerman opened this issue Mar 8, 2022 · 22 comments
Assignees
Labels
performance An Issue Related to the Performance of the Node

Comments

@angerman
Contributor

angerman commented Mar 8, 2022

Internal
Internal if an IOHK staff member.

Area
Other Any other topic (Delegation, Ranking, ...).

Summary
Dumping the ledger state on macOS takes a humungous amount of memory.

Steps to reproduce
On a Mac, start a cardano-node instance, then use cardano-cli to dump the ledger state on mainnet.
Observe cardano-node taking ~12G of memory, and cardano-cli another ~30G.

Expected behavior
Hopefully stay within available system memory.

@angerman angerman added the bug Something isn't working label Mar 8, 2022
@angerman angerman changed the title [BUG] - [BUG] - dumping ledger state takes humungous amounts of memory Mar 8, 2022
@Jimbo4350 Jimbo4350 added performance An Issue Related to the Performance of the Node and removed bug Something isn't working labels Mar 8, 2022
@ashisherc

ashisherc commented Apr 5, 2022

The community has been asking for this for a long time. It would be best to be able to query specific parts of the ledger state dump, e.g. only the go snapshot. I have heard of some technical limitations with that, but we need the specific queries sooner: the memory consumption of this dump is already skyrocketing.

@Jimbo4350
Contributor

Closing this. If this is still relevant please reopen.

@ashisherc

It is still the same issue, did I miss any PR that updates this?

@AndrewWestberg

@Jimbo4350 Please re-open this one. It's still a major issue.

@Jimbo4350
Contributor

Jimbo4350 commented Oct 29, 2022

@newhoggy do any of your open PRs address this issue? If they do, please link the PR here.

@Jimbo4350 Jimbo4350 reopened this Oct 29, 2022
@AndrewWestberg

Just a nice FYI... the ledger state dump in 1.35.4 on mainnet just went over the size that can be held in a normal signed 32-bit integer (2^31-1 = 2,147,483,647 bytes). I now have to rewrite a bunch of code, since I can no longer parse the ledger state CBOR in memory. Many programming languages use INT values to index byte arrays.

If you're parsing the ledger state cbor this way, please check your code. I'm not sure how much time they're going to give us before 1.35.4 is pushed out as a hard-fork requirement.

@JaredCorduan @disassembler
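
For anyone checking their own parser: one way to avoid a single oversized byte array is to read CBOR item heads straight from a file stream. The following is a minimal sketch using only the Python standard library, based on the generic CBOR head encoding (3-bit major type, 5-bit additional information). The function name `read_cbor_head` is hypothetical, and nothing here assumes anything about the actual layout of the ledger state.

```python
import io


def read_cbor_head(stream):
    """Read one CBOR item head from a binary stream.

    Returns (major_type, argument). The argument is None when the
    additional information is 31 (indefinite length, or "break" for
    major type 7).
    """
    first = stream.read(1)
    if len(first) != 1:
        raise EOFError("no data left in stream")
    major_type = first[0] >> 5      # top 3 bits
    info = first[0] & 0x1F          # low 5 bits
    if info < 24:
        return major_type, info     # argument stored in the head byte itself
    if info == 31:
        return major_type, None
    if info > 27:
        raise ValueError("reserved additional-information value")
    width = 1 << (info - 24)        # 24..27 -> 1, 2, 4 or 8 following bytes
    raw = stream.read(width)
    if len(raw) != width:
        raise EOFError("truncated CBOR head")
    return major_type, int.from_bytes(raw, "big")


# A byte string whose declared length is 2^31 needs only a 5-byte head,
# so the length can be inspected without loading the payload.
head = io.BytesIO(bytes([0x5A, 0x80, 0x00, 0x00, 0x00]))
major, length = read_cbor_head(head)
```

With a real dump, the stream would be a file opened in binary mode (for example `open("ledger-state.cbor", "rb")`, a hypothetical file name), so only the bytes of each head are read at a time.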

@AndrewWestberg

image
Not sure how to fix this without rewriting the guts of the Google CBOR parser. The internal byte array is over 2 GB, so it overflows the integer array index.

@kevinhammond
Contributor

kevinhammond commented Nov 15, 2022

I suspect this isn't specific to 1.35.4 (the state might just be a little smaller in 1.35.3). If you use unsigned integers for indexing, you can presumably get to 4 GB.

@AndrewWestberg

@kevinhammond The state is slightly smaller in 1.35.3, as it hasn't broken there (yet). Given that we're actively upgrading to 1.35.4 on mainnet, I'll have to implement a workaround soon. Right now, the solution is to implement arrays of arrays in the Google CBOR library I'm using. It's painful, but it's the only option I have for now. Unsigned indexes aren't available in JVM languages.
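
The "arrays of arrays" idea can be sketched as follows. This is an illustration only, not the code of the Google CBOR library: the class name `ChunkedBuffer` and its methods are hypothetical, and Python itself does not have the 2^31-1 limit. The point is the index arithmetic, where one wide index is split into an outer chunk number and an inner offset that each fit in a signed 32-bit integer.

```python
class ChunkedBuffer:
    """Byte store split into fixed-size chunks, addressed by one wide index."""

    def __init__(self, chunk_size=1 << 30):
        self.chunk_size = chunk_size   # each chunk stays well below 2^31-1 bytes
        self.chunks = []               # list of bytearray, the "array of arrays"
        self.size = 0                  # total number of bytes stored

    def append(self, data):
        """Copy data into the buffer, opening new chunks as they fill up."""
        view = memoryview(data)
        while len(view) > 0:
            if not self.chunks or len(self.chunks[-1]) == self.chunk_size:
                self.chunks.append(bytearray())
            room = self.chunk_size - len(self.chunks[-1])
            piece = view[:room]
            self.chunks[-1].extend(piece)
            self.size += len(piece)
            view = view[len(piece):]

    def __getitem__(self, index):
        """Return the byte at a global index as an int."""
        if not 0 <= index < self.size:
            raise IndexError(index)
        outer, inner = divmod(index, self.chunk_size)
        return self.chunks[outer][inner]


# Tiny chunk size so the chunk boundaries are easy to see.
buf = ChunkedBuffer(chunk_size=4)
buf.append(b"abcdefghij")
```

In a JVM port, the global index would be a `long`, while `outer` and `inner` would each be an `int`, so every individual array stays within the platform limit.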

We really do need piecemeal queries for all this stuff. I believe db-sync is still using this monolithic ledger state dump as well.

@papacarp

1.35.3 was already over the 2G limit when I tested yesterday.

-rw-------  1 ubuntu ubuntu 1952594762 Nov 11 21:58 ls375.cbor
-rw-------  1 ubuntu ubuntu 2167014930 Nov 15 18:15 lst375.cbor

The Python cbor2 library still parses it just fine.
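
As a quick check of the two file sizes in the listing above against the signed 32-bit limit, here is a small sketch. The helper name `fits_int32_index` is hypothetical; the sizes are copied from the listing.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value

# Sizes in bytes, taken from the directory listing above.
sizes = {
    "ls375.cbor": 1952594762,
    "lst375.cbor": 2167014930,
}


def fits_int32_index(size_bytes):
    """True if a buffer of this size can be held in one int-indexed array."""
    return size_bytes <= INT32_MAX


# Map each oversized file to the number of bytes by which it exceeds the limit.
over = {
    name: size - INT32_MAX
    for name, size in sizes.items()
    if not fits_int32_index(size)
}
```

Only the second file is over the limit, by roughly 19.5 million bytes, which is why a parser that indexes a single byte array with a signed 32-bit integer handles the first dump but not the second.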

@newhoggy
Contributor

newhoggy commented Feb 16, 2023

Is querying the entire ledger-state still a thing that is needed?

I understand that the ledger-state query was originally meant as a way to quickly get some functionality working. If it's possible to provide that functionality by querying for a subset of the ledger-state, that's preferred.

In which case, please track this issue: #4140

@newhoggy
Contributor

I ran this on mainnet and observed the CLI taking up to just over 6G when using the --out-file parameter. Using --out-file dumps the binary and skips the decode.

@ashisherc

We also require querying stakeGo and stakeMark from the ledger state. Note that we need the full stakeMark and stakeGo snapshots, not only the stake amount per pool ID.

@rdlrt

rdlrt commented Feb 17, 2023

> Is querying the entire ledger-state still a thing that is needed?

I think the major use case for this was indeed the stake snapshot.
However, as I understand it, ledger-state still contains off-chain information (rewards, treasury) that might not be available elsewhere from the node itself. This is also a blocker (or at least a limitation) for downstream solutions: cardano-db-sync is the only project that tries to work with ledger-state, and disabling the ledger-state query loses this information.
Similarly, other solutions like scrolls/ouros/ogmios/carp/cncli face similar restrictions.

So until complete, equivalent ways to fetch this data from the node are available, the downstream solutions that require those features will unfortunately have to depend on ledger-state (even if it's supposed to be used only for debugging) 🙂

@newhoggy
Contributor

@rdlrt Can you create new tickets, one for each of the queries that are needed to not rely on ledger-state anymore?

@newhoggy
Contributor

@ashisherc does this meet your needs? #4279

@ashisherc

@newhoggy thanks for the review, but that's not what I meant. As I mentioned in my previous comment, we rely on the full stakeGo/stakeMark snapshots. That means not just the pool info: we also need delegationMap and stakeMap, which are part of the stakeGo/stakeMark snapshots.

@CarlosLopezDeLara
Contributor

@ashisherc @rdlrt @AndrewWestberg Can I get your input as users here, please --> #4982

@newhoggy
Contributor

Yes please. Dumping the ledger state is not a thing that can easily be optimised, so it's best if we create feature requests for the queries that return the parts of the ledger state that people need.

@newhoggy
Contributor

I propose that this issue be closed and new FRs be created to track each new query.

@rdlrt

rdlrt commented Mar 14, 2023

Created new issue #4984 as requested. I did not split it further, but feel free to split it if desired.

@AndrewWestberg

It's epoch 464 and the cbor binary version of the ledger state I need to parse each epoch has reached 2.26GB

Mux - LocalStateQueryProtocol processing: 580 buffers, size: 2.26 GiB


9 participants