ingest/ledgerbackend: Debug slow ingestion during catchup in captive-core #3132
Comments
I wrote a small script based on
I think we should increase the buffer size but it's not trivial because there are two constraints that must be met:
This is because some ledgers can exceed the allowed size, so having only a memory constraint would result in a deadlock. I'll try to code a solution for this next week.
…ore backend (#3187)

This commit introduces `bufferedLedgerMetaReader`, which decouples buffering and unmarshaling from `stellarCoreRunner` and `CaptiveStellarCore`. `bufferedLedgerMetaReader` fixes multiple issues:

* It fixes #3132 by increasing the internal buffers' sizes to hold more ledgers. This makes the catchup code much faster.
* It fixes #3158 - `bufferedLedgerMetaReader` allowed rewriting the shutdown code into a much simpler version. Now `bufferedLedgerMetaReader` and `CaptiveStellarCore` listen to a single shutdown signal: `stellarCoreRunner.getProcessExitChan()`. When the Stellar-Core process terminates, the `bufferedLedgerMetaReader.Start` goroutine stops and `CaptiveStellarCore` returns a user-friendly error from the `PrepareRange` and `GetLedger` methods. When `CaptiveStellarCore.Close()` is called, it kills the Stellar-Core process, triggering the shutdown code explained above.
* It decouples buffering and unmarshaling into a single struct. This makes `stellarCoreRunner` and `CaptiveStellarCore` simpler.
* It fixes a possible OOM issue when the network closes a series of large ledgers. In that case `bufferedLedgerMetaReader` waits for the buffer to be consumed before reading more ledgers into memory, which prevents increased memory usage.
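
For readers following along, here is a minimal sketch of a buffer that is bounded in two ways at once. It is not the actual `bufferedLedgerMetaReader` code. It assumes the two limits are an item count and a total byte size, and every name in it (`boundedBuffer`, `canAccept`, `Push`, `Pop`) is hypothetical. The point it illustrates is the deadlock concern from the comment above: with a byte limit alone, a single item larger than the limit could never be admitted, so this sketch always lets an empty buffer accept the next item.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedBuffer is a hypothetical FIFO limited by both item count and total
// bytes. It only illustrates the idea; it is not the Horizon implementation.
type boundedBuffer struct {
	mu       sync.Mutex
	cond     *sync.Cond
	items    [][]byte
	bytes    int
	maxItems int
	maxBytes int
}

func newBoundedBuffer(maxItems, maxBytes int) *boundedBuffer {
	b := &boundedBuffer{maxItems: maxItems, maxBytes: maxBytes}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// canAccept reports whether an item of the given size may be added now.
// An empty buffer always accepts, so an item larger than maxBytes cannot
// block the producer forever. Callers must hold b.mu.
func (b *boundedBuffer) canAccept(size int) bool {
	if len(b.items) == 0 {
		return true
	}
	return len(b.items) < b.maxItems && b.bytes+size <= b.maxBytes
}

// Push blocks until the item fits within both limits, then appends it.
func (b *boundedBuffer) Push(item []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for !b.canAccept(len(item)) {
		b.cond.Wait()
	}
	b.items = append(b.items, item)
	b.bytes += len(item)
	b.cond.Broadcast()
}

// Pop blocks until an item is available, then removes and returns the oldest.
func (b *boundedBuffer) Pop() []byte {
	b.mu.Lock()
	defer b.mu.Unlock()
	for len(b.items) == 0 {
		b.cond.Wait()
	}
	item := b.items[0]
	b.items = b.items[1:]
	b.bytes -= len(item)
	b.cond.Broadcast()
	return item
}

func main() {
	buf := newBoundedBuffer(3, 16) // at most 3 items or 16 bytes buffered
	sizes := []int{4, 4, 64, 4}    // the third item exceeds the byte limit

	go func() {
		for _, n := range sizes {
			buf.Push(make([]byte, n))
		}
	}()

	for range sizes {
		fmt.Println("consumed bytes:", len(buf.Pop()))
	}
}
```

In this sketch the producer stalls on the 64-byte item until the consumer has drained the buffer, which mirrors the "wait for a buffer to be consumed first" behaviour described in the commit message. A larger `maxItems` lets more small ledgers queue up during catchup without raising the memory ceiling.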
What version are you using?
1.10.0
What did you do?
Start Horizon + Captive Core from an empty DB. Wait for state ingestion. It should then catch up from the checkpoint ledger to the latest ledger.
What did you expect to see?
Ingestion is fast.
What did you see instead?
Every 4-5 ledgers, ingestion is slower, even when the number of ops/changes is comparable to the previous ledgers. See my catchup from earlier today:
Ledger 1228494 was ingested in 0.04s, but 1228495, with almost the same number of ops/changes, took more than 1s!
I think this may be related to the buffering of ledgers loaded in the captive core backend. I thought that had been fixed in #2763, but apparently it can still be very slow.