Merge #2532
2532: Fix laziness while streaming blocks r=mrBliss a=kderme

We noticed a big space leak: GHC profiling indicated a peak of >2GB of PINNED memory, while ps/top reported >4GB of memory used.

![img1](https://user-images.githubusercontent.com/11467473/90975548-d951bd80-e53d-11ea-96c5-4809c0cee53e.png)


Checking the `LedgerDB` with `noUnexpectedThunks` reported this:

```
"ExtLedgerState (HardForkBlock (': * ByronBlock (': * (ShelleyBlock TPraosStandardCrypto) ('[] *))))","LedgerDB"], 
unexpectedThunkCallStack = [("$dmnoUnexpectedThunks",SrcLoc {srcLocPackage = "ouroboros-consensus-0.1.0.0-inplace", 
srcLocModule = "Ouroboros.Consensus.Ledger.Extended", srcLocFile = "src/Ouroboros/Consensus/Ledger/Extended.hs", 
srcLocStartLine = 76, srcLocStartCol = 10, srcLocEndLine = 76, srcLocEndCol = 79}),("noUnexpectedThunks",SrcLoc 
{srcLocPackage = "cardano-prelude-0.1.0.0-inplace", srcLocModule = "Cardano.Prelude.GHC.Heap.NormalForm.Classy", 
srcLocFile = "src/Cardano/Prelude/GHC/Heap/NormalForm/Classy.hs", srcLocStartLine = 424, srcLocStartCol = 44, 
srcLocEndLine = 424, srcLocEndCol = 70})], unexpectedThunkClosure = Just "ThunkClosure {info = StgInfoTable {entry = 
Nothing, ptrs = 4, nptrs = 0, tipe = THUNK, srtlen = 81781936, code = Nothing}, ptrArgs = 
[0x000000420daeea98,0x000000420daeeb10,0x00000042119f8c68,0x000000420250cb38], dataArgs = []}"}

```

The leak appears when we open the ChainDB and stream blocks from the Immutable DB to the Ledger DB for validation. I noticed that the older the ledger snapshot we start from, the bigger the memory peak.

What actually happens is that, because of laziness, validation is effectively split into two phases:
- read every block from the DB and perform all other effectful actions such as logging, which leads to the big memory peak;
- apply each block to the ledger DB, which is a pure computation.

Only once the ledger DB application starts is the GC free to collect the blocks, and the PINNED memory starts to decrease after the peak. This is more apparent in this graph:

![img2](https://user-images.githubusercontent.com/11467473/90975697-4d409580-e53f-11ea-9952-7ebc06fb1595.png)

`compute`, which performs all the ledger transitions, only fires once all the PINNED memory has been allocated.
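The two-phase behaviour described above can be reproduced in miniature. The following is only a sketch using `base`, not the consensus code: `Block`, `Ledger`, `applyBlock`, `readBlock`, `replayLazy` and `replayStrict` are hypothetical stand-ins invented for illustration. In the lazy variant each step returns an unevaluated `applyBlock b l`, so every block read so far stays reachable until the final result is demanded. The strict variant forces each transition as soon as its block has been read.

```haskell
import Control.Monad (foldM)

-- Hypothetical stand-in for a block read from the immutable DB.
newtype Block = Block [Int]

-- Hypothetical stand-in for a ledger state; the strict field means that
-- evaluating a 'Ledger' to WHNF also evaluates its total.
data Ledger = Ledger { ledgerTotal :: !Int }

-- Pure ledger transition (stand-in for applying a block to the ledger).
applyBlock :: Block -> Ledger -> Ledger
applyBlock (Block xs) (Ledger t) = Ledger (t + sum xs)

-- Stand-in for the effectful read of block number n.
readBlock :: Int -> IO Block
readBlock n = pure (Block [n, n + 1, n + 2])

-- Lazy replay: 'pure' does not evaluate its argument, so the accumulator
-- becomes a chain of suspended 'applyBlock' calls, each holding its block.
replayLazy :: Ledger -> [Int] -> IO Ledger
replayLazy = foldM step
  where
    step l n = do
      b <- readBlock n
      pure (applyBlock b l)

-- Strict replay: '$!' evaluates the new ledger state to WHNF before the
-- next block is read, so the block just applied becomes garbage at once.
replayStrict :: Ledger -> [Int] -> IO Ledger
replayStrict = foldM step
  where
    step l n = do
      b <- readBlock n
      pure $! applyBlock b l
```

Both functions compute the same final ledger; they differ only in when the transitions run and therefore in how long blocks are retained. Whether the lazy variant actually leaks in a given build can depend on the optimisation level, so heap profiling remains the way to confirm it.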

The fix forces the ledger transitions to happen strictly. The new memory result:
![img3](https://user-images.githubusercontent.com/11467473/90975761-c5a75680-e53f-11ea-9c41-4844e8b2f4b1.png)

The PINNED memory is quickly garbage collected and never goes above 12MB:
![img4](https://user-images.githubusercontent.com/11467473/90975812-2fbffb80-e540-11ea-9e0c-e7aa99ef93c1.png)

I'll come back with some more details on this.

Co-authored-by: kderme <k.dermenz@gmail.com>
iohk-bors[bot] and kderme authored Aug 24, 2020
2 parents f15a5ab + 6aba584 commit 50955c0
Showing 1 changed file with 5 additions and 0 deletions.
@@ -612,6 +612,11 @@ prune db@LedgerDB{..} =
toPrune :: Int
toPrune = ledgerDbCountToPrune ledgerDbParams (Seq.length ledgerDbBlocks)

-- NOTE: we must inline 'prune' otherwise we get unexplained thunks in
-- 'LedgerDB' and thus a space leak. Alternatively, we could disable the
-- @-fstrictness@ optimisation (enabled by default for -O1). See #2532.
{-# INLINE prune #-}

-- | Push an updated ledger state
pushLedgerState :: l -- ^ Updated ledger state
-> r -- ^ Reference to the applied block