Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange space leak in PBFT.ChainState #741

Open
mrBliss opened this issue Dec 16, 2019 · 1 comment
Open

Strange space leak in PBFT.ChainState #741

mrBliss opened this issue Dec 16, 2019 · 1 comment

Comments

@mrBliss
Copy link
Contributor

mrBliss commented Dec 16, 2019

It has been reported tthat IntersectMBO/ouroboros-network@f2a050b introduced a space leak. If we remove the following line, the space leak is gone: https://github.com/input-output-hk/ouroboros-network/blob/f2a050ba9ada3bf3ee2421f5e610947619d28337/ouroboros-consensus/src/Ouroboros/Consensus/Protocol/PBFT/ChainState.hs#L356
Note that the space leak has nothing to do with EBBs themselves! Even with an empty ebbs, we still have the space leak.

Looking at the implementation of pruneEBBsLT: https://github.com/input-output-hk/ouroboros-network/blob/f2a050ba9ada3bf3ee2421f5e610947619d28337/ouroboros-consensus/src/Ouroboros/Consensus/Protocol/PBFT/ChainState.hs#L624
No matter how I try to force anchorSlot cs (bangs, seq, pass it in as an argument, ...), it causes a thunk in the StrictSeq fields. Even if I pattern match on it, ignore it (!) and do Map.map id instead of filtering. Strangely, cs { ebbs = ebbs } doesn't have the leak.

If we add {-# OPTIONS_GHC -O0 #-} to the module, the leak goes away. The leak is there with -O1 and -O2 (-fno-full-laziness doesn't help). This makes me think the leak is caused by a bug in GHC 😱.

You can use the following snippet to quickly detect the leak:

main :: IO ()
main = do
    varCS <- Strict.newTVarWithInvariantM unsafeNoThunks CS.empty
    let go slot@(SlotNo s) = do
          when (s `mod` 1000 == 0) $ print s
          atomically $ modifyTVar varCS $
            CS.append securityParam windowSize (signerForSlot slot)
          go (succ slot)
    go 0
  where
    securityParam = SecurityParam 2
    windowSize = CS.WindowSize 2
    signerForSlot :: SlotNo -> CS.PBftSigner PBftMockCrypto
    signerForSlot slot@(SlotNo s) = CS.PBftSigner
      { pbftSignerSlotNo     = slot
      , pbftSignerGenesisKey = Crypto.VerKeyMockDSIGN (fromIntegral s `mod` 7)
      }

The thunk detection (unsafeNoThunks) will catch the thunk. Profiling the heap confirms that thunk detection is correct.

@mrBliss mrBliss added the technical debt Technical debt label Dec 16, 2019
mrBliss referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
mrBliss referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
mrBliss referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
mrBliss referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
mrBliss referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
iohk-bors bot referenced this issue in IntersectMBO/ouroboros-network Dec 16, 2019
1357: Work around a strange space leak in PBFT.ChainState r=mrBliss a=mrBliss

Includes a workaround for #1356.

Additionally, squash all thunks introduced in the last month(s), so that the tests pass again when the `checktvarinvariant` flag is enabled for the `io-sim-classes` package. This boils down to replacing tuples with records with strict fields and forcing the elements of non-empty list to WHNF.

Co-authored-by: Thomas Winant <thomas@well-typed.com>
@mrBliss
Copy link
Contributor Author

mrBliss commented Jan 31, 2020

More info from a past investigation: I have narrowed it down to GHC's -fstrictness flag. The leak went away with -O1 -fno-strictness.

Nosing through recent GHC tickets, I found https://gitlab.haskell.org/ghc/ghc/issues/16197, which might perhaps be related. One of the comments on that ticket says:

I can reproduce this with 8.6, but not with HEAD, so I think perhaps it is fixed. Hooray.
Indeed we have made some recent changes to the handling of strictness on data constructors:
[..]

So let's hope that those changes indeed fixed the bug 🤞. We'll only now for sure when upgrading to a newer GHC version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants