Contributor

@jasagredo jasagredo commented Dec 16, 2025

LSM-trees have no easy way to tell the number of entries in a table. With this PR we keep a counter that we update as we push diffs to the tables.

This was requested by the Performance and Tracing team.

Diff.Insert{} -> True
Diff.Delete -> False
)
diffs
Contributor

You're already iterating over diffs in the loop above; consider doing these calculations there.

Contributor Author

Easier said than done. The above is inside a Vec.create loop 🤔

Contributor

You decide. If you're expecting thousands of actions, it might be worth the effort :) This is quite readable as is.

Contributor Author

The issue is that create cannot return another value alongside the vector, so we have to compute the insertion and deletion counts outside of the create loop.

@jasagredo jasagredo force-pushed the js/resurrect-utxo-count branch 2 times, most recently from a69d66a to 972437d Compare December 19, 2025 08:11
@jasagredo jasagredo moved this from 🏗 In progress to 👀 In review in Consensus Team Backlog Dec 19, 2025
@jasagredo jasagredo changed the base branch from main to js/leaks-v2 December 19, 2025 09:07
@jasagredo jasagredo force-pushed the js/resurrect-utxo-count branch from 972437d to 488eccf Compare December 19, 2025 09:22
@jasagredo jasagredo force-pushed the js/resurrect-utxo-count branch from 488eccf to c017ac5 Compare December 26, 2025 11:10
@jasagredo jasagredo mentioned this pull request Dec 26, 2025
18 tasks
Member

@dnadales dnadales left a comment

Great to see support for tracking utxo size in LSM 🙌

Some additional questions:

  • Do we need a changelog entry?
  • Would it make sense to add invariants to check that we're computing the size correctly?
  • Should we add tests to check the counting logic works as expected?
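
One way to approach the invariant and test questions above, as a minimal sketch: model the table as a plain `Data.Map`, apply a batch of diffs to it, and check that the old size plus the counter adjustment equals the new size. `Delta`, `applyDiffs`, `sizeDelta`, and `counterAgrees` are hypothetical names used for illustration only; they are not the PR's types or functions.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a per-key diff entry (not the PR's Diff type).
data Delta v = Insert v | Delete
  deriving (Eq, Show)

-- Apply a batch of diffs to an in-memory model of the table.
applyDiffs :: Ord k => Map.Map k (Delta v) -> Map.Map k v -> Map.Map k v
applyDiffs diffs table = Map.foldlWithKey' step table diffs
  where
    step t k (Insert v) = Map.insert k v t
    step t k Delete     = Map.delete k t

-- Counter adjustment computed from the diffs alone: +1 per insert, -1 per delete.
sizeDelta :: Map.Map k (Delta v) -> Int
sizeDelta = Map.foldl' step 0
  where
    step n (Insert _) = n + 1
    step n Delete     = n - 1

-- Invariant: old size plus the adjustment equals the size after applying the diffs.
counterAgrees :: Ord k => Map.Map k v -> Map.Map k (Delta v) -> Bool
counterAgrees table diffs =
  Map.size table + sizeDelta diffs == Map.size (applyDiffs diffs table)
```

Under these assumptions `counterAgrees` holds when inserts target fresh keys and deletes target keys that are present, and it fails when an insert overwrites an existing key, which is the kind of drift such a check would be meant to catch.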

, IndexedMemPack (l EmptyMK) (TxOut l)
) =>
Tracer m LedgerDBV2Trace ->
Int ->
Member

Do we want to add some comment on this parameter?

Member

Don't we want to use an unsigned type for the table size?

Tracer m LedgerDBV2Trace ->
m (ResourceKey m, LedgerTablesHandle m l)
- implDuplicate rr t tracer = do
+ implDuplicate rr sz t tracer = do
Member

Maybe we can expand this name (sz) to size or tSize.

- Tracer m LedgerDBV2Trace -> UTxOTable m -> l mk -> l DiffMK -> m ()
- implPushDiffs tracer t _ !st1 =
+ Tracer m LedgerDBV2Trace -> UTxOTable m -> StrictTVar m Int -> l mk -> l DiffMK -> m ()
+ implPushDiffs tracer t s _ !st1 =
Member

Would it make sense to use a unique name for the size variable throughout the code base?

- S.effects s'
- Just (item, s'') -> go tb (n - 1) (item : m) s''
+ (,tbsSize) <$> S.effects s'
+ Just (item, s'') -> go (tbsSize + 1) tb (n - 1) (item : m) s''
Member

I'm not entirely sure what we are computing here, or why we are incrementing and decrementing these variables. More descriptive variable names and some comments could help.


readUTxOSizeFile :: MonadThrow m => HasFS m h -> FsPath -> ExceptT (SnapshotFailure blk) m Int
readUTxOSizeFile hfs p =
fmap fst $
Member

Don't we want to check there's no trailing data?

Also we might want to add some validation that:

  • the size is non-negative
  • the file exists.
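
As a rough illustration of the trailing-data and non-negativity checks, here is a minimal pure sketch that validates the decoded file contents before they are used. It assumes the file holds only the decimal digits written by `BS.intDec`; `SizeParseError` and `parseUTxOSize` are hypothetical names, and handling a missing file is left to the surrounding `HasFS` code, which is not modelled here.

```haskell
import Data.Char (isDigit)

-- Hypothetical error type for illustration; the PR reports SnapshotFailure instead.
data SizeParseError
  = EmptyInput              -- the file had no content
  | UnexpectedInput String  -- sign, whitespace, or any other non-digit data
  | OutOfRange              -- the value does not fit in an Int
  deriving (Eq, Show)

-- Accept only a non-empty run of ASCII digits. This rejects a leading '-'
-- (so the result is non-negative) as well as any trailing bytes.
parseUTxOSize :: String -> Either SizeParseError Int
parseUTxOSize s
  | null s                          = Left EmptyInput
  | not (all isDigit s)             = Left (UnexpectedInput s)
  | n > toInteger (maxBound :: Int) = Left OutOfRange
  | otherwise                       = Right (fromInteger n)
  where
    -- Only evaluated after the guards above have established that
    -- s is a non-empty string of digits, so 'read' cannot fail.
    n = read s :: Integer
```

Parsing through `Integer` first avoids silent wrap-around if the file contains more digits than an `Int` can hold.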

Monad.void $ withFile hasFs p (WriteMode MustBeNew) $ \h ->
hPutAll hasFs h $ BS.toLazyByteString $ BS.intDec sz

readUTxOSizeFile :: MonadThrow m => HasFS m h -> FsPath -> ExceptT (SnapshotFailure blk) m Int
Member

How do we want to handle the case in which the utxoSize file does not exist for old snapshots? Do we invalidate them?

Diff.Delete -> False
)
diffs
atomically $ modifyTVar s (\x -> x + ins - dels)
Member

Couldn't we simply fold?

  let (ins, dels) = Map.foldl'
        (\(i, d) delta -> case delta of
            Diff.Insert{} -> (i+1, d)
            Diff.Delete -> (i, d+1)
        )
        (0, 0)
        diffs
  atomically $ modifyTVar s (\x -> x + ins - dels)
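
A self-contained variant of the fold above, as a sketch only: it uses a strict pair so that `foldl'` forces both counters at every step instead of building thunks inside a lazy tuple. `Delta`, `Counts`, `countDiffs`, and `netChange` are hypothetical names standing in for the PR's types, not part of its API.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a per-key diff entry (not the PR's Diff type).
data Delta v = Insert v | Delete
  deriving (Eq, Show)

-- Strict fields: both counters are evaluated whenever a Counts value is forced.
data Counts = Counts !Int !Int  -- insertions, deletions
  deriving (Eq, Show)

-- Single pass over the diffs, counting insertions and deletions.
countDiffs :: Map.Map k (Delta v) -> Counts
countDiffs = Map.foldl' step (Counts 0 0)
  where
    step (Counts i d) (Insert _) = Counts (i + 1) d
    step (Counts i d) Delete     = Counts i (d + 1)

-- Net adjustment to add to the stored size counter.
netChange :: Map.Map k (Delta v) -> Int
netChange diffs = let Counts i d = countDiffs diffs in i - d
```

The result of `netChange` would then be added to the counter in a single `modifyTVar`, as in the suggestion above.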
