Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(eds/store): cache proofs after first eds recompute on sampling #2429

Merged

Conversation

walldiss
Copy link
Member

@walldiss walldiss commented Jul 5, 2023

Overview

Before this change eds was recomputed twice on sampling:

  1. inside sampling client to verify data integrity
  2. On storing EDS to storage to calculate merkle tree proof nodes.

This PR adds proofsAdder cache, that will store proof nodes for later use with key features:

  • cache could be optionally added to context and will be respected by ipld/shrex getter to populate on eds recompute.
  • cache is safe from second recompute, since it will only allow single recompute write and all consequent recomputes over same cache will be noop.

Also this PR reworks writeEDS to use use proofsAdder instead of creating new blockstore and iterating over all keys in it. This approach speeds up writeEDS even without pre collected cache.

Since there are no more need for blockstore being created, I have simplified writeEDS and removed a lot of redundant code including whole struct of writeSession.

The PR also adds benchmark that uses disk for badger store (existing benchmark uses in-memory datastore instead of badger).

This PR brings significant performance increase. On disk store badger results are:

  • before PR store.Put operation: ~1.5s
  • after PR store.Put with cached proofs: 800ms
  • after PR store.Put without cached proofs: 900ms

@walldiss walldiss self-assigned this Jul 5, 2023
@walldiss walldiss added kind:feat Attached to feature PRs area:storage labels Jul 5, 2023
@walldiss walldiss changed the base branch from main to storage_performance_improvements July 5, 2023 11:53
@walldiss walldiss changed the title feat(share/store): cache proofs after first eds recompute on sampling feat(eds/store): cache proofs after first eds recompute on sampling Jul 5, 2023
@walldiss
Copy link
Member Author

walldiss commented Jul 7, 2023

Tests are broken because of rsmt2d bug. There is an open PR to fix the bug: celestiaorg/rsmt2d#236

@Wondertan
Copy link
Member

Does this solution works with bridge?

@Wondertan
Copy link
Member

Needs support for bridges by passing nmt options through app's block extension code

@walldiss walldiss changed the base branch from storage_performance_improvements to v0.11.0-rc9 August 1, 2023 12:27
@walldiss
Copy link
Member Author

walldiss commented Aug 2, 2023

Added support for bridges and also made it to not pass proofs inside context when it is possible to.

distractedm1nd
distractedm1nd previously approved these changes Aug 2, 2023
Copy link
Collaborator

@distractedm1nd distractedm1nd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful

core/eds.go Outdated Show resolved Hide resolved
share/eds/eds.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Aug 4, 2023

Codecov Report

Merging #2429 (0dc1c2a) into release-11-rc9 (2ed901c) will increase coverage by 0.18%.
The diff coverage is 52.75%.

@@                Coverage Diff                 @@
##           release-11-rc9    #2429      +/-   ##
==================================================
+ Coverage           51.75%   51.94%   +0.18%     
==================================================
  Files                 156      156              
  Lines               10010    10039      +29     
==================================================
+ Hits                 5181     5215      +34     
+ Misses               4367     4366       -1     
+ Partials              462      458       -4     
Files Changed Coverage Δ
share/ipld/nmt_adder.go 52.74% <23.80%> (-6.63%) ⬇️
core/exchange.go 34.61% <50.00%> (+0.61%) ⬆️
share/eds/retriever.go 81.35% <57.14%> (-2.37%) ⬇️
core/eds.go 65.51% <62.50%> (-7.82%) ⬇️
share/eds/eds.go 63.88% <69.23%> (+0.16%) ⬆️
core/listener.go 52.21% <100.00%> (+0.86%) ⬆️
share/getters/tee.go 37.83% <100.00%> (+1.72%) ⬆️

... and 4 files with indirect coverage changes

@walldiss walldiss changed the base branch from v0.11.0-rc9 to release-11-rc9 August 4, 2023 14:30
@walldiss walldiss force-pushed the recompute_eds_once_on_sampling branch 2 times, most recently from e16f534 to df1a729 Compare August 4, 2023 15:16
@walldiss walldiss force-pushed the recompute_eds_once_on_sampling branch from df1a729 to 0dc1c2a Compare August 4, 2023 15:26
Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only worry I have is how the surface area of "square construction" has increased once again inside of core package which we tried to decrease with previous changes like using app.ExtendBlock.

Hopefully we can consolidate it again in the future.

share/ipld/nmt_adder.go Show resolved Hide resolved
share/ipld/nmt_adder.go Outdated Show resolved Hide resolved
@walldiss walldiss merged commit 6546252 into celestiaorg:release-11-rc9 Aug 7, 2023
12 of 14 checks passed
renaynay pushed a commit to renaynay/celestia-node that referenced this pull request Aug 23, 2023
…elestiaorg#2429)

## Overview

Before this change eds was recomputed twice on sampling:
1. inside sampling client to verify data integrity
 2. On storing EDS to storage to calculate merkle tree proof nodes.

This PR adds `proofsAdder` cache, that will store proof nodes for later
use with key features:
- cache could be optionally added to context and will be respected by
ipld/shrex getter to populate on eds recompute.
- cache is safe from second recompute, since it will only allow single
recompute write and all consequent recomputes over same cache will be
noop.
 
Also this PR reworks writeEDS to use use `proofsAdder` instead of
creating new blockstore and iterating over all keys in it. This approach
speeds up writeEDS even without pre collected cache.
 
Since there are no more need for blockstore being created, I have
simplified writeEDS and removed a lot of redundant code including whole
struct of `writeSession`.
 
The PR also adds benchmark that uses disk for badger store (existing
benchmark uses in-memory datastore instead of badger).
 
This PR brings significant performance increase. On disk store badger
results are:
  - before PR store.Put operation: **~1.5s**
  - after PR store.Put with cached proofs: **800ms** 
  - after PR store.Put without cached proofs: **900ms**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:storage kind:feat Attached to feature PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants