-
Notifications
You must be signed in to change notification settings - Fork 340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: stewardship api doesn't check final leaves #4735
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just not sure how to do this efficiently without having the BMT traverser somehow informing the callback if the netGetter was used or not yet for the address being iterated. Otherwise I would have proposed a solution...
noop := func(leaf swarm.Address) error { return nil } | ||
switch err := s.netTraverser.Traverse(ctx, root, noop); { | ||
fn := func(a swarm.Address) error { | ||
_, err := s.netGetter.RetrieveChunk(ctx, a, swarm.ZeroAddress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this naive approach, but what it ends up doing is retrieving all non-leaf chunks TWICE from the swarm doubling the bandwidth cost of a /stewardship check. The first netGetter call happens while expanding the BMT for the non-leaf chunks, but ALL chunks end up doing this callback resulting in the netGetter (which bypasses the local cache) redundantly pulling most of the chunks a second time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. By my opinion, this can be a comment in the code for the improvement, as the functionality is here with minimal changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, then we might as well allow the traverser to use the normal getter method that uses the localstore and do ALL of the netgetter checks here in the callback? Not many more changes, and eliminates the redundancy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The net getter is to be able to get the chunk even if it is not locally available. It would be good to cache the chunk upon retrieval in localstore, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand the netGetter, it ONLY retrieves from the swarm, and doesn't even try the localstore (try it on a locally pinned, but stamp-expired reference, it will fail, while the /bytes and /bzz will still work). My point is that since we're going to netGet every single traversed reference in the callback, then don't make a netTraverser using the netGetter, but use a normal traverser that will use localstore and only go to the swarm if necessary. That will reduce the redundancy of hitting the swarm at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as long as it is certain that the getter based on localstore will fallback to retrieve the chunk from the network.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retrieval protocol has a singleflight mechanism so this improves efficiency of requests that are fired in succession.
Checklist
Description
Issue