Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: rebuild HeaderExtension ~35% quicker #2948

Conversation

JosephGoulden
Copy link
Contributor

Hi. Interested to hear thoughts on whether this optimisation is enough to remove the TODOs in this PR? Is this what was anticipated? The speed improvement comes from only loading the headers from the DB once not twice.

The rebuild header extension of the main chain went from 22s to 15s on my machine when this change is made. Added a new unit test for the rebuild function as well.

@hashmap
Copy link
Contributor

hashmap commented Jul 15, 2019

Thanks for the contribution, good work! I'm not sure about TODO removal, there is an emphasis on all header hashes, perhaps it was about finding a different approach, @antiochp may have a better understanding.

@quentinlesceller quentinlesceller changed the base branch from milestone/2.x.x to master July 24, 2019 15:41
@antiochp antiochp self-requested a review July 24, 2019 15:57
Copy link
Member

@antiochp antiochp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh - So this PR cuts out the intermediate step of collecting the header hashes involved, simply maintaining the headers themselves (so we don't need to do the additional lookup of each header by hash).

This is faster short-term but I suspect it will begin to cause issues longer term as we now need to keep all headers in memory as we rebuild the header MMR (and this is more expensive than just keeping the hashes in memory).

Let me think about this a bit more - 35% improvement is maybe worth paying the price of more data in memory).

Lets keep those TODOs in there though - ideally we'd be able to do this in a way that does not require all these headers to be in memory simultaneously.

Let me think about this a bit more and get back to you on this.
There is a couple of other PRs (specifically #2918 and another WIP that I have not yet PR'd) that overlap a bit with these changes.
I'm wondering if these "rebuilds" may not be entirely necessary or could be approached differently.

@antiochp
Copy link
Member

The rebuild header extension of the main chain went from 22s to 15s on my machine when this change is made.

Out of interest - Is this with a debug build or a --release build?

@JosephGoulden
Copy link
Contributor Author

Ok I see what you mean.

I just reran the test based on latest changes from master and seeing an improvement from ~14s to ~10s in the header extension rebuild (was a release build this time.. probably just a debug last time).
Just had a quick look at the memory as well, used a max of 211MB with full headers vs 75MB with just hashes! Now I'm not so sure about merging this.
I've put the TODOs back either way.

@antiochp
Copy link
Member

We tackled the header extension rebuild in #3004. We no longer rebuild the sync MMR from scratch as we can simply take the existing sync MMR, rewind it to a common ancestor (based on the header MMR) and then apply headers to make it consistent with the header MMR.

In the common case this will be no rewind and simply a "fast forward" (using the git analogy) of headers to catch up with the header MMR.

If we do find ourselves in a situation where the sync MMR is on a different fork to the main header chain then we rewind and reapply as necessary. Either way we apply significantly fewer headers than rebuilding from scratch.

Rebuilding the sync MMR after shutting a node down overnight takes less than 1s this way.

@antiochp
Copy link
Member

Closing as #3004 supersedes this.

@antiochp antiochp closed this Aug 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants