blockchain: Convert to full block index in mem. #1229
Force-pushed from f6da30e to ac4c088.
Rebased for latest changes to |
Tested this PR yesterday on testnet. Re-synced to the current block as expected, didn't encounter any errors. I also verified the expected tip via https://testnet.dcrdata.org/ |
Question: how does this affect memory usage? The description mentions several times that data is always in memory... Faster load times are nice, but I hope they don't come at the expense of too much memory usage, as I'd question how scalable that becomes as the blockchain grows over time. |
From the PR description:
In terms of concrete usage, I've been running a node with it since slightly before I created the PR and it is using roughly 330MB. |
After running this PR for a couple days: Mainnet:
Testnet2:
|
As a point of comparison, a couple of long running nodes on
Running with this PR, for obviously a much shorter period (~3 days), on a couple of other nodes (no incoming connections on these nodes):
|
Both on testnet, running the compiled binary, both after a few hours running (
|
t'ok testnet2 miner
great work!
Force-pushed from a9e005d to 380befb.
This reworks the block index code such that it loads all of the headers in the main chain at startup and constructs the full block index accordingly.

Since the full index from the current best tip all the way back to the genesis block is now guaranteed to be in memory, this also removes all code related to dynamically loading the nodes and updates some of the logic to take advantage of the fact traversing the block index can no longer potentially fail. There are also many more optimizations and simplifications that can be made in the future as a result of this.

Due to removing all of the extra overhead of tracking the dynamic state, and ensuring the block node structs are aligned to eliminate extra padding, the end result of a fully populated block index now takes quite a bit less memory than the previous dynamically loaded version.

It also speeds up the initial startup process by roughly 2x since it is faster to bulk load the nodes in order as opposed to dynamically loading only the nodes near the tip in backwards order.

For example, here is some startup timing information before and after this commit on a node that contains roughly 238,000 blocks:

7200 RPM HDD:
-------------
Startup time before this commit: ~7.71s
Startup time after this commit: ~3.47s

SSD:
----
Startup time before this commit: ~6.34s
Startup time after this commit: ~3.51s

Some additional benefits are:

- Since every block node is in memory, the code which reconstructs headers from block nodes means that all headers can always be served from memory which will be important since the network will be moving to header-based semantics
- Several of the error paths can be removed since they are no longer necessary
- It is no longer expensive to calculate CSV sequence locks or median times of blocks way in the past
- It is much less expensive to calculate the initial states for the various intervals such as the stake and voter version
- It will be possible to create much more efficient iteration and simplified views of the overall index

An overview of the logic changes are as follows:

- Move AncestorNode from blockIndex to blockNode and greatly simplify since it no longer has to deal with the possibility of dynamically loading nodes and related failures
- Replace nodeAtHeightFromTopNode from BlockChain with RelativeAncestor on blockNode and define it in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no longer necessary test for nil
- Remove findNode and replace all of its uses with direct queries of the block index
- Remove blockExists and replace all of its uses with direct queries of the block index
- Remove all functions and fields related to dynamically loading nodes
  - children and parentHash fields from blockNode
  - depNodes from blockIndex
  - loadBlockNode from blockIndex
  - PrevNodeFromBlock from blockIndex
  - {p,P}revNodeFromNode from blockIndex
  - RemoveNode
- Replace all instances of iterating backwards through nodes to directly access the parent now that nodes don't potentially need to be dynamically loaded
- Introduce a lookupNode function on blockIndex which allows the initialization code to locklessly query the index
- No longer take the chain lock when only access to the block index, which has its own lock, is needed
- Removed the error paths from several functions that can no longer fail
  - getReorganizeNodes
  - findPrevTestNetDifficulty
  - sumPurchasedTickets
  - findStakeVersionPriorNode
- Removed all error paths related to node iteration that can no longer fail
- Modify FetchUtxoView to return an empty view for the genesis block
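The split between a lockless lookupNode for initialization code and a locked query for everything else can be illustrated with a minimal sketch. Only the names blockIndex, blockNode, and lookupNode come from the commit message; the field layout, the string hashes, and the LookupNode, AddNode, and newBlockIndex helpers are assumptions made for illustration and are not taken from the actual change.

```go
package main

import (
	"fmt"
	"sync"
)

// blockNode is a stripped-down stand-in for a block index entry.
type blockNode struct {
	hash   string // hypothetical: a real index would key on a fixed-size hash type
	height int64
}

// blockIndex owns its own lock, so callers that only need the index do not
// have to take a broader chain lock.
type blockIndex struct {
	sync.RWMutex
	index map[string]*blockNode
}

// newBlockIndex is a hypothetical constructor for this sketch.
func newBlockIndex() *blockIndex {
	return &blockIndex{index: make(map[string]*blockNode)}
}

// lookupNode queries the index without taking the lock. It is only safe when
// the caller already holds the lock or when nothing else can touch the index
// yet, such as during single-goroutine initialization.
func (bi *blockIndex) lookupNode(hash string) *blockNode {
	return bi.index[hash]
}

// LookupNode is the concurrency-safe wrapper (hypothetical name).
func (bi *blockIndex) LookupNode(hash string) *blockNode {
	bi.RLock()
	node := bi.lookupNode(hash)
	bi.RUnlock()
	return node
}

// AddNode inserts a node under the write lock (hypothetical name).
func (bi *blockIndex) AddNode(node *blockNode) {
	bi.Lock()
	bi.index[node.hash] = node
	bi.Unlock()
}

func main() {
	bi := newBlockIndex()

	// Initialization phase: a single goroutine, so the lockless path is fine.
	bi.index["genesis"] = &blockNode{hash: "genesis", height: 0}
	fmt.Println("genesis height:", bi.lookupNode("genesis").height)

	// Normal operation: go through the locked accessors.
	bi.AddNode(&blockNode{hash: "b1", height: 1})
	fmt.Println("have b1:", bi.LookupNode("b1") != nil)
	fmt.Println("have missing:", bi.LookupNode("missing") != nil)
}
```

Because a missing entry simply yields nil, an existence check reduces to a nil comparison on the lookup result, which is one way helpers such as findNode and blockExists could become direct index queries.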
Force-pushed from 380befb to fc91d2c.
This reworks the block index code such that it loads all of the headers in the main chain at startup and constructs the full block index accordingly.
Since the full index from the current best tip all the way back to the genesis block is now guaranteed to be in memory, this also removes all code related to dynamically loading the nodes and updates some of the logic to take advantage of the fact traversing the block index can no longer potentially fail. There are also many more optimizations and simplifications that can be made in the future as a result of this.
Due to removing all of the extra overhead of tracking the dynamic state, and ensuring the block node structs are aligned to eliminate extra padding, the end result of a fully populated block index now takes quite a bit less memory than the previous dynamically loaded version.
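To illustrate how field ordering alone changes the size of a struct, here is a small sketch. The two struct types and their fields are hypothetical and are not the real blockNode layout; the point is only that ordering fields from largest to smallest lets the compiler avoid padding.

```go
package main

import (
	"fmt"
	"unsafe"
)

// paddedNode interleaves small and large fields, so the compiler has to
// insert padding to keep the wider fields aligned.
type paddedNode struct {
	inMainChain bool
	workSum     uint64
	voteBits    uint16
	timestamp   int64
	height      uint32
}

// packedNode holds the same fields ordered from largest to smallest, which
// leaves padding only at the very end of the struct.
type packedNode struct {
	workSum     uint64
	timestamp   int64
	height      uint32
	voteBits    uint16
	inMainChain bool
}

// sizes reports the in-memory size of each layout on the current platform.
func sizes() (padded, packed uintptr) {
	return unsafe.Sizeof(paddedNode{}), unsafe.Sizeof(packedNode{})
}

func main() {
	padded, packed := sizes()
	// On amd64 this reports 40 and 24 bytes; other platforms may differ.
	fmt.Printf("padded: %d bytes, packed: %d bytes\n", padded, packed)
}
```

With every block node resident in memory, a saving of a few bytes per node is multiplied by the number of blocks in the chain.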
It also speeds up the initial startup process by roughly 2x since it is faster to bulk load the nodes in order as opposed to dynamically loading only the nodes near the tip in backwards order.
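A minimal sketch of bulk loading in order, assuming the headers are already available sorted by ascending height: each node's parent was created in an earlier iteration, so linking is a single map lookup and nothing has to be fetched on demand. The header type, string hashes, and loadInOrder function are hypothetical and are not taken from the actual change.

```go
package main

import "fmt"

// header is a hypothetical, minimal view of a stored block header.
type header struct {
	hash     string
	prevHash string
	height   int64
}

// blockNode links directly to its parent, so walking back never needs a
// database read once the index is built.
type blockNode struct {
	parent *blockNode
	hash   string
	height int64
}

type blockIndex struct {
	index map[string]*blockNode
}

// loadInOrder builds the full index from main chain headers sorted by
// ascending height and returns the index together with the tip node.
func loadInOrder(headers []header) (*blockIndex, *blockNode, error) {
	bi := &blockIndex{index: make(map[string]*blockNode, len(headers))}
	var tip *blockNode
	for i, h := range headers {
		var parent *blockNode
		if i > 0 {
			// The parent was inserted on an earlier iteration.
			parent = bi.index[h.prevHash]
			if parent == nil {
				return nil, nil, fmt.Errorf("header %s at height %d has unknown parent %s",
					h.hash, h.height, h.prevHash)
			}
		}
		node := &blockNode{parent: parent, hash: h.hash, height: h.height}
		bi.index[h.hash] = node
		tip = node
	}
	return bi, tip, nil
}

func main() {
	headers := []header{
		{hash: "g", prevHash: "", height: 0},
		{hash: "a", prevHash: "g", height: 1},
		{hash: "b", prevHash: "a", height: 2},
	}
	_, tip, err := loadInOrder(headers)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	// Walk from the tip back to genesis purely through parent pointers.
	for n := tip; n != nil; n = n.parent {
		fmt.Println(n.height, n.hash)
	}
}
```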
For example, here is some startup timing information before and after this commit on a node that contains roughly 238,000 blocks:
7200 RPM HDD:
Startup time before this commit: ~7.71s
Startup time after this commit: ~3.47s
SSD:
Startup time before this commit: ~6.34s
Startup time after this commit: ~3.51s
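Some additional benefits are:
- Since every block node is in memory, the code which reconstructs headers from block nodes means that all headers can always be served from memory which will be important since the network will be moving to header-based semantics
- Several of the error paths can be removed since they are no longer necessary
- It is no longer expensive to calculate CSV sequence locks or median times of blocks way in the past
- It is much less expensive to calculate the initial states for the various intervals such as the stake and voter version
- It will be possible to create much more efficient iteration and simplified views of the overall index
An overview of the logic changes are as follows:
- Move AncestorNode from blockIndex to blockNode and greatly simplify since it no longer has to deal with the possibility of dynamically loading nodes and related failures
- Replace nodeAtHeightFromTopNode from BlockChain with RelativeAncestor on blockNode and define it in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no longer necessary test for nil
- Remove findNode and replace all of its uses with direct queries of the block index
- Remove blockExists and replace all of its uses with direct queries of the block index
- Remove all functions and fields related to dynamically loading nodes
  - children and parentHash fields from blockNode
  - depNodes from blockIndex
  - loadBlockNode from blockIndex
  - PrevNodeFromBlock from blockIndex
  - {p,P}revNodeFromNode from blockIndex
  - RemoveNode
- Replace all instances of iterating backwards through nodes to directly access the parent now that nodes don't potentially need to be dynamically loaded
- Introduce a lookupNode function on blockIndex which allows the initialization code to locklessly query the index
- No longer take the chain lock when only access to the block index, which has its own lock, is needed
- Removed the error paths from several functions that can no longer fail
  - getReorganizeNodes
  - findPrevTestNetDifficulty
  - sumPurchasedTickets
  - findStakeVersionPriorNode
- Removed all error paths related to node iteration that can no longer fail
- Modify FetchUtxoView to return an empty view for the genesis block
This is work towards #1145.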
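The ancestor-related items in the overview can be sketched as follows. With every node linked to its parent in memory, ancestor lookups and median time calculations need no error return. The method names AncestorNode, RelativeAncestor, and CalcPastMedianTime come from the description; the bodies, the fields, the buildChain helper, and the window of 11 blocks are assumptions made for illustration, not the actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// medianTimeBlocks is the number of blocks in the median window. The value
// 11 is an assumption for this sketch.
const medianTimeBlocks = 11

type blockNode struct {
	parent    *blockNode
	height    int64
	timestamp int64 // Unix seconds
}

// AncestorNode returns the ancestor at the given height by following parent
// pointers, or nil when the height is negative or above the node's height.
func (node *blockNode) AncestorNode(height int64) *blockNode {
	if height < 0 || height > node.height {
		return nil
	}
	n := node
	for n != nil && n.height != height {
		n = n.parent
	}
	return n
}

// RelativeAncestor is defined in terms of AncestorNode: it returns the node
// that is distance blocks before this one.
func (node *blockNode) RelativeAncestor(distance int64) *blockNode {
	return node.AncestorNode(node.height - distance)
}

// CalcPastMedianTime returns the median timestamp of up to medianTimeBlocks
// nodes ending at, and including, this node. The receiver must be non-nil.
func (node *blockNode) CalcPastMedianTime() int64 {
	timestamps := make([]int64, 0, medianTimeBlocks)
	for n := node; n != nil && len(timestamps) < medianTimeBlocks; n = n.parent {
		timestamps = append(timestamps, n.timestamp)
	}
	sort.Slice(timestamps, func(i, j int) bool { return timestamps[i] < timestamps[j] })
	return timestamps[len(timestamps)/2]
}

// buildChain is a hypothetical helper that creates count linked nodes with
// heights 0..count-1 and timestamps 1000, 1010, 1020, ... and returns the tip.
func buildChain(count int) *blockNode {
	var tip *blockNode
	for h := 0; h < count; h++ {
		tip = &blockNode{parent: tip, height: int64(h), timestamp: 1000 + 10*int64(h)}
	}
	return tip
}

func main() {
	tip := buildChain(15) // heights 0 through 14
	fmt.Println(tip.AncestorNode(5).height)     // 5
	fmt.Println(tip.RelativeAncestor(3).height) // 11
	fmt.Println(tip.CalcPastMedianTime())       // 1090
}
```

The traversal here is a plain linear walk; it is cheap only because every step is a pointer dereference rather than a potential database load.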