Implement Fast Launch #142
Conversation
Code as supplied by ZULUPooL
cACK
Amazing work @cdonnachie! Thanks for all your contributions! A fix for this problem has been a long time coming.
Credit here goes to @ZULUPooL; he did the coding and I just created the PR.
With this PR applied: without (develop at 5e756e2): am I missing something here?
This corrects testnet load time more than mainnet. With this PR applied on testnet: without (develop at af42429): the current code base took 54 hours to load testnet; with this PR, ~2 minutes.
Can we get a few more eyeballs on this? It would be great to get it reviewed.
I agree with the concept of this PR, but need one of @digicontributer , @JaredTate or @SmartArray to also have a look.
cACK
Thanks for the review @gto90! Any thoughts @JaredTate, @digicontributer or @SmartArray?
Testnet loads very slowly. Hope someone can take a look at this PR.
One solution would be to check out the source branch for this PR (142) and test it. If you can reply in this PR with your findings, that would be great.
I have been running this for over a month on a Raspberry Pi 4 8GB. Before this fix, running a testnet node on the Pi was impossible. It never finished syncing because it took more than 15 seconds to verify each historic block, meaning it was forever playing catch-up. This has completely solved the issue. Starting up a testnet node now takes a matter of minutes, and it syncs to 100% within a few hours. The difference is night and day. And it helps make the testnet network considerably more usable again for developers. @SmartArray @digicontributer @JaredTate Could you please help by reviewing this? It would be great to get this merged so we can release RC3.
We would still need testing based on Windows and Mac. Can anyone help with that?
@@ -239,6 +239,10 @@ unsigned int GetNextWorkRequiredV4(const CBlockIndex* pindexLast, const Consensu
{
    bnNew *= (100 + params.nLocalTargetAdjustment);
    bnNew /= 100;
    if (i % 16 == 0 && bnNew > UintToArith256(params.powLimit)) {
Should add some comments regarding this logic and the mod 16. This is consensus code.
For testnet this loop could run for thousands of iterations. Since bnNew keeps being incremented by params.nLocalTargetAdjustment, its value may become greater than params.powLimit, which the code after the loop already checks for, setting it to params.powLimit on line 249. So we may as well break out of the loop at the point bnNew becomes greater than params.powLimit. It will not break consensus, as it does not alter what will ultimately be determined as the bnNew value.
if (bnNew > UintToArith256(params.powLimit))
{
bnNew = UintToArith256(params.powLimit);
}
As for the i % 16, it limits this check to every 16 iterations, to reduce the CPU cost of comparing bnNew > powLimit on every single iteration.
Modulo check (i % 16 == 0): this part of the condition checks whether the loop's iterator i is a multiple of 16. It essentially acts as a periodic trigger within the loop, activating the subsequent code block every 16 iterations or blocks.
Difficulty check (bnNew > UintToArith256(params.powLimit)): this condition checks whether the current calculated difficulty (bnNew) is greater than the maximum difficulty (UintToArith256(params.powLimit)).
When both conditions are met (every 16 iterations, and only if the calculated difficulty exceeds the maximum limit), the code block following this line executes.
So is this happening every 16 blocks total, or every 16 blocks per algo? It's per algo, correct? Are we potentially opening up the vulnerability of an unchecked 15-block reorg? Just my threat brain asking what-if questions.
26 seconds to load, normally more than 24 hours, on testnet. IMG_0074.mov
@Jongjan88 Thanks for doing this. Can you attach this video on PR 140 too so we know it applies there as well?
OFC
Looks like we need testing on Mac.
cACK (tested but did not review)
When I tested RC2 when it initially came out, it would take 30+ hours to compile and synch, and worse another 30+ hours when shutting down and restarting. When testing the current develop branch without the changes in PR #140 and PR #142, it would take at least 14 hours to completely synch and same if shutdown and restarted. When adding the files from PR #140 and PR #142 it took 25 minutes for the initial synch to fully complete. Although I am not getting the 2-5 minute synch times that others are seeing, the dramatic improvement to full launch is almost unbelievable. Shutting down and restarting would only take 1 minute to resynch as well. This seems like a significant improvement, especially for testnet. Hoping this gets thoroughly reviewed as it would be a major milestone for testnet if implemented.
Was anyone able to test this on the Mac?
I am not sure about my comments. I guess I am right, but I might be wrong; I just saw the code for the first time and do not understand a lot of it yet. This is just what I noticed in half an hour of looking at the changes.
@@ -343,6 +347,25 @@ const CBlockIndex* GetLastBlockIndexForAlgo(const CBlockIndex* pindex, const Con
    return nullptr;
}

const CBlockIndex* GetLastBlockIndexForAlgoFast(const CBlockIndex* pindex, const Consensus::Params& params, int algo)
{
    for (; pindex; pindex = pindex->lastAlgoBlocks[algo])
If the algo is not the same as in pindex, it returns a null pointer; line 354 would then, I guess, hit a null pointer dereference.
if (pindexNew->pprev) {
    for (unsigned i = 0; i < NUM_ALGOS_IMPL; i++)
        pindexNew->lastAlgoBlocks[i] = pindexNew->pprev->lastAlgoBlocks[i];
This loop looks useless: if "i" is not the GetAlgo() value, these are only null pointer assignments.
And by the way, it's not best practice to populate an array field of another instance from a foreign method.
if (pindex->pprev) {
    for (unsigned i = 0; i < NUM_ALGOS_IMPL; i++)
        pindex->lastAlgoBlocks[i] = pindex->pprev->lastAlgoBlocks[i];
Also useless: only null pointers end up in the array for the other algos.
@otherdeniz We don't claim to have perfect code for tweaking other people's software. We came up with this quick solution ourselves, we used it for our own needs, and it works for us. We shared it with developers. Developers can, if they wish, understand the idea and rewrite this code, taking into account their understanding of program structure and acceptable methods.
First off, thank you for submitting this PR and taking the time to make this improvement! And also thank you to everyone taking the time to review it. I am all for speeding things up, especially the core wallet loading time. My fear is that we don't want to somehow reduce security and increase attack vectors, so thanks again for all the eyes on this.

So if I understand this correctly, the new GetLastBlockIndexForAlgoFast function replaces GetLastBlockIndexForAlgo. If I see this correctly, the main difference between GetLastBlockIndexForAlgoFast and the original function is the use of the lastAlgoBlocks array, which stores the last block index for each algorithm, thus allowing the new function to jump directly to the previous block of the same algorithm without checking every block from every other algo. In contrast, the old function would check the blocks of all algos, not just the individual algo, wasting a lot of CPU power and slowing down startup.

My main question with all this is this line of code. If I understand it correctly: the modulo check (i % 16 == 0) checks whether the loop's iterator i is a multiple of 16, essentially acting as a periodic trigger that activates the subsequent code block every 16 iterations, or 16 blocks? Does this mean we only get a startup diff check every 16 blocks as it loops through for each algo? I need to back-trace this logic a bit more before I understand it fully. The difficulty check (bnNew > UintToArith256(params.powLimit)) checks whether the current calculated difficulty (bnNew) is greater than the blockchain's predefined maximum difficulty (UintToArith256(params.powLimit)).

My fear here is: are we opening ourselves up to a short-term 51% attack somehow by allowing a shallow 15-block reorg that is never checked on a client startup? Granted this is a launch speed improvement, but I would love to verify this is not going to be an issue. Has anyone test-mined all 5 algos with these changes?
Also @otherdeniz brought up great points for the code below about "only null ptr in the array for the other algos".
|
Has anyone test mined this? @cdonnachie @Jongjan88 @ZULUPooL @gto90
Thank you for this @JaredTate, I am also going to dive back in and take a deeper look over the weekend.
Upon working with @JaredTate I believe this needs a deeper review.
Correct me if I am wrong, but doesn't this fast start code only run when using testnet? It doesn't really benefit mainnet as far as I understand, and was never meant to. What it does do is make it possible to actually start up a testnet node in a reasonable time.
Great question. The changes as made, especially the change to GetLastBlockIndexForAlgoFast in pow.cpp inside GetNextWorkRequiredV4, will most definitely apply to main net as I am reading the code. If the intent is for this only to be applicable to testnet, then this needs to be written differently.
Does anybody here have the ability to mine on testnet? If so, would you mind doing it, even if only temporarily? cc @cdonnachie @jchappelow I have tried to do it but am receiving errors, though I am probably doing something wrong. @Jongjan88 and I are seeing some strange behaviour on the testnet. We think it might be because no one has mined for some time.
I've started testnet mining, everything works so far. |
After compiling and giving it a few test runs, this code cuts the startup load time basically in half for my DGB mainnet client. I am on an 8-core Intel i9 with 32 GB RAM. The loading estimate percentage is nice to see as well; seems accurate. The load time was cut for me from 7 min 40 seconds to 3 min 30 seconds on average. Doing more of a deep dive and back-tracing GetLastBlockIndexForAlgoFast fully, I am no longer worried about a problem as I was in my previous post. GetLastBlockIndexForAlgoFast <- GetNextWorkRequiredV4 <- GetNextWorkRequired <- GetBlockProof <- LoadBlockIndex. The changes in GetNextWorkRequiredV4 had me concerned, more specifically the per-algo retarget section "if (i % 16 == 0 && bnNew":
Since GetNextWorkRequiredV4 is such a critical part of DGB, it's important to break down exactly what it's doing. ChatGPT provides a great summary of the GetNextWorkRequiredV4 function, covering:
- Function signature
- Initial steps: finding the first block in the averaging interval; finding the previous block for the algorithm
- Difficulty adjustment: limit adjustment step; global retarget
- Per-algorithm retarget
@JaredTate Thank you for taking another look at this. You should also test it on testnet, as this is where the larger benefit lies, and is the reason the PR was made in the first place. I suggest you try syncing 7.17.3 from scratch and then this. Big difference.
The reason testnet sees such speedier load times can be explained largely by the changes in the function GetLastBlockIndexForAlgo. The codebase now has two distinct C++ functions, GetLastBlockIndexForAlgo and GetLastBlockIndexForAlgoFast, which differ in three main ways:
1. Traversal method: the original walks back one block at a time through blocks of every algorithm, while the fast version follows the lastAlgoBlocks pointers and jumps directly to the previous block of the same algorithm.
2. Handling minimum difficulty blocks: both functions manage special minimum difficulty blocks on testnet, but with slightly different approaches.
3. Performance: by skipping blocks of other algorithms entirely, the fast version does far less work per lookup, which is where the bulk of the speedup comes from.
While both functions aim to find the last block index for a given mining algorithm in the DigiByte blockchain, the fast variant does so with far fewer block visits. Definitely a great improvement, but we should mine some real-world blocks with this on main net.
Thanks for all the analysis @JaredTate. We definitely need folks to do mining on main net and test net. |
ACK. Hearing reports this has been mined on main-net. Would be great to get this out in an RC3 for more testing.
Thank you, @JaredTate, for doing a live pair review of this code. After stepping through it with you in real-time, I am comfortable that this code is sound and my original concerns have been addressed.
ACK
#130