Fixes for staking code and thread shutdown #505

dimxy · 2021-09-23T18:41:10Z

fixes for this issue #478

jmjatlanta

Code looks good. Comments below are just minor things and simply opinions.

jmjatlanta · 2021-09-24T19:56:53Z

src/komodo_bitcoind.cpp

    segid32 = komodo_stakehash(&hash,address,hashbuf,txid,vout);
-    if ( *numkp >= *maxkp )
+    if ( array.size() >= *maxkp )
    {
        *maxkp += 1000;


I can't see the reason for maxkp. I'm thinking it is to control allocations. If the desired functionality is to grow in chunks of 1000 elements, why not use array.size()?

Okay, thank you. maxkp can be eliminated, and array.size() and array.capacity() used instead.
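A minimal sketch of the idea agreed on above, assuming the staking array becomes a std::vector. `StakingEntry` and `append_chunked` are hypothetical names used for illustration only; the real element type is komodo_staking and the actual change in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real komodo_staking record.
struct StakingEntry {
    uint32_t vout = 0;
    int64_t nValue = 0;
};

// Append one entry, growing the capacity in fixed chunks so that a
// reallocation happens at most once per `chunk` insertions. The vector
// itself tracks size and capacity, so no separate maxkp counter is needed.
void append_chunked(std::vector<StakingEntry> &array,
                    const StakingEntry &entry,
                    std::size_t chunk = 1000)
{
    if (array.size() >= array.capacity())
        array.reserve(array.capacity() + chunk);
    array.push_back(entry);
}
```

Calling `append_chunked(array, entry)` in place of the manual realloc keeps the 1000-element growth step while leaving the bookkeeping to the container.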

jmjatlanta · 2021-09-24T19:59:49Z

src/komodo_bitcoind.cpp

-    kp->nValue = nValue;
-    kp->scriptPubKey = pk;
-    return(array);
+        //array = (struct komodo_staking *)realloc(array,sizeof(*array) * (*maxkp));


Comments like this make the code difficult to follow. If you remove these lines before committing you lose nothing, as features within git make historical research easy.

jmjatlanta · 2021-09-24T20:02:12Z

src/komodo_bitcoind.cpp

@@ -2564,23 +2567,23 @@ int32_t komodo_staked(CMutableTransaction &txNew,uint32_t nBits,uint32_t *blockt
            fprintf(stderr,"[%s:%d] chain tip changed during staking loop t.%u counter.%d\n",ASSETCHAINS_SYMBOL,nHeight,(uint32_t)time(NULL),i);
            return(0);
        }
-        kp = &array[i];
-        eligible = komodo_stake(0,bnTarget,nHeight,kp->txid,kp->vout,0,(uint32_t)tipindex->nTime+ASSETCHAINS_STAKED_BLOCK_FUTURE_HALF,kp->address,PoSperc);
+        struct komodo_staking &kp = array[i];


structs have been upgraded to first-class citizens now. No need to use that keyword here (unless you're compiling for C).

dimxy · 2021-09-25T09:56:41Z

Btw, it is generally not good to memset C++ objects (which have their own constructors):
https://github.com/dimxy/komodo/blob/ba1b22a300144c97285906f2abc1486cc17530d9/src/komodo_bitcoind.cpp#L2462
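To illustrate the memset concern, here is a small sketch under assumptions: `StakingEntry` is a hypothetical record whose `scriptPubKey` member is a std::vector standing in for CScript. Overwriting such an object with memset bypasses its constructors and destructor, whereas assigning a value-initialized object resets it safely.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical record with a non-trivial member.
struct StakingEntry {
    uint32_t vout = 0;
    int64_t nValue = 0;
    std::vector<unsigned char> scriptPubKey;  // stand-in for CScript
};

// memset is only well-defined for trivially copyable types; this one is not.
static_assert(!std::is_trivially_copyable<StakingEntry>::value,
              "StakingEntry owns heap memory, so memset must not be used on it");

// Reset an entry through its own constructors instead of memset.
void reset_entry(StakingEntry &e)
{
    e = StakingEntry();  // members are re-initialized, old buffer is released
}
```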

jmjatlanta

Looks good.

src/miner.cpp

Co-authored-by: DeckerSU <support@decker.su>

jmjatlanta

Looks good to me. Thanks @dimxy

TheComputerGenie · 2022-07-12T18:30:58Z

src/miner.cpp

@@ -2188,6 +2188,8 @@ void static BitcoinMiner()
        if (minerThreads != NULL)
        {
            minerThreads->interrupt_all();
+            // std::cout << "Waiting for mining threads to stop..." << std::endl;
+            minerThreads->join_all();    // prevent thread overlapping   


Given the lack of interruption points, using both interrupt_all() and join_all() can (and "randomly" will) bring everything to a grinding halt when a thread is between interruption points and therefore cannot be joined.

Good point about the lack of interruption points.
Currently we do not join at all, which leads to unclean thread shutdown.
Adding join_all works well in marmara, and it fixed a crash that occurred when a user called 'setgenerate false' and 'setgenerate true' in quick succession.
But I missed an extra interruption_point (from the marmara code) in komodo_waituntilelegible, which runs a long loop; the extra point makes thread shutdown faster. Adding it.

> and fixed a crash when a user called 'setgenerate false' and 'setgenerate true' quickly in a sequence.

This is actually where adding the join_all() creates the issue. Most specifically, it creates an issue with NN mining and CreateNewBlock (which MCL doesn't use).

Could you explain what issue you mean? I believe a join can cause a problem when a thread can never be joined and hangs, because it runs a long loop without interruption points, sleeps, or wait calls. If we have such loops, we need to fix them so that graceful thread shutdown works.
(I am going to run my dev NN node on this branch for testing)
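The trade-off being discussed can be sketched with the standard library alone. This is not the Boost.Thread code used in miner.cpp: an atomic flag stands in for interrupt_all(), the flag check inside the loop stands in for an interruption point, and std::thread::join stands in for join_all(). `stop_and_join_demo` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Starts a worker, asks it to stop, and waits for it to finish.
// Returns true if the worker left its loop before join() returned.
bool stop_and_join_demo()
{
    std::atomic<bool> stop{false};
    std::atomic<bool> exited{false};

    std::thread worker([&] {
        // The loop polls the flag on every iteration. A loop that never
        // polls would make the join() below block indefinitely.
        while (!stop.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        exited.store(true);
    });

    stop.store(true);  // request shutdown (role of interrupt_all)
    worker.join();     // wait for the worker (role of join_all)
    return exited.load();
}
```

The join returns promptly only because the loop checks the flag often; the longer a loop runs between checks, the longer the caller waits.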

BitcoinMiner contains only a single interruption_point().
That point comes after the while loop at:

komodo/src/miner.cpp

Line 1969 in dcbf657

while ( GetTime() < B.nTime-2 )

which can hold up to 17.5 minutes, as set by:

komodo/src/miner.cpp

Line 860 in dcbf657

pblock->nTime += (r % (33 - gpucount)*(33 - gpucount));

Ironically, the likelihood of that longest pause is increased by the new "stall reduction" code increasing the possibility for gpucount to reach 0 (assuming the rand hits 1056, which is possible).
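The 17.5-minute figure can be checked with a short worked example. This sketch assumes `r` and `gpucount` are unsigned 32-bit values and that gpucount stays below 33; the function names are hypothetical. Because `%` and `*` share precedence and associate left to right, the quoted expression parses as `(r % n) * n` with `n = 33 - gpucount`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the quoted expression: (r % n) * n, where n = 33 - gpucount.
uint32_t extra_seconds(uint32_t r, uint32_t gpucount)
{
    uint32_t n = 33 - gpucount;
    return r % n * n;
}

// Largest value the expression can produce for a given gpucount:
// the remainder is at most n - 1, so the maximum is (n - 1) * n.
uint32_t max_extra_seconds(uint32_t gpucount)
{
    uint32_t n = 33 - gpucount;
    return (n - 1) * n;
}
```

With gpucount equal to 0, n is 33 and the maximum is 32 * 33 = 1056 seconds, or about 17.6 minutes, which matches the 1056 mentioned above.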

Being async, the thought would be that this part would stick on its own and there would be no care; however, in reality, there are about a dozen circumstances where it locks everything (one of those circumstances being that threads are ignorant of each other and on advanced hardware come back with multiple solves when many miner threads are used).

When you look at what "should be" vs "what is", it shouldn't be a problem because NNs "should be" only running one thread; however, this is becoming decreasingly true with more and more NNs seeking to hit smaller and more predictable gaps.

This is a good catch, this loop.
However, I believe that when a user calls setgenerate false and then setgenerate true without a join, there will be two running threads for some time, which is not good at all (on a staking chain this could even cause a crash, as there is a static komodo_staking *array variable that could be corrupted in this case). So I think threads should be stopped gracefully by joining.
The loop you mentioned has sleep() inside, and we can replace it with boost::this_thread::sleep_for (which allows the thread to be interrupted). We should also check the remaining loops in miner.cpp and add interruption points in a similar way.
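As a rough illustration of why an interruptible sleep helps, here is a standard-library sketch. It is not Boost's sleep_for; `InterruptibleSleeper` is a hypothetical helper that uses a condition variable so that a waiting thread can be woken early, which is the behaviour the Boost call provides through its interruption mechanism.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: a sleep that another thread can cut short.
class InterruptibleSleeper {
public:
    // Sleeps for up to `d`. Returns true if interrupt() was called before
    // or during the wait, false if the full duration elapsed.
    template <class Rep, class Period>
    bool sleep_for(const std::chrono::duration<Rep, Period> &d)
    {
        std::unique_lock<std::mutex> lock(mtx_);
        return cv_.wait_for(lock, d, [this] { return interrupted_; });
    }

    // Wakes any sleeper and makes later sleeps return immediately.
    void interrupt()
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            interrupted_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    bool interrupted_ = false;
};
```

A plain sleep() gives the shutdown request nothing to hook into, so the thread cannot react until the sleep ends; a wait like this one returns as soon as the request arrives.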

> and we can replace it with boost::this_thread::sleep_for (which allows the thread to be interrupted). We should also check the remaining loops in miner.cpp and add interruption points in a similar way

However you want to do it, just want to make sure that you/everyone is aware that doing it as-is will lock up NNs; so, whatever way it's done to protect stakers that could crash needs to be done in such a way as to protect both.

I am not sure about your doubts though.
Deleting a thread object without join is an obvious bug IMO and should be fixed.
Fixing it may add some delay on daemon stopping or setgenerate false but this is not a lock-up if we do this properly.
Checking all loops in miner.cpp...

btw this code does not work at all:

komodo/src/miner.cpp

Line 251 in 91ea37b

boost::this_thread::disable_interruption();

as 'disable_interruption' is a type, and to activate it we need to create a local variable of that type.
Maybe we should fix this too, so that it works as intended.
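The pitfall is general to RAII guard types and can be shown without Boost. In this sketch `DisableGuard` is a hypothetical guard with the same shape as disable_interruption: its effect lasts only as long as the object lives, so an unnamed temporary undoes itself at the end of the statement.

```cpp
#include <cassert>

// Counts how many guards are currently alive.
static int g_disabled_depth = 0;

// Hypothetical RAII guard: active from construction to destruction.
struct DisableGuard {
    DisableGuard() { ++g_disabled_depth; }
    ~DisableGuard() { --g_disabled_depth; }
    DisableGuard(const DisableGuard &) = delete;
    DisableGuard &operator=(const DisableGuard &) = delete;
};

// Unnamed temporary: constructed and destroyed within one statement,
// so the guard is already gone when the next line runs.
int depth_after_temporary()
{
    DisableGuard();
    return g_disabled_depth;
}

// Named local: the guard stays active until the end of the scope.
int depth_with_named_guard()
{
    DisableGuard guard;
    return g_disabled_depth;
}
```

By analogy, the fix for the quoted line would be to declare a named local of type boost::this_thread::disable_interruption in the scope that needs protecting.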

Haven't fully tested it yet, but (along with the others)

komodo/src/miner.cpp

Line 1989 in 12261b9

boost::this_thread::sleep_for(boost::chrono::seconds(1)); // allow to interrupt

does look like it'll solve the issue of my concern. ty

DeckerSU · 2022-07-14T14:11:04Z

Actually MilliSleep is already defined as:

void MilliSleep(int64_t n)
{
    boost::this_thread::sleep_for(boost::chrono::milliseconds(n));
}

So, I guess there is no need to change MilliSleep calls to boost::this_thread::sleep_for; we just need to change all sleep(...) calls in the code (maybe not only in the miner) to MilliSleep.

dimxy · 2022-07-14T15:53:56Z

sleep() is now replaced with sleep_for()

tonymorony · 2022-09-05T09:38:44Z

These changes are implemented in the combined PR: #559

dimxy added 3 commits September 23, 2021 23:00

wait for mining threads to stop

acd2b3b

convert staking array to vector (to correctly destroy members) and make it thread_local

24f7d50

staking logging improved

87ac53d

jmjatlanta previously approved these changes Sep 24, 2021

maxkp replaced with array.capacity

f582ad9

dimxy dismissed jmjatlanta’s stale review via f582ad9 September 25, 2021 08:06

'struct' eliminated

ba1b22a

unneeded memset removed

5098d67

jmjatlanta previously approved these changes Feb 22, 2022

DeckerSU reviewed Feb 22, 2022

dimxy dismissed jmjatlanta’s stale review via dcbf657 July 5, 2022 16:25

Update src/miner.cpp

dcbf657

Co-authored-by: DeckerSU <support@decker.su>

jmjatlanta previously approved these changes Jul 12, 2022

TheComputerGenie reviewed Jul 12, 2022

dimxy added 2 commits July 13, 2022 01:28

add interruption_point in komodo_waituntilelegible

f8c17ce

change sleeps to interruptible in miner

12261b9

dimxy dismissed jmjatlanta’s stale review via 12261b9 July 14, 2022 12:31

Merge branch 'dev' into dimxy-fix-staking-array

106c9d3

dimxy mentioned this pull request Sep 5, 2022

Combined PR for jmj refactoring #559

Merged

tonymorony closed this Sep 5, 2022

Alrighttt pushed a commit to Alrighttt/komodo that referenced this pull request May 30, 2023

Merge pull request KomodoPlatform#505 from VerusCoin/dev

947d095
