-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
atan in difficulty algorithm is potentially non-deterministic and has undesirable difficulty effects #133
Comments
" has some major issues and should be removed" maybe explain what the implications for the coin are. Fluctuating hashrate is a "problem" not visible at the market. |
I've created a histogram of block solve times since block 100000 here, also showing the ideal times (i.e. what we'd expect under a perfectly stable actual network hashrate and ideal difficulty): You can see the full interactive histogram here: https://jagerman.com/graft-solvetime-hist-100000_167788.html (the source code to generate it is at: https://jagerman.com/graft-dist.py) Of particular note is the blue area (i.e. actual solve times) being above the ideal up until 60s and beyond about 400s: The take-away message: we have too many short blocks (under a minute) and long blocks (over 7 minutes), and far too many very long blocks (over 15 minutes). In particular: Blocks <= 0.2min: 8263 (12.2%); expected: 6451.0 (9.52%) i.e. about twice as many blocks as we should see over 8 minutes and more than 5 times as many as we should see over 15 minutes. I should point out that not all of this is due to the v9 difficulty adjustment: we'll still have opportunistic miners (e.g. NiceHash or MoneroOcean) even without it, but the adjustment is making the problem worse. |
Are you sure the atan is not the sole cause of the deviations? (in reference to a comment you made in the other issue, indicating removing the atan would not completely fix it) I'm going to send you the same number of blocks from Masari, a smaller coin that should be less stable, and let's see how it's distribution fairs. |
Is there an equation for your smooth line? This might enable a good metric for measuring how bad hash attacks are, or general algorithm performance. Maybe do a root-mean-square of the difference in the bins. Below shows Masari's results with LWMA on the last 50,000 blocks (since their last fork) which shows much better results. Before this fork, their results were similar to the Graft's. The difference is that they implemented a new POW to stop bad attacks. |
I doubt it is: there's still the possibility of doing the coin hopping without it. But it requires being able to bring a considerable amount of hash power (i.e. NiceHash) online and offline quickly. Without the atan the size of drops should be reduced, reducing the incentive to carry it out, which in turn will feed back into reducing the size of the fluctuations. Masari has been on CN-fast since their last fork which currently isn't nicehashable, but I have been hearing rumours that CN-fast is coming to NiceHash soon, so that may destabilize Masari.
It's the probability density function of an exponential distribution, i.e. λ exp(−λx), scaled by 5*N to match the histogram (where 5 is the bin width of the histogram and N is the number of blocks). λ equals 1/T, where T is the target time (i.e. 120 for graft). |
I've put together some DA simulation results comparing a standard LWMA to an LWMA with the https://jagerman.com/graft-da/diff-plot-lwma-N60_A-dip20_seed-5-log.html There are 9 more simulations (using different random seeds) here: https://jagerman.com/graft-da The difference is particularly apparent if you load two simulations with the same seed in two tabs and switch back and forth between them:
|
The v9 fork at block 68000 added an atan adjustment
atan
to try to bring the blockchain difficulty down faster, to try to solve what ended up being a completely separate issue stemming from Monero code (#116 / #118 - though the precise issue and fix were not known at the time of the v9 fork).The added
atan
adjustment, however, has some major issues and should be removed.The first issue is that IEEE floating point operations only requires precisely defined results for
+
,-
,*
,/
, andsqrt
. (Although C++ doesn't even technically require IEEE math, it's probably safe enough to assume to be present anywhere a graft node will run).atan
, however, isn't required the have the same value under all libraries/compilers/compiler versions/compilation flags. This is a potential bomb waiting to go off. It's very rare, but not impossible: and potentially devastating if it ever occurs.The main problem is that it is possible for different c++ runtimes to produce slightly different
values for the the result of an
atan
. Suppose, for example, that MSVC's C++ runtime and Linux'sstdlibc++ libraries produced slightly different values in the atan call for some particular input
value, and that those values just exactly put the RHS of the following calculation on either side of
an integer boundary on the two systems (so that the integer
target
variable here ends up beingsome value
x
on one system andx+1
on the other):If this ever happens the result is a permanent chain split that is inherently unresolvable: neither side of such a split would ever accept the chain accepted by the other side because the other side is offering blocks with an invalid difficulty. Even if, say, all Windows users deleted the block and resynced they would never accept the chain being offered by linux systems if such a deviation ever manifests, and vice versa. (The only potential ways I could see to clean it up would be to issue a new release that hard codes the difficulty for the specific block height where the deviation occurred).
This code is dangerous and needs to be eliminated or, at the very least, replaced with a value carefully constructed with primitive IEEE safe operations.
Before worrying about that, however, there is a deeper issue here (first brought up by @zawy12 in #119): the
atan
is serving little practical purpose while inducing considerable noise into the network difficulty. Let me explain by starting with the implementation:For notation, let me call
B
the most recent block,B-1
the block before that, etc. So this codecalculates
derivative
as the difficulty of blockB
minus the difficulty of blockB-2
(yes,really: this looks like a bug—the blog post description of the algorithm refers to the change in
difficulty from one block to the next), divided by the mean block time over the last
DIFFICULTY_BLOCKS_COUNT_V8
blocks.To put that value into context, let's suppose that the network is perfectly stable with just some ordinary random perturbations that induce a tiny 0.01% difficulty decrease between this block and
B+2
(which should really beB+1
; see above). That tiny shift means difficulty decreased by 290618.So we end up with
derivative = (2906174509 - 2905883891) / h
, withh = 5993.0/60 = 99.883
. Thusderivative = 2909.5745
.This is a fairly pointless number to stick into an
atan
: it gives a value equal to 99.98% of the maximum value thatatan
can ever return (i.e. π/2). And this was just for a miniscule downward fluctuation in the hashrate. In fact, with 120 second average blocks, just to get this value down below 90% of its π/2 asymptotic value would require diff to drop by less than 1000 (which is for all intents and purposes is nothing since network diff is around 3 billion). So really this whole line:is, for all but the most miniscule drops in difficulty, essentially the same as the far simpler expression:
adjust *= 1.05;
except that the latter doesn't allow for a potential chain split.
But a bigger issue: as zawy pointed in in #119,
adjust *= 1.05;
just seems wrong: the difficulty is going to be adjusted downward by nearly 5% whenever it drops between two blocks by .00001%. That adds a substantial amount of noise to the network hashrate. Tiny downward fluctuations (which even in a perfectly flat actual hashrate will happen 50% of the time) drive difficulty down by 5%, which introduces substantial noise into the difficulty (see @zawy12's charts here: http://wordsgalore.com/diff/index2.html — graft's difficulty is consistently more noisy than other coins with a comparable network hashrate).But there's still another issue here: not only are we getting extra noise, that noise is asymmetric (in stats terminology, it does not have a mean of 0): the adjustment is only ever applied to diff drops by never to diff increases. What this means is that it actually makes graft mining emissions consistently around 2% too fast. Here are the elapsed times for 720 block emissions over the last 21600 (= 30 * 720) blocks:
and so on (full list at https://jagerman.com/graft-emission-rate.txt; source code to generate it at https://jagerman.com/graft-emission-rate.py). From block 68000 (i.e. the v9 fork) to 113448 (the current top block as of this writing) the mean block time is 118.02 seconds -- i.e. 1.7% faster than the target 120. This isn't caused by a few outliers, but is (as you can see above) consistent day after day.
(Edit: this has gotten a bit worse since I originally wrote it; the average is now more like 2.5% too fast; the emission link above has been updated with results up to block 167788).
This is happening due to the asymmetric nature of the
atan
adjustment: it is only applied on diff falling, not diff rising. From a theoretical point of view, the bias here is not unexpected: imagine actual network hashrate being a constant value with fluctuations in the difficulty calculation due to the randomness of finding blocks. Every time the fluctuation happens to be negative (because too few blocks were found recently due to bad luck), diff drops by roughly 5%. The diff algorithm then typically will find blocks being generated too fast and so, over a few blocks, adjusts difficulty back up to the true network hashrate. But now if you calculate the mean "network hashrate" (i.e. computed as difficulty * 120) you would find that it is too low: there is a downward effect applied to some blocks without any corresponding symmetric upward effect applied to others. That is exactly what we see happening, and so we get a little under 2% too many blocks found per day.Now if you take that blockchain behaviour and add NiceHash, the fluctuations become a little worse: NiceHash bots jump in when diff is low, mine at artifically high profitability for a few blocks, then the diff is much too low and has to jump up to compensate. But when it comes back down again, the 5% adjustment is going to bring it down too fast, thus inducing the same cycle again. While NiceHash bots would do this sort of thing anyway, this 5% makes that drop faster than it needs to be and opens up larger opportunistic NiceHash windows than would occur without it.
TL;DR: the
atan
term added in the v9 diff adjustment makes difficulty fluctuate too much, makes the mining emission rate about 2% too high, and induces more opportunistic NiceHashing.Since the v9 change, at least as currently implemented, isn't buying the network anything except for encouraging more NiceHashing and less stable network difficulty, coupled with the fact that there is a theoretical (if very low probability) catastrophic failure, I'd like to suggest that graft ditches the
atan
(and the 1.05 idea) and instead, at the next fork, goes back to the v8 difficulty algorithm with just the FTL fix applied.Or, if you really, really want to keep it for some reason, at least replace it with a deterministic
atan
approximation, correct the domain of values passed into theatan
, and make it symmetric.The text was updated successfully, but these errors were encountered: