"Price drives difficulty". We've said it or seen it said. That the btc to fiat exchange rate seems to be a good indicator of subsequent difficulty changes is an idea many have mentioned but few have analysed. I've seen a post that presents an easy to follow analysis, and with the interest in assessing difficulty changes after the initial batch of ASICs have been added to the network hashrate I thought now would be a good time to present a simple investigation of the assumed correlation between the US$ / BTC (USDBTC) rate and mining Difficulty (D).
Note: Although the following post is interesting and quite possibly useful, keep in mind that block reward halving and the addition of significant hashrate to the network (and the resultant difficulty) mean that the correlations I discuss below will most likely not be valid for long.
1. Correlations
A correlation is a measure of the similarity of "shape" between two dependent variables sharing an independent variable. It is not influenced by y-offset or scale and so is a good initial assessment of a relationship between two dependent variables, after which a linear model can then be applied.
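To make the "not influenced by y-offset or scale" point concrete, here is a minimal sketch using numpy. The numbers are made up for illustration and are not the real price or difficulty data, and the function name pearson is my own.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Made-up illustrative series, NOT the real USDBTC or difficulty data.
a = [1.0, 2.0, 4.0, 3.0, 5.0]
b = [2.0, 3.0, 7.0, 5.0, 9.0]

r = pearson(a, b)

# Multiply one series by 10 and add 100: the "shape" is unchanged,
# so the correlation coefficient should be unchanged too.
r_shifted = pearson(a, [10.0 * v + 100.0 for v in b])

print(r, r_shifted)
```

The two printed values should agree to within floating point error, which is why the y axis can be ignored when only the correlation is of interest.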
If you want to follow along at home I've created a dataset containing the dates (in unixtime) each difficulty period started, along with the USDBTC close, median and volume weighted average price for that difficulty period.
USDBTC and D do seem correlated. The chart below shows a visual comparison indicating some similarity in shape between the USDBTC close price for a difficulty period and D for the same difficulty period. The y axis is left blank on purpose since it is the correlation we are interested in at the moment, not the actual variable values.
The correlation coefficient between the two is only 0.549, not a very good correlation (1 is a perfect correlation). How can we obtain a better correlation?
- "Price drives difficulty" implies a lag between price and Difficulty changes.
- It is reasonable to assume that the relationship is not linear - that is, an increase in "USDBTC close" does not imply that D increases by the same amount multiplied by a scale. Instead it is more likely that a certain percentage increase in "USDBTC close" results in a similar percentage increase in D. If this is the case, log(USDBTC close) is more likely to correlate well with log(D) than "USDBTC close" with D.
- The closing value of USDBTC may not be a good indicator of the USDBTC values for the difficulty period. Instead the median USDBTC or the volume weighted average price, vwap [ sum(price x volume)/sum(volume) ] may be better indicators.
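The three candidate price indicators for a difficulty period can be sketched as below. This is only an illustration of the vwap formula given above; the trade prices and volumes are invented, and the function name vwap is my own.

```python
import statistics


def vwap(prices, volumes):
    """Volume weighted average price: sum(price x volume) / sum(volume)."""
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total volume must be non-zero")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


# Invented trades within one difficulty period, NOT real market data.
period_prices = [10.0, 11.0, 12.0]
period_volumes = [1.0, 1.0, 2.0]

period_close = period_prices[-1]                     # last price in the period
period_median = statistics.median(period_prices)     # middle price
period_vwap = vwap(period_prices, period_volumes)    # weighted by volume

print(period_close, period_median, period_vwap)
```

In this toy case the heavier volume at the higher price pulls the vwap above the median, which is the sort of difference that might make one indicator track D better than another.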
After investigating these possibilities, I found the best correlated variables to be log(USDBTC vwap) and log(D) with a lag of two difficulty periods. This results in a correlation coefficient of 0.914 - a much better correlation.
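If you want to repeat the search over lags, something like the following would do it. It is a sketch under my own assumptions: a "lag of two" is taken to mean that the price for period i is compared with D for period i + 2, the logs are natural logs, and the series below is synthetic (built so that the right answer is a lag of 2), not the dataset mentioned earlier.

```python
import numpy as np


def lagged_log_corr(price, difficulty, lag):
    """Correlation of log(price[i]) with log(difficulty[i + lag])."""
    price = np.asarray(price, dtype=float)
    difficulty = np.asarray(difficulty, dtype=float)
    if len(price) != len(difficulty):
        raise ValueError("series must be the same length")
    if not 0 <= lag <= len(price) - 2:
        raise ValueError("lag leaves fewer than two overlapping points")
    x = np.log(price[: len(price) - lag])   # drop the last `lag` prices
    y = np.log(difficulty[lag:])            # drop the first `lag` difficulties
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic series, NOT real data: difficulty follows price two periods later.
price = [5.0, 7.0, 6.0, 9.0, 12.0, 10.0, 14.0, 13.0, 16.0, 15.0]
difficulty = [1000.0, 1000.0] + [1000.0 * p ** 1.2 for p in price[:-2]]

corr_by_lag = {lag: lagged_log_corr(price, difficulty, lag) for lag in range(4)}
best_lag = max(corr_by_lag, key=corr_by_lag.get)

print(corr_by_lag, best_lag)
```

The same function can be run with the close, median or vwap series as the price input to compare the three indicators.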
Now that we have defined our best correlated variables, we can model a linear relationship between them:
log(D, lag = 2) = log(vwap)*a + b
where
a = slope
b = intercept
The best model resulted in a slope of 1.16039 and an intercept of 11.96719 (both with p ~ 0):
Model 1:
log(D, lag = 2) = log(vwap)* 1.16039 + 11.96719
D = exp(log(vwap, lead = 2)* 1.16039 + 11.96719)
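For anyone following along, the fit and the resulting predictor might look like this in Python. It is a sketch, not the code I used: I am assuming natural logs (consistent with the exp above), an ordinary least-squares fit via numpy.polyfit, and function names of my own choosing. The sample values are generated from Model 1 itself just to show that the fit recovers the coefficients.

```python
import math

import numpy as np


def fit_log_linear(vwap_two_periods_earlier, difficulty):
    """Least-squares fit of log(D) = a * log(vwap) + b; returns (a, b)."""
    x = np.log(np.asarray(vwap_two_periods_earlier, dtype=float))
    y = np.log(np.asarray(difficulty, dtype=float))
    a, b = np.polyfit(x, y, 1)  # degree 1: slope first, then intercept
    return float(a), float(b)


def model1_difficulty(vwap, a=1.16039, b=11.96719):
    """Model 1: D predicted from the vwap of two difficulty periods earlier."""
    return math.exp(a * math.log(vwap) + b)


# Illustrative vwap values, NOT real data; D is generated from Model 1.
sample_vwap = [2.0, 5.0, 8.0, 12.0, 6.0]
sample_d = [model1_difficulty(v) for v in sample_vwap]

a_hat, b_hat = fit_log_linear(sample_vwap, sample_d)
print(a_hat, b_hat)
```

With real data the inputs to fit_log_linear would already be aligned, so that each vwap value is paired with the difficulty two periods later.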
Model 1 is not a very good predictor of difficulty. However we can make some real world assumptions (which may or may not be valid) that might help:
- The ability of miners to purchase or otherwise increase hashpower may have been exhausted for the two peaks in USDBTC vwap.
- Miners take time to make decisions about purchasing more hashpower, so a volatile market may have less of an effect on miner decisions. Miners may also be more loath to switch devices off if they have been purchased for the purpose of mining. The first large trough and following peak may have had little effect on difficulty due to this "capacitance" effect.
The upshot of this is that big peaks and troughs in the USDBTC vwap will probably not move D in the same way as a steady change, so the relationship between log(USDBTC vwap) and log(D) is probably non-linear. We can model this by adding a polynomial term. I used log(USDBTC vwap)^3 because an odd exponent preserves the sign of log(USDBTC vwap), and I thought log(USDBTC vwap)^5 would have too great an effect.
With a little tinkering I came up with the following model, the correlation coefficient of which is 0.96:
Model 2:
log(D, lag = 2) = 1.61698 * log(vwap) - 0.0774697*log(vwap)^3 + 12.0
D = exp(1.61698 * log(vwap) - 0.0774697*log(vwap)^3 + 12.0)
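Model 2 written out as a function, again as a sketch assuming natural logs and with function names of my own. The second function is simply the derivative of the exponent with respect to log(vwap), which shows how the cubic term damps the response of D as the vwap rises.

```python
import math


def model2_difficulty(vwap):
    """Model 2: D predicted from the vwap of two difficulty periods earlier."""
    lv = math.log(vwap)
    return math.exp(1.61698 * lv - 0.0774697 * lv ** 3 + 12.0)


def model2_log_slope(vwap):
    """d log(D) / d log(vwap) for Model 2, i.e. the local log-log slope."""
    lv = math.log(vwap)
    return 1.61698 - 3 * 0.0774697 * lv ** 2


# Illustrative vwap values in US$, chosen only to show the trend.
for v in (1.0, 2.0, 5.0, 10.0):
    print(v, model2_difficulty(v), model2_log_slope(v))
```

The local slope shrinks as the vwap grows and, by my arithmetic, reaches zero at a vwap of roughly US$14, so the model should not be trusted much beyond the range of prices it was fitted to.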
2. Conclusions
- "Price" does indeed "drive difficulty", and an analysis of correlation coefficients tells us that the "price" is the log of the volume weighted average USDBTC price from two difficulty periods previously, and difficulty is actually the log of D.
- More simply, log(D, lag=2) = log(USDBTC vwap)*a + b
- A simple linear model shows a clear relationship between the two, but has no significant predictive value.
- A more complex model is slightly better at predicting D, but is still not useful as a lone accurate predictor of D, and also suffers from the possibility of "overfitting".
- It is likely that other contributors to D make a simple D predictor based on price impossible. However a simple predictor as developed above may be useful in predicting a general direction in the change of difficulty and the strength of the change (barring a very volatile market).
- None of this will be valid in approximately three weeks, when after block number 210000 the block reward halves to 25 BTC, and it will be even less valid with the onset of ASIC hashpower. However the idea behind the analysis stands and may be used once the dust settles.
Donations help give me the time to analyse bitcoin mining related issues and write these posts. If you enjoy or find them helpful, please consider a small bitcoin donation:
12QxPHEuxDrs7mCyGSx1iVSozTwtquDB3r