
Friday 16 November 2012

10.2 Forecasting the network hashrate


0. Introduction
In the last post I showed in a simple way that changes in the MTGOX US$-BTC exchange rate correlated well with later changes in the mining Difficulty. Since there have been some misunderstandings on the subject I want to explain the correlation in a bit more detail before starting on forecasting the network hashrate.

Clearly, the exchange rate is not influencing mining difficulty directly, but through an intermediate process: 

US$-BTC  ->  cost per hash and earnings per hash  ->  network hashrate  ->  mining Difficulty

Of course, this assumes that nothing other than the MTGOX US$-BTC exchange rate changes the cost per hash, the earnings per hash, or the network hashrate. Other factors that could do so include:

  • Technology changes that make the cost per hash lower or the earnings per hash higher, or both -  for example the change from CPU mining to GPU mining, to a lesser extent FPGAs, and to a much greater extent, the upcoming ASIC mining devices. This would likely increase the network hashrate without a corresponding increase in the MTGOX US$-BTC exchange rate. 
  • An influx of new miners due to publicity. If new bitcoin enthusiasts start to mine rather than buy, the network hashrate will increase without a corresponding increase in the MTGOX US$-BTC exchange rate.
  • Non-profit driven miners. A miner who keeps mining at a difficulty beyond which no return can be made could be mining irrationally - either assuming that the local currency - BTC exchange rate will rise significantly, or not wanting to turn off expensive equipment that has yet to pay for itself - or for a less irrational reason such as wanting to contribute to network security.
  • Changes to local electricity costs - the price per kWh is unlikely to decrease, so the cost per hash is likely to increase, probably in the near term for many miners. A large enough increase would cap the network hashrate at the level beyond which mining is only profitable for those who do not pay for electricity, without a corresponding decrease in the MTGOX US$-BTC exchange rate.

The first has already occurred, but does not seem to have had an obvious effect on network hashrate that could not be predicted by the exchange rate. This may be because the cost of purchasing a GPU was sufficiently prohibitive that an increase in MTGOX US$-BTC was required to make it possible. ASICs, however, are much cheaper to purchase (in $ per unit of hashrate) and much cheaper to run (more hashes per joule) and should cause a significant discontinuity in the correlation.

The second possibility has certainly occurred at times. This however does not seem to have caused an increase in difficulty that could not be predicted by the exchange rate. 

I am uncertain as to the likelihood of the third possibility having occurred. It would be a good explanation for the distinct lack of response in mining Difficulty to the MTGOX US$-BTC crash after June 2011. If correct, mining difficulty should reduce much more slowly than MTGOX US$-BTC.

The last is yet to occur, but will at some point in the not too distant future. After this point, the only increases in network hashrate not due to publicity or non-profit driven miners will be either incremental, as technology gradually improves efficiency and cost, or driven by the exchange rate.

My original intention for this post was to find out if any of the first three had affected mining difficulty in any significant way, or at least to find specific points at which there was a significant departure from the MTGOX US$-BTC correlation and determine a possible reason.

Mining difficulty already lags the network hashrate since it is calculated using the average network hashrate for the previous difficulty period. It occurred to me that a more accurate model might be obtained if I attempted to find a correlation between the MTGOX US$-BTC exchange rate and the network hashrate, rather than with mining Difficulty. It also occurred to me that I should do this in a much more rigorous way than previously.
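To make the lag concrete, here is a minimal Python sketch of how difficulty relates to hashrate. It relies on the standard rules that a block takes about 2^32 hashes per unit of difficulty, that blocks are targeted at one every ten minutes, and that difficulty is rescaled every 2016 blocks. The function names and the example figures are made up for illustration, and the sketch ignores the finer details of the real retargeting code.

```python
# Rough relation between difficulty and network hashrate (a sketch, not consensus code).
HASHES_PER_DIFFICULTY = 2 ** 32   # approximate expected hashes per block at difficulty 1
TARGET_BLOCK_SECONDS = 600        # one block per ten minutes
RETARGET_BLOCKS = 2016            # blocks per difficulty period


def hashrate_from_difficulty(difficulty):
    """Hashrate (hashes/s) that would find one block per 600 s at this difficulty."""
    return difficulty * HASHES_PER_DIFFICULTY / TARGET_BLOCK_SECONDS


def next_difficulty(old_difficulty, actual_period_seconds):
    """Rescale difficulty by how quickly the previous period's blocks arrived."""
    expected_seconds = RETARGET_BLOCKS * TARGET_BLOCK_SECONDS
    ratio = expected_seconds / actual_period_seconds
    ratio = max(0.25, min(4.0, ratio))  # each adjustment is limited to a factor of 4
    return old_difficulty * ratio


# Hypothetical example: blocks arrived 10% faster than the ten-minute target.
period_seconds = RETARGET_BLOCKS * TARGET_BLOCK_SECONDS / 1.1
new_difficulty = next_difficulty(3_000_000.0, period_seconds)
```

Because the new difficulty only reflects the average hashrate over the period that has just ended, any model of Difficulty inherits that delay on top of whatever delay exists between price and hashrate.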

So I spent some time learning about ARIMA models, the autocorrelation function (ACF), the partial autocorrelation function (PACF), the cross correlation function and prewhitening. The aim was to learn to perform intervention analysis, determine whether the network hashrate had departed in a significant way from that determined by the exchange rate, find the date at which this started to occur, and try to correlate that with known events.

I did not get quite that far, but I did discover some rather interesting things about the nature of the historical network hashrate, and how to make a short term forecast of its weekly average.

1. Cross correlations between the exchange rate and the network hashrate.
Although I'm going to be quite light on detail in this section, if it doesn't interest you just skip to the next section where I explain how to predict the future. Just pretend it's magic, if you like. For the rest of you who may be more analytically minded, have a read through this online statistics course if you're not familiar with time series analysis. 

Note: In the following, "price" refers to the median MTGOX US$-BTC exchange rate rather than the volume weighted average price. I thought that the median price might be easier for readers to calculate for themselves. "Hashrate" refers to the network hashrate, averaged over the same seven days as the median price.

I assumed that using a daily estimate of hashrate and price would introduce too much noise, which would complicate the analysis. Instead I opted for a weekly average, which I hoped would be sufficiently long to avoid noise and not so long as to miss interesting trends. I then transformed the data logarithmically and looked for autocorrelations. Both sets showed significant autocorrelations at various lags. This was simply background learning, though, and did not have much influence on how I proceeded except to confirm that it would be necessary to use the first differences of the data rather than the data itself, since the mean of neither dataset was constant over time. Note also that the variance was somewhat variable, so an ARCH model may be more accurate.
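For readers who want to try this preparation step themselves, a minimal Python sketch follows. It assumes you already have daily price and hashrate values in two lists. The data below are made up and the function names are only illustrative. The original analysis used its own tools, so treat this as one possible way to build the weekly series and their log differences.

```python
import math
from statistics import mean, median


def weekly_values(daily, reducer, days_per_week=7):
    """Collapse daily values into weekly ones, dropping any trailing partial week."""
    n_weeks = len(daily) // days_per_week
    return [
        reducer(daily[i * days_per_week:(i + 1) * days_per_week])
        for i in range(n_weeks)
    ]


def log_first_differences(series):
    """Equivalent of diff(log(x)): week-on-week change in the log of the series."""
    logs = [math.log(v) for v in series]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Made-up daily data covering three weeks (hypothetical values, not real history).
daily_price = [10.0 + 0.1 * day for day in range(21)]
daily_hashrate = [20.0 + 0.2 * day for day in range(21)]

weekly_price = weekly_values(daily_price, median)      # weekly median price
weekly_hashrate = weekly_values(daily_hashrate, mean)  # weekly average hashrate

d_log_price = log_first_differences(weekly_price)
d_log_hashrate = log_first_differences(weekly_hashrate)
```

Differencing the logs turns each series into approximate week-on-week percentage changes, which removes the drifting mean that makes the raw series unsuitable for correlation analysis.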

Once I had a reasonable understanding of and a feel for ACFs and PACFs I proceeded to the cross correlation function (CCF). Although I did perform prewhitening on the data it turned out to be unnecessary - the CCF of diff(log(USDBTC),  n=1) and diff(log(hashrate),  n=1) was sufficient to determine likely lags:
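A sample cross correlation of this kind can be sketched in a few lines of Python. This is an illustration of the general technique, not the code used for the post. The function names, the toy data and the choice of the usual approximate 1.96/sqrt(n) significance bound are all assumptions made here.

```python
import math


def cross_correlation(x, y, max_lag):
    """Sample correlation between x[t] and y[t + k] for k = 0..max_lag.

    A large value at lag k suggests that changes in x are followed by
    changes in y about k periods later.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have the same length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    norm_x = math.sqrt(sum((v - mean_x) ** 2 for v in x))
    norm_y = math.sqrt(sum((v - mean_y) ** 2 for v in y))
    result = {}
    for k in range(max_lag + 1):
        cov = sum((x[t] - mean_x) * (y[t + k] - mean_y) for t in range(n - k))
        result[k] = cov / (norm_x * norm_y)
    return result


def significance_bound(n):
    """Approximate 95% bound for the CCF of two unrelated white-noise series."""
    return 1.96 / math.sqrt(n)


# Toy data (hypothetical): y is simply x delayed by two periods.
x = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1, -1.5, 0.9, 1.1, -0.4, 0.6, -2.1]
y = [0.0, 0.0] + x[:-2]

ccf = cross_correlation(x, y, max_lag=4)
best_lag = max(ccf, key=ccf.get)  # the lag with the largest correlation
```

With the differenced log price as x and the differenced log hashrate as y, lags whose correlation exceeds the bound are candidates for lagged price terms in a linear model.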


2. Model 0

The most significant lags are at lag = 0, 1, 2, 3, 4. Using this information I found the best linear model with the fewest terms:
Model 0: log(H) ~ 1.74 + 0.94lag1(log(H)) + 0.21lag1(log(p)) - 0.14lag4(log(p))

where:
H = weekly average network hashrate
p = weekly median MTGOX US$-BTC exchange rate
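As a sketch of how the formula is evaluated, the Python below plugs lagged values into Model 0. The coefficients are the ones given above. The post does not state the base of the logarithm or the units of H, so the natural log used here is an assumption, and the numerical output is only meaningful if your inputs use the same units and log base as the original fit. The function names are made up for illustration.

```python
import math

# Coefficients of Model 0 as given in the post.
INTERCEPT = 1.74
B_HASH_LAG1 = 0.94
B_PRICE_LAG1 = 0.21
B_PRICE_LAG4 = -0.14


def model0_log_hashrate(log_h_lag1, log_p_lag1, log_p_lag4):
    """Model 0 on the log scale: inputs and output are logs of the raw values."""
    return (
        INTERCEPT
        + B_HASH_LAG1 * log_h_lag1
        + B_PRICE_LAG1 * log_p_lag1
        + B_PRICE_LAG4 * log_p_lag4
    )


def model0_hashrate(h_lag1, p_lag1, p_lag4):
    """Model 0 on the original scale, ASSUMING natural logarithms were used."""
    log_h = model0_log_hashrate(
        math.log(h_lag1), math.log(p_lag1), math.log(p_lag4)
    )
    return math.exp(log_h)
```

Here h_lag1 is last week's average hashrate, p_lag1 is last week's median price and p_lag4 is the median price from four weeks ago.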

This model has an R squared of 0.9991 and a p-value of approximately 0. Charts comparing the actual weekly average hashrate with the linear model, the residuals expressed as a percentage error of the actual network hashrate, and a histogram of those errors are below.


The model matches the actual network hashrate quite well, and the residuals are mostly within 10% of the network hashrate. The residuals are approximately normally distributed, and we can use this to provide a 95% confidence interval of +/- 15.1%.
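One common way to turn approximately normal percentage errors into a 95% interval is to take 1.96 standard deviations either side of zero. The Python sketch below shows that calculation. Whether the +/- 15.1% figure was derived exactly this way is an assumption, and the function names are illustrative. If it was, the figure corresponds to a standard deviation of roughly 7.7% in the percentage errors.

```python
from statistics import stdev


def percentage_errors(actual, predicted):
    """Model error as a percentage of the actual value, one entry per week."""
    return [100.0 * (p - a) / a for a, p in zip(actual, predicted)]


def normal_95_half_width(errors):
    """Half-width of a 95% interval, assuming the errors are roughly normal
    with a mean close to zero."""
    return 1.96 * stdev(errors)
```

Calling normal_95_half_width(percentage_errors(actual, predicted)) on the weekly series gives the "+/- x%" figure for a fitted model.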

3. Models 1 and 2.
At this point, you may be questioning the necessity of so many terms in linear model 0. I also created linear models using the same lags with fewer terms, of which the best two were:

Model 1: log(H) ~ 2.85  + 0.90lag1(log(H)) + 0.13lag1(log(p))

Model 2: log(H) ~ 0.75  + 0.98lag1(log(H))
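Model 2 has a single predictor, so it can be fitted with the closed-form formulas for simple least squares. The Python sketch below shows the idea on a synthetic series. It is not the fitting procedure used for the post, and the function name and data are made up for illustration.

```python
def fit_lag1_model(log_h):
    """Least-squares fit of log_h[t] ~ intercept + slope * log_h[t - 1]."""
    x = log_h[:-1]  # last week's value
    y = log_h[1:]   # this week's value
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic series that follows Model 2 exactly (hypothetical starting value).
series = [30.0]
for _ in range(10):
    series.append(0.75 + 0.98 * series[-1])

intercept, slope = fit_lag1_model(series)  # recovers roughly 0.75 and 0.98
```

Models 0 and 1 add lagged price terms, which needs multiple regression rather than these one-variable formulas.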





The charts below are a comparison of errors of the three models.


Models 1 and 2 show errors spread more widely from 0% than Model 0, indicating a poorer overall fit. The extra terms in Model 0 do seem necessary to provide useful accuracy.

It is interesting to note that when log(H) is only a function of lag1(log(H)), the largest error occurs just as the price increased significantly in mid 2011. This price change had such a significant effect on the network hashrate that it could not have been predicted with any degree of accuracy from the previous week's hashrate alone, but adding lagged price terms reduced the error considerably. This is another indicator that price does have a significant impact on the network hashrate and so on mining difficulty.


4. Conclusions and discussion
Model 0 is quite a nice model for the network hashrate. What does it tell us? 

Model 0: log(H) ~ 1.74 + 0.94lag1(log(H)) + 0.21lag1(log(p)) - 0.14lag4(log(p)), CI = +/- 15.1%

This means that this week's average hashrate can be approximated by a linear combination of last week's average hashrate, last week's median MTGOX US$-BTC price, and the weekly median MTGOX US$-BTC price from four weeks ago, within a 95% confidence interval of +/- 15.1% of the actual hashrate for this week. Although the dominant term is the hashrate term, accuracy degrades significantly if the price terms are not included.

The last post showed a simple correlation between the volume weighted average MTGOX US$-BTC price and the mining Difficulty with a lag of 3. Model 0 tells us a bit more about the nature of the network hashrate than this: it implies that there is some degree of sluggishness about the network hashrate. One week's average tends to be very similar to the previous week's average. We can only guess at the reasons for this, and my guess is that miners are conservative. They tend not to switch off as soon as the MTGOX US$-BTC price falls, and it takes time to decide to purchase more hashrate and bring it online when the MTGOX US$-BTC price increases.
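The sluggishness can be put into rough numbers by reading the coefficients of Model 0 at face value. The short Python calculation below is an interpretation added for illustration, not a result from the original analysis. It asks how long a one-off disturbance to the log hashrate takes to halve if price stays fixed, and how far the hashrate would eventually move after a permanent price change.

```python
import math

# Coefficients of Model 0 as given in the post.
B_HASH_LAG1 = 0.94
B_PRICE_LAG1 = 0.21
B_PRICE_LAG4 = -0.14

# Weeks for a one-off shock to log(H) to decay to half its size, price held fixed.
half_life_weeks = math.log(0.5) / math.log(B_HASH_LAG1)

# Eventual % change in H per 1% permanent change in p, if the model held indefinitely.
long_run_price_elasticity = (B_PRICE_LAG1 + B_PRICE_LAG4) / (1 - B_HASH_LAG1)
```

Under these assumptions the half-life is about 11 weeks and the long-run elasticity is about 1.17, which is consistent with a hashrate that follows price but takes months rather than weeks to catch up.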

Although only an incremental increase in our knowledge of how the hashrate responds to the MTGOX US$-BTC price, knowing that the average hashrate tends to move slowly will help miners make decisions about the future.

Model 0 can be restated as:
Next week's average hashrate can be approximated by a linear combination of this week's average hashrate, this week's median MTGOX US$-BTC price, and the weekly median MTGOX US$-BTC price from three weeks ago, within a 95% confidence interval of +/- 15.1% of the actual hashrate for next week.
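Written with a week index t (notation introduced here for illustration, where t is the current week), the restatement is just Model 0 with every index shifted forward by one week:

```latex
\begin{aligned}
\log H_{t}   &\approx 1.74 + 0.94\,\log H_{t-1} + 0.21\,\log p_{t-1} - 0.14\,\log p_{t-4} \\
\log H_{t+1} &\approx 1.74 + 0.94\,\log H_{t}   + 0.21\,\log p_{t}   - 0.14\,\log p_{t-3}
\end{aligned}
```

Everything on the right-hand side of the second line is known at the end of week t, which is what makes a one week forecast possible.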

So model 0 can forecast the network hashrate a week ahead. As a forecasting model, this might not be terribly useful - an educated guess might do as well. So let's look at why we might want to forecast the network hashrate:

  • Provide an accurate prediction of the network hashrate that miners can use to plan strategies, e.g. for the purchase of new equipment or a change to a cheaper electricity provider.
  • Determine when the model fails significantly, indicating a factor that is not accounted for by the model.
So the next post or two will address long range forecasts of the network hashrate, and finding a less robust forecast model that will be much more sensitive to external changes - "a canary in the coal mine", if you will. I hope this sort of model can detect the effects of irrational mining or the influx of ASICs by significant failures to predict future network hashrates.

In summary:
  • Model 0: log(H) ~ 1.74 + 0.94lag1(log(H)) + 0.21lag1(log(p)) - 0.14lag4(log(p)). This will be within +/- 15.1% of the actual network hashrate with 95% confidence.
  • All terms of the model are significant.
  • The model tells us that the network hashrate is generally conservative - one week's average will remain similar to the previous week's average, and will see positive or negative changes due to the weekly median MTGOX US$-BTC price from the previous week and three weeks before that.
  • Model 0 shows that changes in the network hashrate are generally attributable to price; however, a volatile market has only a minor effect on the weekly average network hashrate.
  • A robust longer range forecast model and a more sensitive longer range forecast model will be addressed in the next post or two.





Thanks to forum members molecular (who apparently worships the FSM) and  Niko (who apparently enjoys mangosteen) whose comments on the "Price drives Difficulty" thread prompted me to look at the problem a little further. Also thanks to forum member bcpokey who is a better explainer and more eloquent than I.



Donations help give me the time to analyse bitcoin mining related issues and write these posts. If you enjoy or find them helpful, please consider a small bitcoin donation:
12QxPHEuxDrs7mCyGSx1iVSozTwtquDB3r


