
Thursday 20 August 2015

August 16th 2015 Network Statistics


Changelog:
  • grid and ggplot2 packages both updated and broke scripts. Fixed a few and then just downgraded packages back to the last working versions - I'll have to get back on to that when I'm less lagged.
Errorlog:
  • Nil.

0. Back home and still a bit jet-lagged
I've been travelling for the past few weeks and most recently was in San Francisco, meeting a few friends, visiting bitcoin startups and attending the Coinbase office warming party (great space you have there, fellas). I'm still recovering, so I'm keeping this week's comments short and sweet, except for.....

1. Average block size drops suddenly
Bitcoin XT is released and then within a couple of weeks the average block sizes drop significantly. If I was a more conspiracy-minded person, I wouldn't be so sure that this is the coincidence it clearly is.

Still, it leaves the question of who or what was responsible for that sudden and short-lived run of transactions. The average fee per block increased, as did the fee per transaction, so it wasn't done cheaply.

Of course, it could be that for some reason we're now seeing a sudden and artificial drop in transactions. At this point I'm not sure which is more likely.


2. Network mining centralisation decreasing
With the top three block makers at similar hashrates and combining to only just over 50% of the network, we might expect centralisation to be reducing generally, and that seems to have been the case for the past few months - especially for the more bitcoin-specific mining indices.

The only index not showing a long term reduction in inequality is the HI / gamma diversity, which seems to be describing a point of inflexion towards a reduction in inequality only recently.



The second and third charts also include confidence intervals for the hashrate, the mean hashrate estimate, and a 28 day forecast estimate.
  • The dashed line is the mean hashrate estimate.
  • The grey shaded area is the 95% confidence interval for the mean hashrate estimate.
  • The dotted line is the 95% confidence interval for daily hashrate averages, given the mean hashrate estimate, so 95% of the large grey dots (average daily hashrate) should be within the dotted line.
  • The blue shaded areas are the confidence intervals for the forecast.
  • Forecast confidence intervals are bootstrapped.
You'll notice that the mean forecast is not given - just the confidence intervals. The reason for this is that in the past people have focussed on the mean forecast, but I think the range of values the network hashrate could take is much more important.








Miner profitability and forecast
  • The first plot below shows the weekly miner income and cumulative miner income for the past 52 weeks. 
  • The second plot shows the weekly miner income for the past 26 weeks with an eight week forecast.
  • The third plot shows the cumulative miner income eight week forecast.
  • Forecast confidence intervals are bootstrapped.

Again, the mean forecast is not given for the same reasons I gave previously. Eight weeks forecast is possible as these are weekly summary statistics; for daily summary statistics (such as above) only four weeks forecast is possible with any accuracy.





Transaction fees
Transaction fees are often overlooked by miners but will become very important for them - as the block reward decreases, transaction fees must necessarily go some way toward ameliorating the loss in block reward.

However, as can be seen in the top facet of the second plot below, the transaction fees per block are not increasing - or even maintaining - a percentage of the block reward.

The lower facet plots the percentage of the maximum possible block size used.
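As a rough illustration of the two quantities plotted, the sketch below computes fees as a percentage of the block subsidy and block size as a percentage of the maximum. It assumes the 25 BTC subsidy and the 1 MB limit (taken here as 1,000,000 bytes) in force at the time of writing. The function names and sample figures are hypothetical, and the post does not say whether the plotted percentage is relative to the subsidy alone or to subsidy plus fees; the subsidy alone is assumed here.

```python
# Assumed constants for August 2015; names are illustrative only.
BLOCK_SUBSIDY_BTC = 25.0          # block subsidy per block
MAX_BLOCK_SIZE_BYTES = 1_000_000  # 1 MB limit, assumed to mean 10^6 bytes


def fee_percentage_of_reward(fees_btc, subsidy_btc=BLOCK_SUBSIDY_BTC):
    """Transaction fees in a block as a percentage of the block subsidy."""
    return 100.0 * fees_btc / subsidy_btc


def block_fill_percentage(size_bytes, max_size_bytes=MAX_BLOCK_SIZE_BYTES):
    """Block size as a percentage of the maximum possible block size."""
    return 100.0 * size_bytes / max_size_bytes


# Hypothetical block: 0.25 BTC in fees and 450,000 bytes in size.
print(fee_percentage_of_reward(0.25))
print(block_fill_percentage(450_000))
```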







Estimated mean and median miner hashrate
This estimate is actually the average and median percentage of the network contributed by each miner. 

Standard error has been calculated by bootstrap resampling of the data, and is shown by the shaded area.
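For readers unfamiliar with bootstrapping, here is a minimal sketch of a bootstrap standard error for the mean and the median. It is a generic textbook bootstrap and not necessarily the procedure used for the plot; the function name, the number of resamples and the sample shares are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_se(values, stat, n_resamples=2000, seed=0):
    """Bootstrap standard error of `stat` over `values`.

    Resamples `values` with replacement, recomputes the statistic each
    time, and returns the standard deviation of those estimates.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = [stat(rng.choices(values, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(estimates)


# Hypothetical per-miner shares of the network, in percent.
miner_shares = [0.002, 0.004, 0.004, 0.010, 0.015, 0.050, 0.120, 0.900]

print("SE of mean:  ", bootstrap_se(miner_shares, statistics.mean))
print("SE of median:", bootstrap_se(miner_shares, statistics.median))
```

The shaded band on a plot like this would then be the estimate plus or minus some multiple of that standard error.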





Estimated number of miners
The known number of miners is calculated using the miner hashrate distribution that some pools provide. It is shown by the dashed line, with colour indicating the percentage of the network that those miners make up.

The estimated number of miners uses a model to estimate the number of miners at pools that do not provide such data. I will be attempting to optimise the model regularly, so this week's plot may not be the same as last week's.

Standard error for the estimate has been calculated by bootstrap resampling of the model data, and is shown by the shaded area.




Inequality measures
General inequality between block makers (facet 1)
Previously, I have described inequality measures. The two general inequality measures, the Gini coefficient and the Theil index, measure inequality between block makers. They are minimised when all block makers solve a similar number of blocks over a period of time and maximised if only one of many block makers solves all the blocks for a given period of time (since we know that bitcoin mining is a stochastic process in which variance can be significant, a reasonable time period should be chosen).

The Herfindahl index theoretically captures the equivalent share that would be enjoyed by equal-sized firms in the marketplace.
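The three general measures can be sketched from a list of blocks solved per block maker. These are the standard textbook definitions (Gini coefficient, Theil T index with natural logarithms, Herfindahl index as the sum of squared shares); the post does not state which exact variants are plotted, so treat the function names and formulas below as assumptions.

```python
import math


def gini(blocks):
    """Gini coefficient: 0 when all block makers solve the same number."""
    xs = sorted(blocks)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1) / n


def theil(blocks):
    """Theil T index: 0 at equality, ln(n) when one maker solves everything."""
    mean = sum(blocks) / len(blocks)
    # A maker with zero blocks contributes nothing (0 * log 0 is taken as 0).
    terms = [(x / mean) * math.log(x / mean) for x in blocks if x > 0]
    return sum(terms) / len(blocks)


def herfindahl(blocks):
    """Herfindahl index: sum of squared shares, 1/n at equality."""
    total = sum(blocks)
    return sum((x / total) ** 2 for x in blocks)


# Hypothetical blocks solved by six block makers in some period.
blocks = [10, 20, 30, 80, 100, 120]
print(gini(blocks), theil(blocks), herfindahl(blocks))
```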

Inequality between groups: smaller block makers and larger block makers (facet 2)
I'm using two ways to illustrate inequality between the half of the network with the highest concentration of hashrate, and the half of the network with the lowest concentration of hashrate.
Mining centralisation index = 1 - mean(Sblocks) / mean(Lblocks)
Sblocks = number of blocks solved by small block makers
Lblocks = number of blocks solved by large block makers

This index is measuring the inequality between two groups: the half of the network with the highest concentration of hashrate, and the half of the network with the lowest concentration of hashrate. It can be interpreted as:

Large to small density ratio = 1 / (1 - centralisation index)

For example, an index of 80% means that the average larger pool has 1 / (1 - 0.8) = 5 times greater proportion of the network than the average smaller pool.
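A small sketch of the index and its interpretation, reproducing the 80% example. The two groups are passed in already split, since how the network is divided into halves is not spelled out here; the function names and block counts are hypothetical.

```python
import statistics


def centralisation_index(small_blocks, large_blocks):
    """1 - mean(Sblocks) / mean(Lblocks), given blocks per block maker."""
    return 1.0 - statistics.mean(small_blocks) / statistics.mean(large_blocks)


def large_to_small_ratio(index):
    """Large to small density ratio = 1 / (1 - centralisation index)."""
    return 1.0 / (1.0 - index)


# Hypothetical counts: small makers average 20 blocks, large average 100.
small = [10, 20, 30]
large = [80, 100, 120]

index = centralisation_index(small, large)
print(f"index = {index:.2f}, ratio = {large_to_small_ratio(index):.1f}")
```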

Mining centralisation index 2 = Sh * (log(Sh) - log(Sn)) + Lh * (log(Lh) - log(Ln))

Sh = Sblocks/(Sblocks + Lblocks)
Sn = No. small pools/(No. small pools + No. large pools)
Lh = Lblocks/(Sblocks + Lblocks)
Ln = No. large pools/(No. small pools + No. large pools)

This also has a range from maximum equality at 0 to maximum inequality at 1, but does not have an intuitive meaning (except that lower is better). 
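The second index can be sketched directly from the four definitions above. The base of the logarithm is not stated; natural logarithms are assumed here, and a different base only rescales the result. The function name and inputs are hypothetical.

```python
import math


def centralisation_index_2(s_blocks, l_blocks, n_small, n_large):
    """Sh * (log(Sh) - log(Sn)) + Lh * (log(Lh) - log(Ln)).

    s_blocks, l_blocks: total blocks solved by small / large block makers.
    n_small, n_large:   number of small / large pools.
    Natural logarithm is an assumption; the base is not given in the text.
    """
    total_blocks = s_blocks + l_blocks
    total_pools = n_small + n_large
    sh, lh = s_blocks / total_blocks, l_blocks / total_blocks
    sn, ln = n_small / total_pools, n_large / total_pools

    def term(h, n):
        # A group with no blocks contributes nothing (0 * log 0 taken as 0).
        return h * (math.log(h) - math.log(n)) if h > 0 else 0.0

    return term(sh, sn) + term(lh, ln)


# Hypothetical: three small pools with 60 blocks, three large with 300.
print(centralisation_index_2(60, 300, 3, 3))
```

When both groups hold the same share of blocks as they do of pools, every log term vanishes and the index is 0.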

Below, the two general and two grouped inequality measures have been plotted. The Gini coefficient and the Theil index are quite similar, and the Mining centralisation indices 1 and 2 are also quite similar.



General inequality between block makers: Gamma diversity
The Gamma diversity with q = 2 is equal to the inverse of the Herfindahl Index, and in this case equals the equivalent number of competitive firms. 
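The relationship with the Herfindahl index can be checked numerically. The sketch below uses the usual definition of diversity of order q (the Hill number), which is an assumption about the exact formula behind the plot; the function names and block counts are hypothetical.

```python
import math


def diversity(shares, q):
    """Diversity of order q: (sum of p_i^q) ^ (1 / (1 - q)).

    For q = 1 the limit exp(-sum p_i log p_i) is used.
    """
    ps = [p for p in shares if p > 0]
    if q == 1:
        return math.exp(-sum(p * math.log(p) for p in ps))
    return sum(p ** q for p in ps) ** (1.0 / (1.0 - q))


def herfindahl(shares):
    """Herfindahl index: sum of squared shares."""
    return sum(p ** 2 for p in shares)


# Hypothetical blocks per block maker, converted to shares of the total.
blocks = [10, 20, 30, 80, 100, 120]
shares = [b / sum(blocks) for b in blocks]

print(diversity(shares, 2))        # equivalent number of equal-sized makers
print(1.0 / herfindahl(shares))    # the same value via the Herfindahl index
```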



Inequality between groups: Public mining pools and non pool block makers.
Another concern many people have is that public mining pools have a decreasing share of the network. Public mining pools are reliant on miners in order to make blocks and distribute rewards, and a pool with fewer miners has greater income variance.

This means that if a pool was doing something to the block chain that miners don't like (anything from incorporating graffiti into the block chain - some of my favourite graffiti here - to Selfish Mining), miners could choose to leave the pool. Non pool block makers might have fewer restrictions on their actions, which could be a problem for the network.

There are a number of different ways to analyse this, but I went with something quite simple:

Public mining pools % network  = P  /  N
P = no. of blocks attributable to public mining pools in some period of time
N = no. of blocks solved by network in same period of time.

This is quite simple to understand. If you worry about mining pools disappearing, then the fact the line is slowly heading toward 50% won't help you sleep at night.




organofcorti.blogspot.com is a reader supported blog:

1QC2KE4GZ4SZ8AnpwVT483D2E97SLHTGCG



Created using R and various packages, especially dplyr, data.table, ggplot2 and forecast.



Thank you to blocktrail.com for use of their address data, and coincadence.com for their p2pool miner data.

Find a typo or spelling error? Email me with the details at organofcorti@organofcorti.org and if you're the first to email me I'll pay you 0.01 btc per ten errors.

Please refer to the most recent blog post for current rates or rule changes.

I'm terrible at proofreading, so some of these posts may be worth quite a bit to the keen reader.
Exceptions:
  • Errors in text repeated across multiple posts: I will only pay for the most recent errors rather than every single occurrence.
  • Errors in chart texts: Since I can't fix the chart texts (since I don't keep the data that generated them) I can't pay for them. Still, they would be nice to know about!
I write in British English.

