28th January 2015
0. Introduction
I was getting centralisation estimators ready to be included in the weekly statistics posts, and I realised that there were a few more useful centralisation measures I hadn't considered previously. If you'd like a reminder of what was covered previously, here are the previous posts on the subject:
It should be noted that, as before, "Unknown" has been defined as a single block-making entity, although this is almost certainly not the case.
1. Inequality between block makers
Previously, I discussed three useful centralisation or inequality measures. The two general inequality measures, the Gini coefficient and the Theil index, measure inequality between block makers. They are minimised when all block makers solve a similar number of blocks over a period of time and maximised if only one of many block makers solves all the blocks for a given period of time. Since bitcoin mining is a stochastic process in which variance can be significant, a reasonable time period should be chosen.
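As a rough illustration of the two general measures, here is a minimal Python sketch that computes both from a list of per-block-maker block counts. The function names are hypothetical, and dividing the Theil index by ln(n) so that it tops out at 1 is an assumption: the post does not say which normalisation was used.

```python
import math

def gini(counts):
    """Gini coefficient of non-negative block counts (0 = all equal)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one block maker and one block")
    # Standard formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def theil(counts, normalise=True):
    """Theil T index; divided by ln(n) when normalise is True (an assumption)."""
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one block maker and one block")
    mean = total / n
    # Makers with zero blocks contribute 0 (the limit of x*ln(x) as x -> 0).
    t = sum((x / mean) * math.log(x / mean) for x in counts if x > 0) / n
    if normalise and n > 1:
        return t / math.log(n)
    return t

# Hypothetical weekly block counts for four block makers.
equal = [10, 10, 10, 10]
one_winner = [0, 0, 0, 40]
print(gini(equal), theil(equal))
print(gini(one_winner), theil(one_winner))
```

With only four block makers, the Gini coefficient for the "one winner" case is 0.75 rather than 1, because its maximum for n entities is 1 - 1/n; it approaches 1 as the number of block makers grows.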
2. Inequality between groups: smaller block makers compared to larger block makers
I also described an inequality measure which I called the "Bitcoin centralisation index" which I defined as, for a given time period:
Centralisation index = 1 - mean(Sblocks) / mean(Lblocks)
Sblocks = number of blocks solved by small block makers
Lblocks = number of blocks solved by large block makers
This index is measuring the inequality between two groups: the half of the network with the highest concentration of hashrate, and the half of the network with the lowest concentration of hashrate. It can be interpreted as:
Large to small density ratio = 1 / (1 - centralisation index)
For example, an index of 80% means that the average larger pool has 1 / (1 - 0.8) = 5 times the proportion of the network that the average smaller pool has.
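The index and its interpretation can be sketched in a few lines of Python. The function names and block counts below are hypothetical, and the two groups are passed in already split, since the exact rule for dividing block makers into "small" and "large" is not spelled out here.

```python
def centralisation_index_1(small_blocks, large_blocks):
    """1 - mean(Sblocks) / mean(Lblocks) for two groups of block makers."""
    mean_small = sum(small_blocks) / len(small_blocks)
    mean_large = sum(large_blocks) / len(large_blocks)
    return 1 - mean_small / mean_large

def large_to_small_ratio(index):
    """Large to small density ratio = 1 / (1 - centralisation index)."""
    return 1 / (1 - index)

# Hypothetical block counts over one period.
small = [4, 6]     # mean 5 blocks
large = [20, 30]   # mean 25 blocks

index = centralisation_index_1(small, large)
print(round(index, 3))                        # index of about 0.8
print(round(large_to_small_ratio(index), 3))  # ratio of about 5
```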
Why not simply redefine the index to make it more intuitive? The index should be easily relatable to other indices describing similar inequalities. The Gini coefficient and the Theil index both range from maximum equality at 0 to maximum inequality at 1, so the centralisation index has been defined so that it does too.
Reading a bit more on entropy measures, I realised I could define another large/small block maker inequality index. In a fit of creativity, I renamed the first centralisation index "Mining centralisation index 1" and named the new centralisation index "Mining centralisation index 2".
Mining centralisation index 2 = Sh * (log(Sh) - log(Sn)) + Lh * (log(Lh) - log(Ln))
Sh = Sblocks/(Sblocks + Lblocks)
Sn = No. small pools/(No. small pools + No. large pools)
Lh = Lblocks/(Sblocks + Lblocks)
Ln = No. large pools/(No. small pools + No. large pools)
This also has a range from maximum equality at 0 to maximum inequality at 1, but does not have an intuitive meaning (except that lower is better). I think that it may be a better measure of entropy across the two groups than index 1.
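A minimal Python sketch of index 2 follows. The function name and block counts are hypothetical. The base of the logarithm is not stated above; base 2 is assumed here because, with two groups of equal size, it gives a maximum of exactly 1, which matches the stated range.

```python
import math

def centralisation_index_2(small_blocks, large_blocks, base=2):
    """Sh*(log(Sh) - log(Sn)) + Lh*(log(Lh) - log(Ln)); log base is an assumption."""
    s_total, l_total = sum(small_blocks), sum(large_blocks)
    n_small, n_large = len(small_blocks), len(large_blocks)

    sh = s_total / (s_total + l_total)   # share of blocks, small group
    lh = l_total / (s_total + l_total)   # share of blocks, large group
    sn = n_small / (n_small + n_large)   # share of block makers, small group
    ln = n_large / (n_small + n_large)   # share of block makers, large group

    def term(h, n):
        # A group with no blocks contributes 0 (limit of h*log(h) as h -> 0).
        if h == 0:
            return 0.0
        return h * (math.log(h, base) - math.log(n, base))

    return term(sh, sn) + term(lh, ln)

# Hypothetical block counts over one period.
print(centralisation_index_2([5, 5], [5, 5]))    # groups equal: index is 0
print(centralisation_index_2([0, 0], [10, 10]))  # large group solves everything
print(centralisation_index_2([4, 6], [20, 30]))  # somewhere in between
```

If the two groups contain different numbers of block makers, the maximum under this assumption is -log2(Ln) rather than 1.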
Below, the two general and two grouped inequality measures have been plotted. The Gini coefficient and the Theil index are quite similar, and the Mining centralisation indices 1 and 2 are also quite similar. I like the fact that index 1 has an intuitive meaning, but index 2 seems to be more sensitive.
3. Inequality between groups: Public mining pools and non-pool block makers
Another concern many people have is that public mining pools have a decreasing share of the network. Public mining pools are reliant on miners in order to make blocks and distribute rewards, and a pool with fewer miners has greater income variance.
This means that if a pool were doing something to the block chain that miners don't like (anything from incorporating graffiti into the block chain - some of my favourite graffiti here - to Selfish Mining), miners could choose to leave the pool. Non-pool block makers might have fewer restrictions on their actions, which could be a problem for the network.
There are a number of different ways to analyse this, but I went with something quite simple:
Public mining pools % network = P / N
P = no. of blocks attributable to public mining pools in some period of time
N = no. of blocks solved by network in same period of time.
This is quite simple to understand. If you worry about mining pools disappearing, then the fact that the line is slowly heading toward 50% won't help you sleep at night.
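For completeness, the P / N calculation can be sketched as below. The function name, the block maker names and the block counts are all hypothetical; deciding which block makers count as public pools is left to the caller.

```python
def public_pool_share(blocks_by_maker, public_pools):
    """P / N: fraction of a period's blocks attributable to public mining pools."""
    n = sum(blocks_by_maker.values())
    if n == 0:
        raise ValueError("no blocks in this period")
    p = sum(blocks for maker, blocks in blocks_by_maker.items()
            if maker in public_pools)
    return p / n

# Hypothetical block counts for one period.
blocks = {"PoolA": 30, "PoolB": 20, "PrivateFarm": 35, "Unknown": 15}
share = public_pool_share(blocks, public_pools={"PoolA", "PoolB"})
print(f"{share:.0%}")
```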
4. Inequality within groups: Miners at public mining pools
In the same way that the Gini coefficient and the Theil index measure general inequality between block makers, they can also measure the same thing within public mining pools. In this case they measure the inequality between miners. These indices are much higher for miners than for mining pools, which means a high percentage of an arbitrary pool's hashrate is due to a few large miners. In fact, I have not been able to find any example of a sustained Gini coefficient of more than 95% in other systems. In countries with notoriously unequal incomes it is rarely above 65% (Seychelles, 2007). Only educational inequality in the poorest countries comes close, at 92% (Mali, 1990). There are a large number of bitcoin miners with near (or less than) zero income, and a small few who actually make a living.
I think this also helps explain the instability of mining pool rankings. Hashrates at mining pools change much more than one might expect. For example, pools which previously touched the magical 50% of the network have either died or have less than ten percent of the network.
It only takes a few large miners leaving to disproportionately affect a pool, increasing variance. For pools with luck-based reward methods, increased variance encourages smaller miners (who might not feel like they can manage variance) to move to a larger pool, or a pool with a luck-free reward method. Pools with a luck-free reward method would be less affected by the loss of a few large miners, but those pools face increased risk and can have a somewhat precarious position regardless.
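To put a rough number on the variance effect, the sketch below assumes that the blocks a pool finds in a period follow a Poisson distribution, which is a common simplification and not something established in this post. Under that assumption the standard deviation relative to the mean is 1 / sqrt(expected blocks). The function name and the pool shares are hypothetical.

```python
import math

BLOCKS_PER_DAY = 144  # expected network blocks per day at a 10 minute target

def relative_variability(network_share, days=1.0):
    """Std dev / mean of blocks found, assuming Poisson block discovery."""
    expected_blocks = network_share * BLOCKS_PER_DAY * days
    return 1 / math.sqrt(expected_blocks)

# Hypothetical pool that drops from 10% to 5% of the network
# after a few large miners leave.
before = relative_variability(0.10)
after = relative_variability(0.05)
print(round(before, 3), round(after, 3))
```

Halving the pool's share multiplies the relative variability by sqrt(2), so the remaining miners at a luck-based pool see noticeably bumpier payouts.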
5. Summary
Inequality (or centralisation) in bitcoin mining can be analysed in four different ways:
- Inequality between block makers
  - Gini coefficient and Theil index
- Inequality between block makers with the highest concentration of hashrate and those with the lowest concentration of hashrate
  - Mining centralisation indices 1 and 2
- Inequality between public mining pools and non-pool block makers
  - Public mining pool percentage of network
- Inequality within mining pools (between miners)
  - Gini coefficient and Theil index
You don't need to understand the reasons why the indices work, only what they might be warning you of.
- The first two groups of inequality measures suggest that right now the network is becoming less centralised. However, the percentage of the network attributable to mining pools is steadily decreasing. Together, these measures suggest that the network hashrate is becoming more evenly spread between block makers (less centralised) but becoming more concentrated in the hands of private entities (possibly more centralised).
- The last measures indicate significant inequality between miners using mining pools. There are many miners with low hashrate devices and a few with very large hashrate devices which make up the bulk of a pool's hashrate. This means, for example, that pool operators can expect a large variation in hashrate over time and need to be able to cope with users with very high hashrates. The inequality also means that mining equipment manufacturers can better decide how to market and cost their devices depending on which market segment they target.
organofcorti.blogspot.com is a reader supported blog:
1QC2KE4GZ4SZ8AnpwVT483D2E97SLHTGCG
Find a typo or spelling error? Email me with the details at organofcorti@organofcorti.org and if you're the first to email me I'll pay you 0.01 btc per ten errors.
Please refer to the most recent blog post for current rates or rule changes.
I'm terrible at proofreading, so some of these posts may be worth quite a bit to the keen reader.
Exceptions:
- Errors in text repeated across multiple posts: I will only pay for the most recent errors rather than every single occurrence.
- Errors in chart texts: Since I can't fix the chart texts (since I don't keep the data that generated them) I can't pay for them. Still, they would be nice to know about!
I write in British English.