
Monday, 16 March 2015

March 15th 2015 Mining Pool Statistics



Changelog:
  • Nil.

Errorlog:
  •  Slush's block data is no longer scrapable, so I now need to use their JSON feed. Unfortunately that doesn't include txid or blockhash, so I can't independently check the validity of each block. Luckily, Slush's team are very good at what they do and have helped me lots in the past (eg setting up an anonymised hashrate distribution feed) so I have high hopes of getting Slush's stats back up and running by next week.
  • Triplemining solved a block after 888 hours. Unfortunately, the pool has been unable to calculate exactly how many difficulty-1 equivalent shares were submitted before the block solved, so I can't calculate hashrate or luck estimates.
  • Kano's pool stopped updating block data after 9th March.

Notifications:
  • Slush's, Triplemining and Kano.is pool data will be incorrect or missing (see above).


0. This is what I dislike about monitoring mining pools
Even the best pools have downtime or change data formats without notice, which makes monitoring pools difficult and time-consuming. I don't want to spend my Sunday evenings bug fixing, figuring out which pool's data is wrong and why.

What is needed? Standardised datasets, so pools can be compared to each other.

This doesn't have to be complicated; in fact the simpler the better. The easier it is for anyone to access the data, the more analysts will be able to work with the data, and the safer you (as miners) will be. So for block data I'd suggest a simple .csv output of:

height, unixtime start, unixtime end, difficulty 1 shares, blockhash

That's all that is required - all other data (number of confirmations, whether it is orphaned, the pool hashrate, etc) can be derived from this data or from the blockchain (eg network mining difficulty).
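
As a rough illustration of how little a consumer of such a feed would need, here is a minimal Python sketch that reads rows in the suggested column order and derives the average pool hashrate for each round. The function names and the sample row are hypothetical (the blockhash is a placeholder), and the 2^32 hashes per difficulty-1 share is the usual expected value rather than anything pool specific.

```python
import csv
import io

HASHES_PER_DIFF1_SHARE = 2 ** 32  # expected hashes needed per difficulty-1 share


def parse_rounds(csv_text):
    """Parse rows of: height, unixtime start, unixtime end, difficulty 1 shares, blockhash."""
    rounds = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        fields = [field.strip() for field in row]
        # Fail loudly on a short row or an empty field rather than skipping it silently.
        if len(fields) != 5 or "" in fields:
            raise ValueError(f"line {line_no}: expected 5 non-empty fields, got {row!r}")
        height, start, end, shares, blockhash = fields
        rounds.append({
            "height": int(height),
            "start": int(start),
            "end": int(end),
            "shares": float(shares),
            "blockhash": blockhash,
        })
    return rounds


def round_hashrate_ghps(rnd):
    """Average pool hashrate over one round, in GH/s."""
    duration = rnd["end"] - rnd["start"]
    if duration <= 0:
        raise ValueError(f"block {rnd['height']}: round duration must be positive")
    return rnd["shares"] * HASHES_PER_DIFF1_SHARE / duration / 1e9


# Made-up sample row; the blockhash is a placeholder, not a real block.
sample = "347000, 1426400000, 1426400600, 1000000, " + "0" * 56 + "deadbeef\n"
for rnd in parse_rounds(sample):
    print(rnd["height"], round(round_hashrate_ghps(rnd), 1), "GH/s")
```

Orphan status and network difficulty are deliberately not in the feed: both can be looked up from the blockchain using the height and blockhash.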

For average user hashrate I'd use a six hour averaging window, and report it as a simple column of hashrates, eg:

User ghps
3175.1643338
247.6406560
2.2142943
20.1469758
0.1049881
....
....


I'd like to invite mining pools and miners to open dialogues on bitcointalk.org or reddit and try to come to an agreement. Transparency is very important in this industry, but there is significant difficulty involved in writing and updating scripts for every pool. If only a few people are monitoring pools because of the time and difficulty involved in doing so, I have no doubt some problem at some pool will be missed and some of you will lose a large amount of coin because of it.

In the past I've posted requests for standardised data APIs and methods, but nothing came of it - see if you (as a miner or a pool op) can do better.



Edit: 
  • The pool op for Kano.is helped me figure out the problem - I wasn't getting updates because my script silently failed if a field in the dataset was empty. This is the default behaviour, and since I wasn't expecting an empty field I left it as such. Now fixed.
  • Pavel, my contact at Slush's pool, added the block hash field as soon as he got my email, so that will be ready for next week too.

 


User account hashrate distributions still needed
Now that you can see more of the interesting analyses I can perform if I have this data, you might be even more keen to encourage your pool to provide a "Hall of Fame" feed. To reiterate, I need user account hashrates in order to estimate a number of different network statistics, and to do that I need user account hashrates averaged over at least an hour, preferably several hours or more. The data can be anonymised - it's just the user hashrates I need.


Explanation of the tables and charts

Pool reported block history statistics. This table lists all statistics that can be derived from the number of blocks a hashrate contributor has solved for the past week using all solved blocks - both valid and orphaned - and difficulty 1 shares per round. 
  • A much more accurate estimate of the hashrate; confidence intervals are unnecessary.
  • Orphan races lost, and percentage of solved blocks that were not added to the blockchain.
  • "Luck" is the usual difficulty 1 equivalent shares per round / mining difficulty, or (equivalently) accepted shares / expected shares.
  • CDF: The cumulative distribution function (CDF) gives the percentage of the time that accepted shares / expected shares would be less than the calculated value, given the number of valid + invalid blocks.
  • Pool profitability compares variables such as total number of shares in a week and total reward (including transactions) in a week with the expected reward per share. Pool fee is not included.
Since BTC Guild doesn't report shares per block but does report transaction hashes for all blocks, luck cannot be calculated but orphaned blocks can be enumerated. Pools that don't have a public pool interface cannot be included.
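
To make the "luck" and CDF columns concrete, here is a minimal Python sketch. It rests on an assumption that is mine and not stated above: that difficulty 1 shares per round divided by difficulty is approximately exponentially distributed with mean 1, so the total over n solved blocks follows an Erlang distribution with shape n. The function names are hypothetical and this is an illustration of the idea, not the script used to build the table.

```python
import math


def luck(total_shares, difficulties):
    """Accepted shares / expected shares, where each solved block
    (valid or orphaned) is expected to take 'difficulty' shares."""
    return total_shares / sum(difficulties)


def erlang_cdf(n_blocks, x):
    """P(X <= x) for X ~ Erlang(shape=n_blocks, rate=1).

    Uses the closed form 1 - exp(-x) * sum_{k<n} x^k / k!, which is
    fine for a few hundred blocks; use a library incomplete gamma
    function for much larger n.
    """
    term = 1.0
    total = 1.0
    for k in range(1, n_blocks):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def luck_cdf(luck_value, n_blocks):
    """Probability that luck over n_blocks would be below luck_value
    (assumes roughly constant difficulty over the period)."""
    return erlang_cdf(n_blocks, n_blocks * luck_value)


# Hypothetical pool: 40 blocks solved with 10% more shares than expected.
print(round(luck_cdf(1.1, 40), 3))
```

A CDF near 0.5 is unremarkable; values very close to 0 or 1 are the ones worth a second look, and the more blocks a pool has solved, the smaller the deviation from 1.0 needed to reach them.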



Average hashrate per solved block (valid + invalid)
Hashrates are calculated from the pool reported difficulty-1 equivalent shares per round and the pool reported block solve times for all solved blocks, both valid and invalid. Note that BTC Guild is not included since the difficulty 1 equivalent shares per round data is not reported; instead use BTC Guild's hashrate chart which has matched my past estimates quite well and which I regard as accurate.


Pool profitability
This simple pool profitability measure compares variables such as total number of shares in a week and total reward (including transactions) in a week with the expected reward per share. Pool fee is not included, but this is a good basis on which to compare pools, as long as you're aware of pool fees, whether transaction fees are paid to you, and whether or not the pool is paying for orphaned blocks.

Obviously only relevant if the reward method is not PPS; however the charts can also be interpreted as the profitability of the pool, so it might give you some insight into the financial health of a PPS pool.
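
One way to read the comparison above, sketched in Python. The assumptions here are mine: a single network difficulty for the whole week, and an expected reward per share of block subsidy / difficulty, so that transaction fees push the ratio above 1 and orphans pull it below. The function names and the figures in the example are hypothetical.

```python
def expected_reward_per_share(block_subsidy_btc, difficulty):
    """Expected BTC earned per difficulty-1 share, subsidy only."""
    return block_subsidy_btc / difficulty


def pool_profitability(total_reward_btc, total_shares, block_subsidy_btc, difficulty):
    """Actual reward per share relative to expected reward per share.

    1.0 means the pool earned exactly what its shares were expected to
    earn; pool fees are not taken into account.
    """
    if total_shares <= 0:
        raise ValueError("total_shares must be positive")
    actual_per_share = total_reward_btc / total_shares
    return actual_per_share / expected_reward_per_share(block_subsidy_btc, difficulty)


# Hypothetical week: two blocks' worth of shares, two blocks found,
# 0.5 BTC of transaction fees on top of a 25 BTC subsidy per block.
difficulty = 4.0e10  # made-up round number
print(round(pool_profitability(50.5, 2 * difficulty, 25.0, difficulty), 4))
```

If difficulty changes during the week, the expected reward would need to be summed per round rather than computed from one difficulty value.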



Density of orphaned blocks
This chart shows the density of orphaned blocks per pool, as a function of blocks solved by that pool. The fringe indicates the actual occurrences of the orphaned blocks, and the colour of the line and fringe indicate the approximate date.

Some orphan data may be missing from Polmine. The rest seem to be correct.





Pool user hashrate and combined user hashrate densities
The top facet of this chart shows the proportion of user accounts with a given hashrate - the thicker the "violin" the greater the density of user accounts with a particular hashrate.

The bottom facet is the same data, weighted by hashrate. In effect, it shows what proportion of the pool's hashrate is supplied by particular hashrates. The area of the "violins" is proportional to their total hashrate.

Note that some pools average the hashrate over twenty-four hours, some over an hour or more and some over only fifteen minutes, so expect some variance in the results.


Gini coefficient for miner hashrates
The Gini coefficient measures inequality, as is further discussed here. In the plot below, it is measuring inequality within public mining pools. The lower the coefficient, the less inequality in the pool.

These indices are much higher than for most things for which the Gini coefficient is calculated. For example, in countries with notoriously unequal incomes it is rarely above 65% (Seychelles, 2007), while educational inequality in the poorest countries comes close, at 92% (Mali, 1990). There are a large number of bitcoin miners with near (or less than) zero income, and a small few who actually make a living.
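
For readers who want to reproduce the coefficient from a column of user hashrates, here is a minimal Python sketch using the standard rank-based formula for the Gini coefficient. The function name is hypothetical, the first list reuses the sample hashrates shown earlier in this post, and the second list is made up.

```python
def gini(hashrates):
    """Gini coefficient of a list of non-negative user hashrates.

    0 means every account has the same hashrate; values near 1 mean
    almost all of the hashrate belongs to very few accounts.
    """
    values = sorted(hashrates)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0 or values[0] < 0:
        raise ValueError("need at least one positive and no negative hashrates")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


pool_a = [3175.1643338, 247.6406560, 2.2142943, 20.1469758, 0.1049881]
pool_b = [12.0, 15.5, 9.8, 11.2]  # made-up, fairly even pool

print("pool a:", round(gini(pool_a), 3))
print("pool b:", round(gini(pool_b), 3))
# Pooling the accounts of several pools gives a combined figure.
print("combined:", round(gini(pool_a + pool_b), 3))
```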

"All combined" is the Gini coefficient for miners at all pools that can be monitored, combined. It is more affected by larger pools and less affected by smaller pools.











organofcorti.blogspot.com is a reader supported blog:

1QC2KE4GZ4SZ8AnpwVT483D2E97SLHTGCG



Created using R and various packages, especially dplyr, data.table, ggplot2 and forecast.

Thank you to blockchain.info and coinometrics.com for use of their transaction and address data, and coincadence.com for their p2pool miner data.

Find a typo or spelling error? Email me with the details at organofcorti@organofcorti.org and if you're the first to email me I'll pay you 0.01 btc per ten errors.

Please refer to the most recent blog post for current rates or rule changes.

I'm terrible at proofreading, so some of these posts may be worth quite a bit to the keen reader.
Exceptions:
  • Errors in text repeated across multiple posts: I will only pay for the most recent errors rather than every single occurrence.
  • Errors in chart texts: Since I can't fix the chart texts (since I don't keep the data that generated them) I can't pay for them. Still, they would be nice to know about!
I write in British English.












4 comments:

  1. Dear Organ of Corti,

    for many months, mining pools have consistently had "bad luck" according to your graphs.

    Have you looked into the reason for this? It cannot be explained only by stale blocks, since these only account for 1-2% per week.

    Thank you!

  2. I'm not seeing that across all pools, just a few and we're looking into it as time permits.

  3. Thanks for looking into it! Much appreciated.

    What I reacted to was that the total luck for all pools you look at has been 1.1-1.2. Pool by pool there is a statistical variance as expected, so they have both bad luck weeks and good luck weeks. As you say it cannot be seen across all pools, but the total at the bottom of the graph should be close to 1.0, right?

    Again, thank you!

  4. What you say is true, but the total is more affected by larger pools. So if only one or two large or medium pools are very unlucky, they will have a significant effect on the total.


Comments are switched off until the current spam storm ends.