Welcome, miners.
Changelog:
- Everything - see below
- I haven't rebuilt this part yet - TBA.
- I haven't found any - please tell me if you find something.
0. Update almost complete
The reasons behind working on a new method of gathering and analysing network hashrate contribution statistics for the last few months:
- Orphaned blocks were often incorrectly reported by pools.
- Pools report block history in many different ways: some don't report block heights, some don't report block creation times, some report block hashes, and some report only the transaction hash of the generation transaction.
- I had no simple way to integrate new hashrate sources.
- I had no way to attribute blocks to generation addresses.
This means that I have an accurate record of orphaned blocks, and it also means that I can use the block height and timestamp recorded on the blockchain to fill in the data that some pools do not report.
Only orphaned blocks can't be assigned a timestamp in this way. For pools which do not report the creation time of an orphaned block, I use the timestamp of the block that won the orphan race. For pools reporting neither the block height nor the time of orphaned blocks, I use the next time the pool solves a block. In these latter two cases some inaccuracy may occur.
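The fallback order described above can be sketched roughly as follows. This is only an illustration: the `OrphanReport` record, the `chain_time_by_height` lookup and the `next_pool_block_time` argument are hypothetical names, not part of the actual data collection code.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class OrphanReport:
    """An orphaned block as reported by a pool (hypothetical record layout)."""
    reported_time: Optional[int] = None  # Unix time, if the pool reports one
    height: Optional[int] = None         # block height, if the pool reports one


def orphan_timestamp(
    orphan: OrphanReport,
    chain_time_by_height: Dict[int, int],
    next_pool_block_time: Optional[int] = None,
) -> Tuple[Optional[int], str]:
    """Return (timestamp, source) for an orphaned block.

    Order of preference:
      1. the creation time reported by the pool;
      2. the timestamp of the block that won the orphan race (same height);
      3. the time of the next block solved by the same pool.
    """
    if orphan.reported_time is not None:
        return orphan.reported_time, "pool-reported"
    if orphan.height is not None and orphan.height in chain_time_by_height:
        return chain_time_by_height[orphan.height], "winning block"
    if next_pool_block_time is not None:
        return next_pool_block_time, "next pool block"
    return None, "unknown"
```

Returning the source alongside the timestamp makes it easy to flag the two less accurate cases later on.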
Several important notes:
- Several pools have missing historical data or are missing from the historical data completely. This only affects the extent of the historical data charts (for example Figure 2) but not the accuracy - except for some early values of the top three pools' combined hashrates.
- I have requested this data from several pools but have not yet heard back, except from BTC Guild which doesn't have the data available anymore, and BTC Mine which will deliver their complete history soon.
- I won't be including pool-hopping charts unless pool-hopping becomes common once more.
1. Weekly statistics using blockchain data
Using only blockchain data, the following statistics can be reported:
- An estimate of the hashrate (with the upper and lower 95% confidence interval bounds).
- Valid blocks solved.
- The percentage of network blocks solved.
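As a minimal sketch of how a hashrate estimate and its 95% bounds could be derived from a block count alone: assume the number of blocks solved in a week is Poisson distributed, and that each block takes about difficulty x 2^32 hashes on average. The exact (Garwood) Poisson interval used here is one common choice and not necessarily the method behind the tables; the difficulty in the example is an assumed value picked for illustration.

```python
import math

SECONDS_PER_WEEK = 7 * 24 * 60 * 60
HASHES_PER_DIFF1_SHARE = 2 ** 32  # approximate expected hashes per difficulty 1 share


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if lam <= 0:
        return 1.0
    log_lam = math.log(lam)
    total = sum(math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, total)


def _bisect_decreasing(f, lo, hi, iterations=200):
    """Find the root of a function that decreases from positive to negative."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_interval(n, conf=0.95):
    """Exact two-sided interval for the Poisson mean, given n observed blocks."""
    alpha = 1 - conf
    ceiling = n + 10 * math.sqrt(n + 1) + 10  # safely above the upper bound
    lower = 0.0
    if n > 0:
        lower = _bisect_decreasing(
            lambda lam: poisson_cdf(n - 1, lam) - (1 - alpha / 2), 0.0, ceiling)
    upper = _bisect_decreasing(
        lambda lam: poisson_cdf(n, lam) - alpha / 2, 0.0, ceiling)
    return lower, upper


def hashrate_from_blocks(blocks, difficulty, seconds=SECONDS_PER_WEEK):
    """Average hashrate (hashes per second) implied by a number of solved blocks."""
    return blocks * difficulty * HASHES_PER_DIFF1_SHARE / seconds


def hashrate_estimate(blocks, difficulty, seconds=SECONDS_PER_WEEK, conf=0.95):
    """Return (lower bound, estimate, upper bound) in hashes per second."""
    low, high = poisson_interval(blocks, conf)
    return (hashrate_from_blocks(low, difficulty, seconds),
            hashrate_from_blocks(blocks, difficulty, seconds),
            hashrate_from_blocks(high, difficulty, seconds))


if __name__ == "__main__":
    ASSUMED_DIFFICULTY = 1_789_546_951  # illustrative value, not taken from the post
    low, est, high = hashrate_estimate(9, ASSUMED_DIFFICULTY)
    print(f"{est / 1e12:.0f} TH/s (95% bounds {low / 1e12:.0f} to {high / 1e12:.0f} TH/s)")
```

With only a handful of blocks the bounds are very wide, which is why the share-based estimate is preferred whenever a pool reports shares per round.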
2. Weekly statistics using pool reported data
The weekly statistics that can be calculated using all solved blocks - both valid and orphaned - and difficulty 1 shares per round are:
- A much more accurate estimate of the hashrate; confidence intervals are unnecessary.
- Orphan races lost, and percentage of solved blocks that were not added to the blockchain.
- "Luck" is the usual difficulty 1 equivalent shares per round / mining difficulty, or (equivalently) accepted shares / expected shares.
- CDF: The cumulative distribution function (CDF) gives the percentage of the time that accepted shares / expected shares would be less than the calculated value, given the number of valid + invalid blocks.
- Bitcoin per Gigashare. This figure is not an indicator of how much a miner should have expected per one billion difficulty 1 shares (or one million difficulty 1000 shares, etc), since it doesn't take into account the reward method or fees charged. Rather, it should be considered as a "luck" index that also incorporates the number of orphaned blocks and the current reward per block.
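The share-based statistics listed above can be sketched as follows. Two assumptions are worth stating loudly: the CDF treats shares per round / difficulty as approximately exponentially distributed, so the total over n rounds follows an Erlang distribution, and the Bitcoin per Gigashare formula is a guess at the definition based on the description above. All function names are illustrative.

```python
import math

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def hashrate_from_shares(total_shares, seconds=SECONDS_PER_WEEK):
    """Average hashrate implied by difficulty 1 equivalent shares submitted."""
    return total_shares * 2 ** 32 / seconds


def luck(total_shares, blocks, difficulty):
    """Accepted shares / expected shares over all solved blocks (valid + orphaned).

    Values below 1 mean fewer shares than expected were needed per block.
    """
    return total_shares / (blocks * difficulty)


def luck_cdf(total_shares, blocks, difficulty):
    """Probability that luck would be lower than observed, for this many blocks.

    Assumes total_shares / difficulty ~ Erlang(shape=blocks, rate=1).
    """
    x = total_shares / difficulty
    if x <= 0:
        return 0.0
    log_x = math.log(x)
    # Erlang CDF: 1 - sum_{k=0}^{n-1} exp(-x) * x^k / k!
    tail = sum(math.exp(k * log_x - x - math.lgamma(k + 1)) for k in range(blocks))
    return max(0.0, 1.0 - tail)


def btc_per_gigashare(valid_blocks, reward_btc, total_shares):
    """Bitcoin earned from valid blocks per 10^9 difficulty 1 shares (assumed definition)."""
    return valid_blocks * reward_btc / (total_shares / 1e9)
```

For example, a pool that solved 10 blocks with exactly the expected number of shares has a luck of 1.0 and a CDF of about 0.54 under this model, since the Erlang distribution is skewed.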
3. Reused but unknown generation addresses
Unknown generation addresses that are not reused probably belong to solo miners or private mining concerns that don't have shareholders wanting to follow transactions. However, reused addresses are probably from hash contributors that do not wish to remain anonymous. These need to be identified so they can be removed from the "Unknown" group. I'm not interested in identifying those who wish to remain completely anonymous, so I'm not trying to trace originating IP addresses (as Blockchain.info does).
Unknown recurring generation address | Blocks solved this week | Percentage of network | Percentage of unknown | Estimate of hashrate | Blocks solved ever |
---|---|---|---|---|---|
1GuMujABuc8kvzDTyVJFpcf4vszPUgsjiU | 9 | 0.72 % | 10.98 % | 114 Thps | 17 |
18mhP1YWE2cheiGezjpbRqY29wFLdGWtN4 | 7 | 0.56 % | 8.54 % | 89 Thps | 7 |
136eiekKeXJUTbbwj6NU4vP7AnVX5RjUx2 | 5 | 0.40 % | 6.10 % | 64 Thps | 5 |
17SptDvV9jovDMXQPR7PVvNDcjnxuaf4ga | 3 | 0.24 % | 3.66 % | 38 Thps | 6 |
16d11Mgzwvge21tZxaq6pwXnbrPDzbtVNH | 2 | 0.16 % | 2.44 % | 25 Thps | 2 |
1G9QNTLvP3PbhDf6r9AhtwYQBxbt8rnnu8 | 2 | 0.16 % | 2.44 % | 25 Thps | 2 |
1NcYzANW5z1WF1ssnEQG3FpxBhAwettRxw | 2 | 0.16 % | 2.44 % | 31 Thps | 2 |
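A table like the one above could be assembled roughly as follows, assuming a list holding the generation address of every block not attributed to a known pool. The function name and row layout are illustrative. The totals used in the example (1250 network blocks and 82 unknown blocks) are not stated in the post; they are inferred from the percentages in the first row of the table.

```python
from collections import Counter


def reused_unknown_addresses(unknown_block_addresses, network_blocks, min_blocks=2):
    """Summarise generation addresses that appear on more than one unknown block.

    unknown_block_addresses: one generation address per unattributed block.
    network_blocks: total blocks added to the blockchain in the same period.
    """
    counts = Counter(unknown_block_addresses)
    unknown_total = len(unknown_block_addresses)
    rows = []
    for address, blocks in counts.most_common():
        if blocks < min_blocks:
            continue  # single-use addresses stay in the anonymous "Unknown" group
        rows.append({
            "address": address,
            "blocks": blocks,
            "pct_network": round(100 * blocks / network_blocks, 2),
            "pct_unknown": round(100 * blocks / unknown_total, 2),
        })
    return rows


if __name__ == "__main__":
    # Placeholder addresses: two reused ones plus 66 single-use ones (82 blocks).
    blocks = ["addr_a"] * 9 + ["addr_b"] * 7 + [f"solo_{i}" for i in range(66)]
    for row in reused_unknown_addresses(blocks, network_blocks=1250):
        print(row)
```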
4. Percentage of network blocks
The only change here is that I'm not bothering with percentage of the network hashrate, which was inaccurate due to an increase in the number of hashrate sources with estimated hashrates.
5. 51% attack chart
There are several changes here compared to the previous version: hashrates are all estimated from blocks solved, and the history goes back to the earliest date my data contains three known pools. As previously mentioned, some pool data may be missing from the earliest data points.
6. Network percentage history of the current top ten contributors
Data is calculated from the number of blocks each contributor added to the blockchain during the week, rather than using hashrates and estimated hashrates, and because of this should be more accurate with a longer history. The other changes are cosmetic.
7. Average hashrate per solved block (valid + invalid)
The only changes here are cosmetic:
- Changes to facet labels and background.
- The last two weeks are included rather than just the last week.
- BTC Guild is not included. This is because I've simplified the data that I scrape and collect, to improve the robustness of the data collection. However, BTC Guild has a very good hashrate chart which has matched my past estimates quite well and which I regard as accurate.
- The network hashrate per 144 blocks is also not included, since I think it is of limited use to readers more interested in pools.
8. Pool "luck"
I have wanted to remove either the shares per round / network Difficulty chart or the CDF chart, since they measure the same thing. However, I couldn't make up my mind as to which would be the better chart to keep. In the end I came up with something I think is better than both.
This chart is new and might look a little confusing at first, but is actually simple and intuitive to follow once you get the hang of it. The orange dots are the usual accepted shares / expected shares (equivalently, shares per round / network Difficulty). The background colours are accepted shares / expected shares confidence intervals for the number of blocks solved for the week. The greater the number of blocks solved (the higher the percentage of the network) the narrower the bounds.
The "luck" data points should be outside the upper or lower boundaries only rarely. Many data points outside this range indicate unusual and unlikely "luck".
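To illustrate why the bounds narrow as more blocks are solved, here is a rough sketch that computes 95% bounds for accepted shares / expected shares. It assumes total shares / difficulty over n blocks is Erlang distributed with shape n; that is a common approximation and not necessarily the calculation behind the chart.

```python
import math


def erlang_cdf(x, n):
    """P(X <= x) for X ~ Erlang(shape=n, rate=1), summed in log space."""
    if x <= 0:
        return 0.0
    log_x = math.log(x)
    tail = sum(math.exp(k * log_x - x - math.lgamma(k + 1)) for k in range(n))
    return max(0.0, 1.0 - tail)


def luck_bounds(blocks, conf=0.95):
    """Lower and upper bounds for accepted / expected shares given blocks solved."""
    alpha = 1 - conf

    def quantile(p):
        # Bisection on the increasing CDF; the ceiling is far above any 95% bound.
        lo, hi = 0.0, blocks + 10 * math.sqrt(blocks) + 20
        for _ in range(200):
            mid = (lo + hi) / 2
            if erlang_cdf(mid, blocks) < p:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2 / blocks  # rescale total shares / D to per-block luck

    return quantile(alpha / 2), quantile(1 - alpha / 2)


if __name__ == "__main__":
    for n in (1, 10, 100):
        low, high = luck_bounds(n)
        print(f"{n:>3} blocks: {low:.3f} to {high:.3f}")
```

A pool solving a single block in a week can land almost anywhere, while a pool solving a hundred blocks should stay fairly close to 1.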
Data only goes back twelve months at most - any more data points than this become hard to read, and recent data is most important.
9. User hashrate and combined user hashrate densities
Cosmetic changes only.
10. Comments?
I hope the new and improved charts and tables are useful for you. Let me know what you think about the changes - are they helpful? Any requests? Also, please let me know if there are any obvious formatting or spelling errors in the new charts.
Cheers all!
organofcorti.blogspot.com is a reader supported blog:
1QC2KE4GZ4SZ8AnpwVT483D2E97SLHTGCG
Find a typo or spelling error? Email me with the details at organofcorti@organofcorti.org and if you're the first to email me I'll pay you per ten errors:
Please refer to the most recent blog post for current rates or rule changes.
I'm terrible at proofreading, so some of these posts may be worth quite a bit to the keen reader.
Exceptions:
- Errors in text repeated across multiple posts: I will only pay for the most recent errors rather than every single occurrence.
- Errors in chart texts: Since I can't fix the chart texts (since I don't keep the data that generated them) I can't pay for them. Still, they would be nice to know about!
Would you please explain to me why there is such a large discrepancy in the presented pool hashrates? Taking Polmine as an example, the blockchain data statistics show it as 0.3, 12.7, 47.0. Then in the pool reported data we have 6.1, but when visiting Polmine.pl, I see they report their own hashrate at around 45.6.
I could say your Luck calculations are more trustworthy than mine, since you base yours on the shares per round. Mine have always been simpler: hashrate/blocks, the lower the result the better. But mine pretty much depend on reported hashrates.
Do you see a possibility that any of these pools might be lying about their hashrates? Also, please point out to me if any of these questions I asked are based on false assumptions on my part!
Thank you for your attention!
The website hashrate is the pool hashrate averaged over a short period of time. The hashrates I report are calculated using either the number of solved blocks in the week (table 1) or the number of difficulty 1 equivalent shares submitted during the week (table 2). So the website hashrate will not necessarily be the same as the weekly average.
However, I have noticed some previous data problems for Polmine, and their hashrate per round changes in an odd fashion, so a problem could be there.
"Hashrate/blocks", while a possible indicator of luck, can't be used over time (since the network difficulty changes) and doesn't have a simple CDF I can use.
It's possible pools could lie about hashrates, but I think there's little benefit in doing so since it's easy to check (a harmonic average of pool reported hashrates). If there's a significant discrepancy, it is more likely to be an error in data reporting than a decision to cheat on reporting the hashrate.
Thanks for some interesting questions, Therese.
Thanks for your answer, organofcorti!
So that means the tables from now on will be more trustworthy than the previous weeks, right (the 2nd table)? Also, I never really paid attention to the "Mean shares per round / D" from the previous tables, that is why I came up with the hashrate/blocks calculations, which leave some pools unusually with better luck than others. I'll do a comparison using the "Mean shares per round / D" later, and see if the results change much.
The tables won't be more trustworthy - the previous table just didn't show the same amount of data and it confused some readers. Now the data is split into statistics from solved blocks and statistics from shares per block, with separate explanations (for example: http://organofcorti.blogspot.com/2014/02/february-2nd-2014-weekly-hashrate.html).
"Mean shares per round / D" or "submitted shares / expected shares" will be easier to follow over time. If you want to track luck for a particular pool or over several pools, that's the way to do it.
OK, thanks so much!
Black Arrow Prospero x-3 delay: http://my-dog-jetta.blogspot.com/2014/01/black-arrow-prospero-x-3-delay.html