
Friday, 3 May 2013

12.2 Pool and network miner hashrate distributions



0. Introduction

Suggested reading: 
12.1 The pre-ASIC network hashrate distribution

This post has been a long time coming. It was originally titled "12.2  Predicting post-ASIC network hashrates", but this post is not that post. I apologise if I got your hopes up on that score, but you're probably not as disappointed as I am.

I had written a follow-up to post 12.1, "12.2  Predicting post-ASIC network hashrates", at the same time as I wrote 12.1, but I planned to sit on it for a week because I wasn't sure of some of my methods and assumptions and wanted to work on them further. I found some mistakes in my method, reworked the analysis and rewrote the post. Unfortunately this process was like pulling a thread: the more I reworked the post, the more I found to rework. After reanalysing and rewriting five times, I admit defeat.

I did learn some interesting things about truncated distributions and found a very good truncated Pareto II fit for the pre-ASIC user account hashrate distribution, but my idea of using this distribution to predict the post-ASIC network hashrate was flawed in several ways.

  1. How should the pre-ASIC user hashrate distribution be related to the post-ASIC user hashrate distribution? The same distribution, but ending at 262Thps (ASICMiner's hashrate aim)? Or a linear transformation of the pre-ASIC data to the new hashrate range? Or compare average or median hashrates to the network hashrate?
  2. Will ASICMiner surpass 262Thps and maintain 10% of the network, as some have suggested? In that case, modelling the network hashrate became a problem I couldn't solve.
  3. When one or two actors (ASICMiner and PicoStocks) control most of the network, is it really possible to make a predictive model?
  4. GPUs and FPGAs provide hashrate in many different increments, and are quite modular. ASICs are, at the moment, only available in large chunks of hashrate.
  5. Upgrading to an ASIC is quite different from buying a new GPU or another FPGA. It is not a matter of purchasing another modular section but of upgrading, in time, all GPU or FPGA mining tools to ASIC mining tools. The cost barrier may lead some to upgrade and others to mine alt coins.
  6. The exchange rate can't be modelled, but it may significantly affect the network hashrate through the effects described in points four and five.
Point 3 is quite an interesting one. Why do the user account hashrate distributions in 12.1 The pre-ASIC network hashrate distribution appear Pareto distributed? My guess is that, during the pre-ASIC era, the relationship between the price of a mining device and the hashes it can produce was fairly narrowly distributed. Therefore a user account's hashrate is related to the amount of money they could afford to invest. Since the Pareto distribution can describe the distribution of wealth in a society, I assume that here it represents the distribution of wealth for bitcoin miners.

If the network hashrate is determined in large part by only a couple of actors, then these actors alone have the ability to skew the data. When a few individuals can make arbitrary choices that then determine modelling parameters, a predictive model - which assumes that people en masse will act a certain way - is simply not viable.

Point 4 also bears further thought. The parameters of the distribution may also be affected by the technology "trickle down" from large hashrate actors. In order to assess possible user account hashrate distributions in the future, I needed to make some assumptions about the nature of the distribution. These assumptions may not be valid for the post-ASIC network, since the entry cost of the devices is currently high enough to deter the casual miner. We might expect a much more widely distributed, less "top heavy" distribution with fewer miners. Alternatively, once the entry cost is reduced (which seems likely, given the small 300 Mhps mining tools that ASICMiner have announced), the distribution may be much more "top heavy", with lots of miners but a few holding a significant portion of the network hashrate.


Point 6 adds a little to points four and five. Neglecting the probable effects of the MTGOX price on the network means ignoring a significant variable, since as we know the network hashrate correlates with previous hashrates and previous MTGOX exchange rates - and I can't estimate the path the network will take to get to the next plateau.

The problem with this is that, as already mentioned, I think that the relationship between GPU / FPGA prices and hashrates leads to the pre-ASIC network hashrate distribution. I have no idea what the price relationships will be - pricing is extremely fluid and the exchange rate volatile. Some devices are priced in terms of the amount of btc that can be earned in a period of time, other devices are auctioned for sums that only the terminally optimistic could imagine will be recouped in any reasonable amount of time.

The exchange rate, the effects of the future partial decoupling of ASIC prices from fiat currency, the ASIC seller's market that currently exists, future difficulty levels, electricity prices ... none of these are predictable.


There were a number of less interesting problems as well (less interesting to read about, anyway - related to deriving an unknown parameter of a truncated Pareto II distribution if all others are known, for example). In the end, the large number of assumptions I needed, a wild guess that had to be made, and my lack of confidence in the derivation method itself mean the project is shelved until I can learn a better way to predict the post-ASIC network hashrate. Or until the post-ASIC network hashrate is here, whichever comes first.
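For readers unfamiliar with the distribution mentioned above, here is a minimal sketch of an upper-truncated Pareto II (Lomax) CDF and its inverse. The parameterisation (location `mu`, scale `sigma`, shape `alpha`, truncation point `upper`) and all function names are assumptions made for illustration; they are not necessarily the form used for the fit in post 12.1.

```python
def pareto2_cdf(x, mu, sigma, alpha):
    """CDF of an untruncated Pareto II distribution (assumed parameterisation)."""
    if x <= mu:
        return 0.0
    return 1.0 - (1.0 + (x - mu) / sigma) ** (-alpha)


def truncated_pareto2_cdf(x, mu, sigma, alpha, upper):
    """CDF after truncating at `upper`: rescale so that F(upper) = 1."""
    if x >= upper:
        return 1.0
    return pareto2_cdf(x, mu, sigma, alpha) / pareto2_cdf(upper, mu, sigma, alpha)


def truncated_pareto2_quantile(p, mu, sigma, alpha, upper):
    """Inverse of the truncated CDF for 0 <= p <= 1."""
    # Probability mass of the untruncated distribution that lies below the quantile.
    mass = p * pareto2_cdf(upper, mu, sigma, alpha)
    return mu + sigma * ((1.0 - mass) ** (-1.0 / alpha) - 1.0)
```

With the other parameters fixed, a call such as `truncated_pareto2_quantile(0.5, mu, sigma, alpha, upper)` would give the median hashrate implied by a candidate truncation point, which is one way the distribution could be compared against an observed median.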

Instead, I thought it might be interesting to map over time the changes in the network hashrate and the user account hashrate distribution. I hope to make this another weekly post, and will be building on the data it contains.


1. Pool user account hashrate empirical cumulative distribution function

This set of plots illustrates the changes in various pools' eCDFs over time. I collected data in January, and have been collecting more since the start of April.

The eCDF gives, for each hashrate, the proportion of user accounts with a hashrate at or below that value. We assume it approximates the cumulative distribution function of some underlying probability distribution.
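As a concrete illustration, an eCDF can be computed from a list of user account hashrates in a few lines. This is a minimal sketch, not the code used to produce the plots; the function name `ecdf` and the sample values are made up for the example.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    xs = sorted(samples)
    n = len(xs)

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(xs, x) / n

    return F


# Hypothetical user account hashrates in Ghps.
F = ecdf([0.3, 0.3, 1.2, 65.0])
print(F(1.0))  # fraction of accounts at or below 1 Ghps
```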

Since the pre-ASIC date of 27th January 2013, BTCGuild, Ozcoin, p2Pool and Polmine have a more rapidly increasing eCDF, indicating fewer low hashrate miners. Eligius, on the other hand, shows a clear increase in the number of low hashrate miners, and the remaining pools have not experienced a significant change in the proportions of lower and higher hashrate user accounts.


2. The density of user accounts at various hashrates, and the density of user account hashrates, by pool.

The next set of charts describe:

  1. User account density: the number of user accounts contributing a particular hashrate to a pool. Scale is proportional to the number of user accounts.
  2. User hashrate density: the amount of hashrate contributed to the pool by user accounts at a particular hashrate. Scale is proportional to the total pool hashrate.
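The difference between the two densities can be sketched with a simple histogram over logarithmic hashrate bins: the first counts accounts per bin, the second sums hashrate per bin. Note that this swaps in a histogram for the kernel density used in the violin plots, and that the function name, the bin width and the sample data are all assumptions for illustration.

```python
import math
from collections import defaultdict


def log_binned_shares(hashrates_ghps, bins_per_decade=4):
    """Map each log10 bin index to (share of accounts, share of hashrate)."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for h in hashrates_ghps:
        if h <= 0:
            continue  # zero-hashrate accounts cannot be placed on a log scale
        b = math.floor(math.log10(h) * bins_per_decade)
        counts[b] += 1   # user account density: one count per account
        totals[b] += h   # user hashrate density: weighted by hashrate
    n = sum(counts.values())
    total = sum(totals.values())
    return {b: (counts[b] / n, totals[b] / total) for b in sorted(counts)}


# Three small miners and one large one (hypothetical values, Ghps).
shares = log_binned_shares([2.0, 2.0, 2.0, 500.0], bins_per_decade=1)
```

In this made-up example the small miners make up three quarters of the accounts but only a little over 1% of the hashrate, which is the kind of contrast the two violin plots are meant to show.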

In these charts the scale is not comparable between pools. So, for example, the user account density of the Bitclockers violin should not be read to indicate that Bitclockers had the same number of users as BTCGuild on 2013-01-27, nor should the user hashrate density plot be read to indicate that Bitclockers had a greater hashrate than BTCGuild on the same date. However, the difference between the scale of the pre-ASIC BTCGuild violin and the April violins does indicate a significant change in hashrate.

I had to break the plots up a little - Blogger doesn't seem to like excessively large images.

There are a few noteworthy observations:

  1. User account density. You'll notice most pools have a little "bubble", or even just a slight thickening, just below 100Ghps. This equates to 60 - 70 Ghps, and I assume it indicates single Avalon ASICs mining at the pool. You'll also see that the number of users at each pool hasn't changed significantly since January 27th.
  2. User hashrate density. It's quite plain that nearly all pools now have the majority of their hashrate provided by users with high hashrates. Although there's still a bulge around 1Ghps, as previously, the greatest portion is provided by much higher hashrates. The regions of increased hashrate density, or "hips", and decreased hashrate density, or "necks", show small numbers of miners with massive hashrates. On the BTCGuild violins, for example, the effect of ASICMiner's 7 to 8 Thps is plainly visible as the topmost hip. The next lower hip is at about 1 Thps, and then there is a much longer hip from just below to just above 100 Ghps - probably those with one or two 65 Ghps Avalon mining tools. In fact on the 21st and 28th April, a small neck forms, partially separating the two groups. Most other pools have the same tendency apart from Polmine, which still has most of its hashrate in the 2 to 10 Ghps range, and only has a comparatively small increase in total pool hashrate.













3. Combined user account hashrate empirical cumulative distribution function


These plots combine data from all pools except 50BTC.com and Bitclockers - these had such unusual hashrate distributions compared to other pools that I had to classify them as outliers or recording errors. The interesting thing about these eCDFs is how uninteresting they are - although changes in the pool eCDFs are clear, changes to the combined eCDF (and by extension, the network) are less obvious. If I plotted the eCDFs over each other the differences would be more apparent, but even then they overlay each other fairly closely. The only clear difference is how far the upper tail extends, as ASICMiner increases, decreases and increases its hashrate again.

The sharper eyed amongst you will also notice a little upward discontinuity just before 100 Ghps in the last three plots. This is not a plotting artifact, but actually represents a small but sudden increase in the number of miners at that hashrate. The reason? Most likely 65 Ghps Avalon ASIC mining tools.



4. The density of user accounts at various hashrates, and the density of user account hashrates, all pools combined.

Note that the scale of the following plots is unrelated to hashrate or number of miners, so scale comparisons cannot be made. This is done on purpose as I hope to add more pools over time, and adding a pool would make the combined hashrate or combined number of users increase. Thus scale is not comparable across time, so I've removed it altogether - however the shape of the densities can be compared.
  1. User account density.  Most miners still have sub 1Ghps mining tools, and the little Avalon bubble is quite clear. The upper extension of the violin mimics that of BTCGuild and of course is due to ASICMiner. Their drop in hashrate during 21st April is quite clear. The lower extension of the violin becomes much thinner, although this is mostly due to Slush's pool not being included in the combined data, as I can't update it.
  2. User hashrate density. An advantage of using an invariant scale here is that the shapes of the densities can be more easily compared. Pre-ASIC, hashrates were continuously distributed from small to large, leading to a very smooth and rounded density plot. Post-ASIC, the distribution becomes quite lumpy - since Avalon ASICs were the only ones available up until this point, large amounts of hashrate exist in the "hips", at integer multiples of 65 Ghps or thereabouts.
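One rough way to put a number on the "lumpiness" described above is to measure how much of the hashrate comes from accounts sitting close to a whole multiple of 65 Ghps. The sketch below is an illustration only: the function names and the 10% tolerance are assumptions, not part of the analysis behind the plots.

```python
def is_near_unit_multiple(hashrate_ghps, unit_ghps=65.0, tolerance=0.1):
    """True if the hashrate is within tolerance * unit of 1, 2, 3, ... units."""
    k = round(hashrate_ghps / unit_ghps)
    return k >= 1 and abs(hashrate_ghps - k * unit_ghps) <= tolerance * unit_ghps


def lumpy_hashrate_share(hashrates_ghps, unit_ghps=65.0, tolerance=0.1):
    """Fraction of total hashrate held by accounts near a multiple of the unit."""
    total = sum(hashrates_ghps)
    lumpy = sum(h for h in hashrates_ghps
                if is_near_unit_multiple(h, unit_ghps, tolerance))
    return lumpy / total if total > 0 else 0.0
```

At the default tolerance the windows around each multiple cover about 20% of all hashrates above the first unit, so some matches will be coincidental; the measure is more useful for comparing dates than as an absolute count of Avalon devices.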




5. Conclusions.


  1. Most miners still have sub 1Ghps mining tools.
  2. A visibly large number of user accounts are at ~65Ghps, likely as a result of using an Avalon ASIC mining tool.
  3. Pre-ASIC, hashrates were continuously distributed from small to large, leading to a very smooth and rounded density plot. Post-ASIC, the distribution becomes quite lumpy - since Avalon ASICs were the only ones available up until this point, large amounts of hashrate exist in the "hips", at integer multiples of 65 Ghps or thereabouts.
  4. The distribution of user accounts and user account hashrates is very much in flux, and given the combined distribution's "lumpiness" it's unlikely that a Pareto II distribution will model the current user account hashrate distribution very well.






6. To do

  1. Add an automated fitting (and test of fit) of a truncated Pareto II distribution to the combined user hashrate at each date.
  2. Add a table of data - total, median and mean hashrates per pool and combined, user accounts per pool, and estimated user accounts across the network.
  3. If you can think of anything else interesting to include, please post in the comments and I'll do my best to add your suggestion.
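For the test of fit in the first item, one candidate is the Kolmogorov-Smirnov statistic: the largest vertical gap between the eCDF of the data and the fitted CDF. The choice of this statistic is an assumption made here for illustration (the post does not say which test will be used), and `ks_statistic` is a hypothetical helper name.

```python
def ks_statistic(samples, model_cdf):
    """Largest absolute difference between the eCDF of samples and model_cdf."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = model_cdf(x)
        # The eCDF jumps from (i - 1) / n to i / n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

Here `model_cdf` would be any one-argument function returning the fitted cumulative probability for a hashrate, for example a truncated Pareto II CDF with its parameters already fixed. A smaller value indicates a closer fit, and tracking it at each date would show how the fit degrades as the distribution becomes lumpier.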





organofcorti.blogspot.com is a reader supported blog

BTC:  12QxPHEuxDrs7mCyGSx1iVSozTwtquDB3r
LTC:  LPXnETNoCBr16GduvyWRzFP83rZNeEgMuB

