I have four hexbin plots which have all been normalized. How do I add them together to make one big distribution?
I have tried concatenating the input vectors and then creating the hexbin plot, but this throws off the normalization of the individual distributions:
So how do I add the individual hexbin distributions whilst still maintainging the induvidual normalization?
The relevant part of my code is:
def hex_plot(x,y,max_v):
bounds = [0,max_v*m.exp(-(3**2)/2),max_v*m.exp(-2),max_v*m.exp(-0.5),max_v] # The sigma bounds
norm = mpl.colors.BoundaryNorm(bounds, ncolors=4)
hex_ = plt.hexbin(x, y, C=None, gridsize=gridsize,reduce_C_function=np.mean,cmap=cmap,mincnt=1,norm=norm)
print "Hex plot max: ",hex_.norm.vmax
return hex_
gridsize=50
cmap = mpl.colors.ListedColormap(['grey','#6A92D4','#1049A9','#052C6E'])
hex_plot(x_tot,y_tot,34840)
Thank you.
I've written a bit of code that does what you're after. From the snippet in your question, it looks like you already know the height (max_v) of your distribution given your binning scheme, so I worked under that assumption. Depending on the data you're applying this to, this might not actually be the case, in which case the following will fail (it's only as good as your guess/knowledge of the height of the distributions). For the purposes of my example data, I've just taken a reasonable guess (based on a quick plot) for the values of max_v1 and max_v2. Switching the c1 and c2 I've defined for the commented versions should reproduce your original problem.
import scipy
import matplotlib.pyplot as pyplot
import matplotlib.colors
import math
#need to know the height of the distributions a priori
max_v1 = 850 #approximate height of distribution 1 (defined below) with binning defined below
max_v2 = 400 #approximate height of distribution 2 (defined below) with binning defined below
max_v = max(max_v1,max_v2)
#make 2 differently sized datasets (so will require different normalizations)
#all normal distributions with assorted means/variances
x1 = scipy.randn(50000)/6.0+0.5
y1 = scipy.randn(50000)/3.0+0.5
x2 = scipy.randn(100000)/2.0-0.5
y2 = scipy.randn(100000)/2.0-0.5
#c1 = scipy.ones(len(x1)) #I don't assign meaningful weights here
#c2 = scipy.ones(len(x2)) #I don't assign meaningful weights here
c1 = scipy.ones(len(x1))*(max_v/max_v1) #highest distribution: no net change in normalization here
c2 = scipy.ones(len(x2))*(max_v/max_v2) #renormalized to same height as highest distribution
#define plot boundaries
xmin=-2.0
xmax=2.0
ymin=-2.0
ymax=2.0
#custom colormap
cmap = matplotlib.colors.ListedColormap(['grey','#6A92D4','#1049A9','#052C6E'])
#the bounds of 1sigma, 2sigma, etc. regions
bounds = [0,max_v*math.exp(-(3**2)/2),max_v*math.exp(-2),max_v*math.exp(-0.5),max_v]
norm = matplotlib.colors.BoundaryNorm(bounds, ncolors=4)
#make the hexbin plot
normalized = pyplot
hexplot = normalized.subplot(111)
normalized.hexbin(scipy.concatenate((x1,x2)), scipy.concatenate((y1,y2)), C=scipy.concatenate((c1,c2)), cmap=cmap, mincnt=1, extent=(xmin,xmax,ymin,ymax),gridsize=50, reduce_C_function=scipy.sum, norm=norm) #combine distributions and weights
hexplot.axis([xmin,xmax,ymin,ymax])
cax = pyplot.axes([0.86, 0.1, 0.03, 0.85])
clims = cax.axis()
cb = normalized.colorbar(cax=cax)
cax.set_yticklabels([' ','3','2','1',' '])
normalized.subplots_adjust(wspace=0, hspace=0, bottom=0.1, right=0.78, top=0.95, left=0.12)
normalized.show()
Here's the result without the fix (commented c1 and c2 used),

and the result with the fix (code as-is);

Hope that helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With