I am trying to use numpy.ma.corrcoef to calculate correlations in the presence of missing data.
According to the documentation:
"Except for the handling of missing data this function does the same as numpy.corrcoef. For more details and examples, see numpy.corrcoef."
Here is a bivariate dataset, for which only the first and second points have data for both variables.
array([[ 0.00494576, -0.01331578],
[-0.00146498, -0.01349548],
[ 0.00430321, nan],
[-0.00937105, nan],
[ nan, -0.01356873],
[ nan, -0.01375538],
[ nan, -0.00277393],
[ nan, 0.0082988 ],
[ nan, 0. ],
[ nan, 0.00275103],
[ nan, 0.00547947],
[ nan, -0.01375538],
[ nan, 0.0110194 ],
[ nan, -0.00549452],
[ nan, 0.01910017],
[ nan, -0.02462505],
[ nan, -0.01676017],
[ nan, 0.0112046 ],
[ nan, 0.01108045],
[ nan, 0.01639381],
[ nan, 0.01078178],
[ nan, -0.01078178]])
When I cast this as a masked array (np.ma.masked_array(t, np.isnan(t)), where t is the array above) and run np.ma.corrcoef on it with rowvar=False, the correlation between the two variables is reported as -86.52 (the raw coefficient, not a percentage!). Running np.corrcoef on the first two points alone, by contrast, gives a correlation of 1 (again the raw coefficient). According to the documentation, that second value is what I would expect from the first operation as well.
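For anyone who wants to reproduce the steps described above, here is a minimal sketch. It rebuilds the array from the values listed in the question (the variable names x, y, mt and the two result names are only illustrative) and makes the two calls being compared. The value printed by the masked call depends on the NumPy version, so it will not necessarily match the -86.52 reported for 1.6.1.

```python
import numpy as np

# First column: four observations, then 18 missing values.
x = [0.00494576, -0.00146498, 0.00430321, -0.00937105] + [np.nan] * 18

# Second column: two observations, two missing values, then 18 observations.
y = [-0.01331578, -0.01349548, np.nan, np.nan,
     -0.01356873, -0.01375538, -0.00277393, 0.0082988, 0.0,
     0.00275103, 0.00547947, -0.01375538, 0.0110194, -0.00549452,
     0.01910017, -0.02462505, -0.01676017, 0.0112046, 0.01108045,
     0.01639381, 0.01078178, -0.01078178]

t = np.column_stack([x, y])               # shape (22, 2), columns are variables
mt = np.ma.masked_array(t, np.isnan(t))   # mask every NaN entry

# Single-array call from the question (result is version-dependent).
masked_result = np.ma.corrcoef(mt, rowvar=False)

# Plain corrcoef restricted to the two rows where both variables are present.
two_point_result = np.corrcoef(t[:2], rowvar=False)

print(masked_result[0, 1])
print(two_point_result[0, 1])
```

With only two complete observations, the second value is necessarily +1 or -1 (up to rounding), which is why it serves as the reference here.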
My Python version information is below (Enthought 64-bit PyLab on Mac OS X.6.8), and I am using NumPy version 1.6.1.
Python 2.7.3 |EPD 7.3-1 (64-bit)| (default, Apr 12 2012, 11:14:05) Type "copyright", "credits" or "license" for more information.
Please advise on what I am missing here! Thanks in advance.
I think this is probably a bug in numpy.ma.corrcoef, or more precisely in np.ma.extras._covhelper, which does not seem to propagate the mask correctly from one column to the other when it is given a single array as input (though I may have been looking in the wrong place).
As a simple workaround until it is fixed, use np.ma.corrcoef(b[:,0], b[:,1]) (b being the masked array), which gives the expected result, and please file a bug report.
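A minimal sketch of the workaround, using only the first six rows of the data from the question to keep it short. The names complete, r_workaround and r_explicit are illustrative. Alongside the two-argument call it shows a version-independent alternative, which is an addition and not part of the original suggestion: select the rows where both variables are present and pass them to plain np.corrcoef.

```python
import numpy as np

# First six rows of the dataset from the question (a subset, for brevity).
t = np.array([[ 0.00494576, -0.01331578],
              [-0.00146498, -0.01349548],
              [ 0.00430321,  np.nan],
              [-0.00937105,  np.nan],
              [ np.nan,     -0.01356873],
              [ np.nan,     -0.01375538]])

b = np.ma.masked_array(t, np.isnan(t))

# Workaround: pass the two columns as separate arguments so that a
# common mask is applied to both before the correlation is computed.
r_workaround = np.ma.corrcoef(b[:, 0], b[:, 1])[0, 1]

# Alternative (an assumption about what "expected" means here): keep only
# rows with no missing value and use the ordinary corrcoef.
complete = ~np.isnan(t).any(axis=1)
r_explicit = np.corrcoef(t[complete], rowvar=False)[0, 1]

print(r_workaround, r_explicit)
```

Both values should agree, since each ends up using only the two rows where neither variable is missing.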