
Numpy mask input data with missing values

I'm loading data from a CSV file using loadtxt. All values are floats, except for missing data, which is coded as the character "?".

I'm trying to create a masked array so that I can use np.ma functions on the loaded data, with the missing data ignored when computing averages, etc. I've read the documentation for masked_array, and this is probably trivial, but I can't figure out how to mask the array so that the "?" entries are ignored by the np.ma mathematical functions.

engil asked Oct 14 '25 08:10


1 Answer

You can use np.genfromtxt() to read the file and have it mask the missing values directly: pass missing_values='?' together with usemask=True, and it returns a masked array. For example:

input:

11, 12, 13, ?, ?, 16
21, 22, ?, 24, ?, 26

code:

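import numpy as np

# usemask=True returns a masked array with every '?' entry masked
a = np.genfromtxt('test.txt', delimiter=',', missing_values='?', usemask=True)

a.sum(axis=1).data  # row sums over the unmasked values only
#array([ 52.,  93.])

a.mean()  # mean of the 8 unmasked values
#18.125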
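
If the data has already been read with the "?" entries converted to nan (for instance by calling genfromtxt without usemask), np.ma.masked_invalid should give an equivalent masked array. This is a minimal sketch rather than part of the original answer; it assumes an in-memory copy of the sample input in place of test.txt, and the variable names are only illustrative:

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for test.txt (same sample rows as above)
raw = io.StringIO("11, 12, 13, ?, ?, 16\n21, 22, ?, 24, ?, 26\n")

# Without usemask, the '?' entries are returned as nan in a plain float array
data = np.genfromtxt(raw, delimiter=',', missing_values='?')

# Mask every nan (and inf) entry so np.ma reductions skip them
masked = np.ma.masked_invalid(data)

row_means = masked.mean(axis=1)   # per-row mean over valid entries only
col_counts = masked.count(axis=0) # number of valid entries in each column
overall_mean = masked.mean()      # mean over all valid entries
```

A column with no valid entries at all (the fifth one here) comes back masked from reductions such as masked.mean(axis=0), so it is worth checking col_counts before relying on per-column statistics.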
Saullo G. P. Castro answered Oct 17 '25 00:10


