Formatting "Kilo", "Mega", "Gig" data in numpy record array

Question

I am trying to plot data in this CSV format: timestamp, value. The values are not plain numbers, though, but abbreviations of large values (k = 1000, M = 1000000, etc.).

2012-02-24 09:07:01, 8.1M
2012-02-24 09:07:02, 64.8M
2012-02-24 09:07:03, 84.8M
2012-02-24 09:07:04, 84.8M
2012-02-24 09:07:05, 84.8M
2012-02-24 09:07:07, 84.8M
2012-02-24 09:07:08, 84.8M
2012-02-24 09:07:09, 84.8M
2012-02-24 09:07:10, 84.8M

I usually store the CSV in a numpy record array using matplotlib.mlab.csv2rec(infile), but that works only if the values are not in abbreviated form. Is there an easy way to do this without my program reading each value and looking for 'M' to convert 84.8M to 84800000?

Niklas B. · Accepted Answer

Another possibility is the following conversion function:

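# Map each suffix to its power of ten: k = 10**3, M = 10**6, G = 10**9, T = 10**12.
conv = dict(zip('kMGT', (3, 6, 9, 12)))

def parse_number(value):
    # Rewrite a suffixed value such as '8.1M' as '8.1e6' so float() can parse it.
    if value[-1] in conv:
        value = '{}e{}'.format(value[:-1], conv[value[-1]])
    return float(value)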

Example:

>>> parse_number('1337')
1337.0
>>> parse_number('8.1k')
8100.0
>>> parse_number('8.1M')
8100000.0
>>> parse_number('64.367G')
64367000000.0
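
The function above raises an IndexError on an empty string and a ValueError when whitespace follows the suffix (for example '8.1M '), because the last character is then not a suffix. If your file might contain such fields, a slightly stricter variant could look like the sketch below. The name parse_number_strict and the decision to raise ValueError on empty input are my own assumptions, not part of the answer above.

```python
# Hypothetical stricter variant of parse_number; the name and the
# error-handling choices are assumptions made for illustration.
EXPONENTS = {'k': 3, 'M': 6, 'G': 9, 'T': 12}

def parse_number_strict(value):
    # Drop surrounding whitespace so '8.1M ' and ' 8.1M' behave the same.
    text = value.strip()
    if not text:
        raise ValueError('empty value')
    suffix = text[-1]
    if suffix in EXPONENTS:
        # Rewrite '8.1M' as '8.1e6' and let float() do the scaling.
        text = '{}e{}'.format(text[:-1], EXPONENTS[suffix])
    return float(text)
```

Keeping the string-to-exponent rewrite, rather than multiplying by a factor, leaves the rounding to float(), just as in the original function.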

bmu · Answer

You could use the function by Niklas B. in the convertd argument of csv2rec:

>>> from matplotlib import mlab
>>> data = mlab.csv2rec(infile, names=['datetime', 'values'],
...                     convertd={'values': parse_number})
>>> data
rec.array([(datetime.datetime(2012, 2, 24, 9, 7, 1), 8100000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 2), 64800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 3), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 4), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 5), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 7), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 8), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 9), 84800000.0),
   (datetime.datetime(2012, 2, 24, 9, 7, 10), 84800000.0)], 
  dtype=[('datetime', '|O8'), ('values', '<f8')])
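
As far as I know, csv2rec is no longer available in recent matplotlib releases, so the call above may fail on a current install. In that case the same conversion can be done with the standard library alone. The following is only a sketch under that assumption: read_series is a hypothetical helper name, the timestamp format is inferred from the sample rows in the question, and the sample data is fed in through io.StringIO instead of a real file.

```python
import csv
import io
from datetime import datetime

CONV = dict(zip('kMGT', (3, 6, 9, 12)))

def parse_number(value):
    # Same idea as the accepted answer, with whitespace stripped first.
    value = value.strip()
    if value and value[-1] in CONV:
        value = '{}e{}'.format(value[:-1], CONV[value[-1]])
    return float(value)

def read_series(lines):
    """Return (timestamps, values) parsed from 'timestamp, value' rows."""
    timestamps, values = [], []
    # skipinitialspace drops the blank that follows each comma.
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # skip empty lines
        timestamps.append(datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S'))
        values.append(parse_number(row[1]))
    return timestamps, values

# Two rows from the question, used here in place of a file on disk.
sample = io.StringIO(
    '2012-02-24 09:07:01, 8.1M\n'
    '2012-02-24 09:07:02, 64.8M\n'
)
timestamps, values = read_series(sample)
print(values)
```

With a real file you would pass the open file object to read_series instead of the StringIO; the two resulting lists can then be handed to a plotting call or converted to a numpy array.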

Tags: python, matplotlib, numpy

Ritesh
