I want to convert a pandas DatetimeIndex to Excel dates (the number of days since 12/30/1899). I tried to use numpy.vectorize on a function that takes datetime64 values and returns an Excel date. I was surprised by how numpy.vectorize behaves: on the first call, a test call to determine the return type, vectorize passes in a datetime64 as provided. On subsequent calls, it passes in the internal storage type of the datetime64, in my case a long. Internally, _get_ufunc_and_otypes calls:
inputs = [asarray(_a).flat[0] for _a in args]
outputs = func(*inputs)
While _vectorize_call does the following:
inputs = [array(_a, copy=False, subok=True, dtype=object)
          for _a in args]
outputs = ufunc(*inputs)
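The difference between those two snippets can be reproduced outside of vectorize. The following is a minimal sketch, assuming a nanosecond-resolution datetime64 array like the one a pandas index holds; the variable names are made up for illustration, and `astype(object)` stands in for the `array(..., dtype=object)` call above.

```python
import numpy as np

arr = np.array(['2013-05-13', '2013-05-14'], dtype='datetime64[ns]')

# What the test call sees: .flat[0] returns a numpy.datetime64 scalar.
first = np.asarray(arr).flat[0]

# What the later calls see: casting a nanosecond array to object dtype
# yields plain Python integers (nanoseconds since the Unix epoch),
# because datetime.datetime cannot represent nanoseconds.
as_objects = arr.astype(object)

print(type(first))
print(type(as_objects[0]))
```

So the first element arrives as a datetime64, while the object-dtype copy hands the function bare integers.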
As it turns out, I could just as easily use numpy's built-in array arithmetic, (x - day0) / one_day. But this behavior still seems strange: the argument type changes when a function is vectorized.
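For reference, the array-arithmetic version could look roughly like this. It is only a sketch: `day_zero`, `one_day`, and `datetime64_to_excel` are illustrative names, and the input is assumed to be a datetime64 array.

```python
import numpy as np

day_zero = np.datetime64('1899-12-30')   # Excel's day 0
one_day = np.timedelta64(1, 'D')

def datetime64_to_excel(values):
    """Convert a datetime64 array to Excel serial dates (float days)."""
    # datetime64 - datetime64 gives timedelta64; dividing two
    # timedelta64 values gives a plain float.
    return (values - day_zero) / one_day

datetimes = np.array(['2013-05-13', '2013-05-14', '2013-05-20'],
                     dtype='datetime64[ns]')
excel_dates = datetime64_to_excel(datetimes)
print(excel_dates)
```

This operates on the whole array at once, so numpy.vectorize is not needed at all.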
Here's my sample code:
import numpy
DATETIME64_ONE_DAY = numpy.timedelta64(1,'D')
DATETIME64_DATE_ZERO = numpy.datetime64('1899-12-30T00:00:00.000000000')
def excelDateToDatetime64(x):
    return DATETIME64_DATE_ZERO + numpy.timedelta64(int(x),'D')
def datetime64ToExcelDate(x):
    print(type(x))
    return (x - DATETIME64_DATE_ZERO) / DATETIME64_ONE_DAY
excelDateToDatetime64_Array = numpy.vectorize(excelDateToDatetime64)
datetime64ToExcelDate_Array = numpy.vectorize(datetime64ToExcelDate)
excelDates = numpy.array([ 41407.0, 41408.0, 41409.0, 41410.0, 41411.0, 41414.0 ])
datetimes = excelDateToDatetime64_Array(excelDates)
excelDates2 = datetime64ToExcelDate(datetimes)
print(excelDates2)  # Works fine
# TypeError: ufunc subtract cannot use operands with types dtype('int64') and dtype('<M8[ns]')
# You can see from the print that the type coming in is inconsistent
excelDates2 = datetime64ToExcelDate_Array(datetimes)
Datetimes and timedeltas need to be handled using their underlying data, which are np.int64 values; you can get them with arr.view('i8').
Define your constants in terms of their underlying values:
In [94]: DATETIME_DATE_ZERO_VIEW = DATETIME64_DATE_ZERO.view('i8')
In [95]: DATETIME_DATE_ZERO_VIEW
Out[95]: -2209161600000000000
In [96]: DATETIME64_ONE_DAY_VALUE = DATETIME64_ONE_DAY.astype('m8[ns]').item()
In [97]: DATETIME64_ONE_DAY_VALUE
Out[97]: 86400000000000L
In [106]: def vect(x):
   .....:     return (x-DATETIME_DATE_ZERO_VIEW)/DATETIME64_ONE_DAY_VALUE
   .....:
In [107]: f = np.vectorize(vect)
Pass in a view of the underlying np.int64
In [109]: f(datetimes.view('i8'))
Out[109]: array([41407, 41408, 41409, 41410, 41411, 41414])
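The same integer-view idea also works on the whole array without np.vectorize. This is a sketch with illustrative names (`zero_ns`, `one_day_ns`), assuming nanosecond-resolution input. Note that the session above appears to be Python 2, where `/` on integers truncates; under Python 3 `/` returns floats, so `//` is used here to keep integer results.

```python
import numpy as np

datetimes = np.array(['2013-05-13', '2013-05-14'], dtype='datetime64[ns]')

# Constants expressed as raw int64 nanosecond counts.
zero_ns = np.datetime64('1899-12-30', 'ns').astype('i8')
one_day_ns = np.timedelta64(1, 'D').astype('m8[ns]').astype('i8')

# View the array as int64 nanoseconds and do plain integer math.
excel = (datetimes.view('i8') - zero_ns) // one_day_ns
print(excel)
```

Floor division discards any time-of-day component; if fractional days matter, dividing by a float version of `one_day_ns` would be one way to keep them.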
Pandas way
In [98]: Series(datetimes).apply(lambda x: (x.value-DATETIME_DATE_ZERO_VIEW)/DATETIME64_ONE_DAY_VALUE)
Out[98]:
0 41407
1 41408
2 41409
3 41410
4 41411
5 41414
dtype: int64