
Why can't I apply shift from within a pandas function?

I am trying to build a function that uses .shift() but it is giving me an error. Consider this:

In [40]:

import pandas as pd
from pandas import DataFrame

data={'level1':[20,19,20,21,25,29,30,31,30,29,31],
      'level2': [10,10,20,20,20,10,10,20,20,10,10]}
index= pd.date_range('12/1/2014', periods=11)
frame=DataFrame(data, index=index)
frame

Out[40]:
            level1 level2
2014-12-01  20  10
2014-12-02  19  10
2014-12-03  20  20
2014-12-04  21  20
2014-12-05  25  20
2014-12-06  29  10
2014-12-07  30  10
2014-12-08  31  20
2014-12-09  30  20
2014-12-10  29  10
2014-12-11  31  10

A normal function works fine. To demonstrate I calculate the same result twice, using the direct and function approach:

In [63]:
frame['horizontaladd1']=frame['level1']+frame['level2']#works

def horizontaladd(x):
    test=x['level1']+x['level2']
    return test
frame['horizontaladd2']=frame.apply(horizontaladd, axis=1)
frame
Out[63]:
            level1 level2 horizontaladd1 horizontaladd2
2014-12-01  20  10  30  30
2014-12-02  19  10  29  29
2014-12-03  20  20  40  40
2014-12-04  21  20  41  41
2014-12-05  25  20  45  45
2014-12-06  29  10  39  39
2014-12-07  30  10  40  40
2014-12-08  31  20  51  51
2014-12-09  30  20  50  50
2014-12-10  29  10  39  39
2014-12-11  31  10  41  41

But while calling shift directly works, using it inside a function passed to apply does not:

frame['verticaladd1']=frame['level1']+frame['level1'].shift(1)#works

def verticaladd(x):
    test=x['level1']+x['level1'].shift(1)
    return test
frame.apply(verticaladd)#error

results in

KeyError: ('level1', u'occurred at index level1')

I also tried applying to a single column which makes more sense in my mind, but no luck:

def verticaladd2(x):
    test=x-x.shift(1)
    return test
frame['level1'].map(verticaladd2)#error, also with apply

error:

AttributeError: 'numpy.int64' object has no attribute 'shift'

Why not just call shift directly? I need to embed it in a function so that I can calculate multiple columns at the same time, along axis 1. See the related question "Ambiguous truth value with boolean logic".

DISC-O asked Dec 01 '25 08:12



1 Answer

Try passing the frame to the function, rather than using apply (I am not sure why apply doesn't work, even column-wise):

def f(x):
    # x is the whole DataFrame, so x.level1 is a full column and shift works
    return x.level1 + x.level1.shift(1)

f(frame)

returns:

2014-12-01   NaN
2014-12-02    39
2014-12-03    39
2014-12-04    41
2014-12-05    46
2014-12-06    54
2014-12-07    59
2014-12-08    61
2014-12-09    61
2014-12-10    59
2014-12-11    60
Freq: D, Name: level1, dtype: float64
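
As for why apply fails, here is my best understanding, offered as a sketch rather than a definitive explanation. With the default axis=0, frame.apply passes each column to the function as a Series indexed by date, so x['level1'] looks for a row label called 'level1' and raises the KeyError. Series.map works element by element, so the function receives a single numpy.int64, which has no shift method. The names vertical_add, by_apply and by_frame below are mine, not from the question:

```python
import pandas as pd

data = {'level1': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31],
        'level2': [10, 10, 20, 20, 20, 10, 10, 20, 20, 10, 10]}
frame = pd.DataFrame(data, index=pd.date_range('12/1/2014', periods=11))

def vertical_add(col):
    # With axis=0 (the default), col is one whole column as a Series,
    # so shift is available; do not index it with a column name here.
    return col + col.shift(1)

# Column-wise apply handles every column in one call.
by_apply = frame.apply(vertical_add)

# The same result without apply: shift the whole frame and add.
by_frame = frame + frame.shift(1)

# Series.map hands the function one scalar at a time,
# which is why .shift is missing inside verticaladd2.
first_value = frame['level1'].iloc[0]
scalar_has_shift = hasattr(first_value, 'shift')
```

The level1 column of by_apply should match the output shown above, with NaN in the first row. If the columns need to interact (for example level1 plus the previous day's level2), something like frame['level1'] + frame['level2'].shift(1) stays vectorised and avoids apply along axis 1 altogether.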
JAB answered Dec 03 '25 22:12



