I have a pandas dataframe, and I would like to create a new column based on an existing column and certain inequalities. For example, let
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7], 'b': [3, 6, 4, 2, 7, 7, 1]})
so df looks like
   a  b
0  1  3
1  2  6
2  3  4
3  4  2
4  5  7
5  6  7
6  7  1
I would like to add a new column, res, that equals 0 if the corresponding value in a is smaller than 2, 1 if the corresponding value in a is at least 2 and smaller than 4, and 2 otherwise. So I would like to get
   a  b  res
0  1  3    0
1  2  6    1
2  3  4    1
3  4  2    2
4  5  7    2
5  6  7    2
6  7  1    2
So far I have been doing this by using apply as follows:
def f(x):
    if x['a'] < 2:
        return 0
    elif x['a'] >= 2 and x['a'] < 4:
        return 1
    else:
        return 2

df['res'] = df.apply(f, axis=1)
but I was wondering whether there is a more direct way, or a specific pandas method, to do this.
You can use pd.cut:
import numpy as np
df['res'] = pd.cut(df.a, [-np.inf, 2, 4, np.inf], labels=[0, 1, 2], right=False)
Output:
   a  b  res
0  1  3    0
1  2  6    1
2  3  4    1
3  4  2    2
4  5  7    2
5  6  7    2
6  7  1    2
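One detail worth knowing about the pd.cut approach is that the new column comes back with a categorical dtype, not an integer one. The following is a minimal sketch, assuming you want plain integers for later arithmetic; the intermediate name binned is only illustrative, and the frame is the one from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7], 'b': [3, 6, 4, 2, 7, 7, 1]})

# right=False closes each bin on the left: [-inf, 2), [2, 4), [4, inf)
binned = pd.cut(df['a'], [-np.inf, 2, 4, np.inf], labels=[0, 1, 2], right=False)

# pd.cut returns a categorical Series; cast it if integer behaviour is needed
df['res'] = binned.astype(int)
```

If the labels are only ever used for grouping or display, the cast is probably unnecessary and the categorical column can be kept as it is.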
For just a few values, you can also use numpy.where as a vectorized solution:
import numpy as np
df['res'] = np.where(df.a < 2, 0, np.where((df.a >= 2) & (df.a < 4), 1, 2))
df
#    a  b  res
# 0  1  3    0
# 1  2  6    1
# 2  3  4    1
# 3  4  2    2
# 4  5  7    2
# 5  6  7    2
# 6  7  1    2
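Nested numpy.where calls get hard to read once there are more than two or three thresholds. As a possible alternative (not part of the answer above), numpy.select takes a list of conditions and a matching list of values, and uses the first condition that holds for each row. A minimal sketch, with conditions and choices as illustrative names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7], 'b': [3, 6, 4, 2, 7, 7, 1]})

# Conditions are checked in order and the first match wins, so the second
# test only needs the upper bound: rows with a < 2 are already taken.
conditions = [df['a'] < 2, df['a'] < 4]
choices = [0, 1]

# Rows matching no condition (a >= 4) receive the default value
df['res'] = np.select(conditions, choices, default=2)
```

Adding another threshold then only means appending one entry to each list.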