Apologies for the opaque question name (not sure how to word it). I have the following dataframe:
import pandas as pd
import numpy as np
data = [['tom', 1,1,6,4],
['tom', 1,2,2,3],
['tom', 1,2,3,1],
['tom', 2,3,2,7],
['jim', 1,4,3,6],
['jim', 2,6,5,3]]
df = pd.DataFrame(data, columns = ['Name', 'Day','A','B','C'])
df = df.groupby(by=['Name','Day']).agg('sum').reset_index()
df

I would like to add another column that returns text according to which of the columns A, B, C is the highest.
For example, I would like Apple if A is highest, Banana if B is highest, and Carrot if C is highest. So in the example above, the values for the 4 rows should be:
New Col
Carrot
Apple
Banana
Carrot
Any help would be much appreciated! Thanks
Use DataFrame.idxmax along axis=1 with Series.map:
dct = {'A': 'Apple', 'B': 'Banana', 'C': 'Carrot'}
df['New col'] = df[['A', 'B', 'C']].idxmax(axis=1).map(dct)
Result:
  Name  Day  A   B  C New col
0  jim    1  4   3  6  Carrot
1  jim    2  6   5  3   Apple
2  tom    1  5  11  8  Banana
3  tom    2  3   2  7  Carrot
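For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the grouped frame from the question and splits the one-liner into two steps. The names `labels` and `winner` are only illustrative. idxmax(axis=1) returns, for each row, the label of the column holding the row maximum, and map then translates that label into the text.

```python
import pandas as pd

# Data and grouping copied from the question.
data = [['tom', 1, 1, 6, 4],
        ['tom', 1, 2, 2, 3],
        ['tom', 1, 2, 3, 1],
        ['tom', 2, 3, 2, 7],
        ['jim', 1, 4, 3, 6],
        ['jim', 2, 6, 5, 3]]
df = pd.DataFrame(data, columns=['Name', 'Day', 'A', 'B', 'C'])
df = df.groupby(by=['Name', 'Day']).agg('sum').reset_index()

# Illustrative names: mapping from column label to the wanted text.
labels = {'A': 'Apple', 'B': 'Banana', 'C': 'Carrot'}

# Step 1: per row, the label of the column with the largest value.
winner = df[['A', 'B', 'C']].idxmax(axis=1)

# Step 2: translate the column label into the text.
df['New col'] = winner.map(labels)

print(df)
```

One thing to be aware of: when two columns tie for the maximum, idxmax returns the first one, so with the column order above a tie between A and B gives Apple.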
@ShubhamSharma's answer is better than this, but here is another option:
df['New col'] = np.where((df['A'] > df['B']) & (df['A'] > df['C']), 'Apple', 'Carrot')
df['New col'] = np.where((df['B'] > df['A']) & (df['B'] > df['C']), 'Banana', df['New col'])
Output:
  Name  Day  A   B  C New col
0  jim    1  4   3  6  Carrot
1  jim    2  6   5  3   Apple
2  tom    1  5  11  8  Banana
3  tom    2  3   2  7  Carrot
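The two chained np.where calls use strict comparisons, so a tie for the highest value involving A would fall through to 'Carrot'. If that matters, one possible variant (a sketch, not part of the original answer) is np.select with >= comparisons, where the first matching condition wins. The frame below is typed in by hand from the grouped values shown above, and `conditions` and `choices` are illustrative names.

```python
import numpy as np
import pandas as pd

# Grouped values typed in from the output above (assumption: same data).
df = pd.DataFrame({'Name': ['jim', 'jim', 'tom', 'tom'],
                   'Day': [1, 2, 1, 2],
                   'A': [4, 6, 5, 3],
                   'B': [3, 5, 11, 2],
                   'C': [6, 3, 8, 7]})

# Conditions are checked in order; the first one that is True decides.
conditions = [
    (df['A'] >= df['B']) & (df['A'] >= df['C']),  # A is (joint) highest
    (df['B'] >= df['C']),                         # otherwise B beats or ties C
]
choices = ['Apple', 'Banana']

# Rows matching neither condition get the default.
df['New col'] = np.select(conditions, choices, default='Carrot')

print(df)
```

With this ordering a tie resolves to the leftmost column, which is the same tie-breaking as idxmax.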