I have a pandas dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,40,size=(10,4)), columns=range(4), index = range(10))
df.head()
    0   1   2   3
0  27  10  13  21
1  25  12  23   8
2   2  24  24  34
3  10  11  11  10
4   0  15   0  27
I'm using the idxmax function to get, for each row, the label of the column that contains the maximum value.
df_max = df.idxmax(1)
df_max.head()
0 0
1 0
2 3
3 1
4 3
How can I use df_max along with df, to create a time-series of values corresponding to the maximum value in each row of df? This is the output I want:
0 27
1 25
2 34
3 11
4 27
5 37
6 35
7 32
8 20
9 38
I know I can achieve this using df.max(1), but I want to know how to arrive at this same output by using df_max, since I want to be able to apply df_max to other matrices (not df) which share the same columns and indices as df (but not the same values).
You can try df.lookup (available in pandas versions before 2.0):
df.lookup(df_max.index, df_max)
Out[628]: array([27, 25, 34, 11, 27], dtype=int64)
If you want a Series or DataFrame, pass the output to the corresponding constructor:
pd.Series(df.lookup(df_max.index, df_max), index=df_max.index)
Out[630]:
0 27
1 25
2 34
3 11
4 27
dtype: int64
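On recent pandas versions, where DataFrame.lookup is no longer available (it was deprecated in 1.2 and removed in 2.0), the same row-by-row selection can be sketched with NumPy integer indexing. This is only a minimal sketch, not the original answer's method: the helper name pick_by_label and the second frame `other` are hypothetical, and the code assumes df_max is in the same row order as the frame it is applied to. The sample values are the five rows shown in the question.

```python
import numpy as np
import pandas as pd

# Fixed sample data: the five rows shown in the question's df.head().
df = pd.DataFrame(
    [[27, 10, 13, 21],
     [25, 12, 23, 8],
     [2, 24, 24, 34],
     [10, 11, 11, 10],
     [0, 15, 0, 27]],
    columns=range(4),
)
df_max = df.idxmax(axis=1)  # column label of the row-wise maximum


def pick_by_label(frame, col_labels):
    """Hypothetical helper: for each row of `frame`, return the value in the
    column named by the matching entry of `col_labels`.

    Assumes `col_labels` has one entry per row, in the same order as `frame`.
    """
    # Translate column labels into integer column positions.
    col_pos = frame.columns.get_indexer(col_labels)
    # Pair each row position with its chosen column position.
    row_pos = np.arange(len(frame))
    values = frame.to_numpy()[row_pos, col_pos]
    return pd.Series(values, index=frame.index)


result = pick_by_label(df, df_max)

# Reusing df_max on another frame with the same columns and index
# (`other` is made up here purely for illustration).
other = df * 10
other_result = pick_by_label(other, df_max)
```

Because the helper only needs the column labels in df_max, it can be applied to any frame that shares df's columns and index, which is the reuse the question asks about.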