I have a dict with item\column name and a df with columns from dict and other columns. How can I add column to df with min value for every item just from columns corresponding from dict?
import pandas as pd
my_dict={'Item1':['Col1','Col3'],
'Item2':['Col2','Col4']
}
df=pd.DataFrame({
'Col0':['Item1','Item2'],
'Col1':[20,25],
'Col2':[89,15],
'Col3':[26,30],
'Col4':[40,108],
'Col5':[55,2]
})
df['min']=?
I tried
df['min']=df[df.columns[df.columns.isin(my_dict)]].min(axis=1),
but it didn't work.
You can use apply with a function that reads the appropriate column names out of the dictionary (returning an empty list if there is no match) and then takes the minimum of the specified columns:
my_dict = {
'Item1': ['Col1', 'Col3'],
'Item2': ['Col2', 'Col4']
}
df['min'] = df.apply(lambda r:r[my_dict.get(r['Col0'], [])].min(), axis=1)
Output:
Col0 Col1 Col2 Col3 Col4 Col5 min
0 Item1 20 89 26 40 55 20
1 Item2 25 15 30 108 2 15
If it's possible my_dict may contain column names that don't exist in the dataframe, you can check for that in the function. For example:
my_dict = {
'Item1': ['Col1', 'Col3'],
'Item2': ['Col4', 'Col6']
}
df['min'] = df.apply(
lambda r:r[[col for col in my_dict.get(r['Col0'], []) if col in r]].min(),
axis=1
)
Output:
Col0 Col1 Col2 Col3 Col4 Col5 min
0 Item1 20 89 26 40 55 20
1 Item2 25 15 30 108 2 108
You can even get the column names if you want:
my_dict = {
'Item1': ['Col1', 'Col3'],
'Item2': ['Col2', 'Col4']
}
df[['min', 'name']] = df.apply(
lambda r:min((r[col], col) for col in my_dict.get(r['Col0'], []) if col in r),
axis=1, result_type='expand'
)
Output:
Col0 Col1 Col2 Col3 Col4 Col5 min name
0 Item1 20 89 26 40 55 20 Col1
1 Item2 25 15 30 108 2 15 Col2
You can use the dictionary to compute an existance table of Item and Col. Then use this table to mask the original dataframe.
my_dict = {
'Item1':['Col1','Col3'],
'Item2':['Col6','Col4']
}
m = pd.crosstab(*(pd.DataFrame(my_dict)
.stack().reset_index(level=1)
.T.to_numpy()))
df['min'] = df.set_index('Col0').where(m.eq(1), float('inf')).min(axis=1).to_list()
print(m)
col_0 Col1 Col3 Col4 Col6
row_0
Item1 1 1 0 0
Item2 0 0 1 1
print(df)
Col0 Col1 Col2 Col3 Col4 Col5 min
0 Item1 20 89 26 40 55 20.0
1 Item2 25 15 30 108 2 108.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With