I have the following pandas dataframe:
| key1 | key2 | col_name | bool | col_1 | col_2 | col_3 |
|---|---|---|---|---|---|---|
| a1 | a2 | col_1 | 0 | 5 | 10 | 20 |
| b1 | b2 | col_3 | 1 | 10 | 10 | 5 |
| c1 | c2 | col_1 | 1 | 5 | 15 | 5 |
Where bool==1, I would like to update the value in the column given by the col_name column to be 100.
Expected output:
| key1 | key2 | col_name | bool | col_1 | col_2 | col_3 |
|---|---|---|---|---|---|---|
| a1 | a2 | col_1 | 0 | 5 | 10 | 20 |
| b1 | b2 | col_3 | 1 | 10 | 10 | 100 |
| c1 | c2 | col_1 | 1 | 100 | 15 | 5 |
I can do this by iterating through the table, but from what I've read this is never best practice. What would be the most efficient way of doing this?
Build a boolean mask with numpy and update:
# identify cells for which the col_name matches the column name
# only keep those that have a bool of 1 in the row
m = ((df['col_name'].to_numpy()[:, None] == df.columns.to_numpy())
& df['bool'].eq(1).to_numpy()[:, None]
)
df[m] = 100
Output:
key1 key2 col_name bool col_1 col_2 col_3
0 a1 a2 col_1 0 5 10 20
1 b1 b2 col_3 1 10 10 100
2 c1 c2 col_1 1 100 15 5
Intermediate m:
array([[False, False, False, False, False, False, False],
[False, False, False, False, False, False, True],
[False, False, False, False, True, False, False]])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With