Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Avoiding iteration in pandas when I want to update the value in a column x when a condition is true where x is given by another column

Tags:

python

pandas

I have the following pandas dataframe:

key1 key2 col_name bool col_1 col_2 col_3
a1 a2 col_1 0 5 10 20
b1 b2 col_3 1 10 10 5
c1 c2 col_1 1 5 15 5

Where bool==1, I would like to update the value in the column given by the col_name column to be 100.

Expected output:

key1 key2 col_name bool col_1 col_2 col_3
a1 a2 col_1 0 5 10 20
b1 b2 col_3 1 10 10 100
c1 c2 col_1 1 100 15 5

I can do this by iterating through the table, but from what I've read this is never best practice. What would be the most efficient way of doing this?

like image 242
user2973447 Avatar asked Oct 30 '25 10:10

user2973447


1 Answers

Build a boolean mask with numpy and update:

# identify cells for which the col_name matches the column name
# only keep those that have a bool of 1 in the row
m = ((df['col_name'].to_numpy()[:, None] == df.columns.to_numpy())
     & df['bool'].eq(1).to_numpy()[:, None]
    )

df[m] = 100

Output:

  key1 key2 col_name  bool  col_1  col_2  col_3
0   a1   a2    col_1     0      5     10     20
1   b1   b2    col_3     1     10     10    100
2   c1   c2    col_1     1    100     15      5

Intermediate m:

array([[False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True],
       [False, False, False, False,  True, False, False]])
like image 55
mozway Avatar answered Nov 01 '25 01:11

mozway