
Replace values in array using mask and other array

I have a 1D "from"-array (call it "frm") containing values with an associated Boolean mask-array: "mask" (same shape as frm). Then I have a third "replace" array: "repl", also 1D but shorter in length than the other two.

With these, I would like to generate a new array ("to") which contains the frm values except where mask==True in which case it should take in-order the values from repl. (Note that the number of True elements in mask equals the length of repl).

I am looking for a "clever" numpy way of implementing this. I looked at methods like np.where, np.take, np.select and np.choose, but none seem to "fit the bill".

"Cutting to the code", here's what I have thus far. It works fine but doesn't seem "Numpythonic" (or even Pythonic, for that matter).

frm  = [1, 2, 3, 4, 5]
mask = [False, True, False, True, True]
repl = [200, 400, 500]
i = 0; to = []
for f,m in zip(frm,mask):
    if m:
        to.append(repl[i])
        i += 1
    else:
        to.append(f)
print(to)

Yields: [1, 200, 3, 400, 500]

(Background: I need this because I'm subclassing pandas' pd.DataFrame class and need a "setter" for the columns/index. As a pd.Index does not support assignment by slice or index, I first need to copy the index/column array, replace some of the elements in the copy based on the mask, and then have the setter assign the complete new value. Let me know if anyone knows a more elegant solution.)

Hans Bouwmeester asked Oct 29 '25 22:10


1 Answer

numpy solution:

It's pretty straightforward:

import numpy as np

# convert frm to a numpy array:
frm = np.array(frm)
# create a copy of frm so you don't modify the original array:
to = frm.copy()

# index to with the boolean mask and assign the replacement values in order:
to[mask] = repl

Then to returns:

>>> to
array([  1, 200,   3, 400, 500])
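If you want this as a reusable piece, the same masked assignment can be wrapped in a small function. This is only a sketch: the name replace_masked and the two sanity checks are my own additions, not anything numpy provides.

```python
import numpy as np

def replace_masked(frm, mask, repl):
    """Return a copy of frm with the True positions of mask filled, in order, from repl."""
    to = np.array(frm)                    # np.array copies its input by default
    mask = np.asarray(mask, dtype=bool)
    repl = np.asarray(repl)
    # Hypothetical sanity checks (assumed requirements taken from the question)
    if mask.shape != to.shape:
        raise ValueError("mask must have the same shape as frm")
    if repl.size != mask.sum():
        raise ValueError("repl needs one value per True element in mask")
    to[mask] = repl                       # boolean-mask assignment, in order
    return to

frm = [1, 2, 3, 4, 5]
mask = [False, True, False, True, True]
repl = [200, 400, 500]
to = replace_masked(frm, mask, repl)
print(to.tolist())  # [1, 200, 3, 400, 500]
```

Note that the result keeps the dtype of frm, so float replacement values assigned into an integer array would be truncated.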

pandas solution:

If your DataFrame looks like:

>>> df
   column
0       1
1       2
2       3
3       4
4       5

Then you can use loc:

df.loc[mask,'column'] = repl

Then your dataframe looks like:

>>> df
   column
0       1
1     200
2       3
3     400
4     500
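For the background case in the question (replacing some labels of a pd.Index, which is immutable), the same idea can be applied to a copy of the labels. A rough sketch, assuming string column labels; replace_labels is a made-up helper name, and I convert to an object array so that replacement labels of any type fit:

```python
import numpy as np
import pandas as pd

def replace_labels(index, mask, repl):
    """Hypothetical helper: build a new Index with the masked labels replaced in order."""
    # Copy the labels into a plain object array, since the Index itself is immutable
    labels = index.to_numpy(dtype=object, copy=True)
    labels[np.asarray(mask, dtype=bool)] = repl
    return pd.Index(labels, name=index.name)

df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=list("abcde"))
mask = [False, True, False, True, True]
# Assign the complete new Index, as a setter would have to do
df.columns = replace_labels(df.columns, mask, ["B", "D", "E"])
print(list(df.columns))  # ['a', 'B', 'c', 'D', 'E']
```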
sacuL answered Nov 01 '25 13:11


