
Python, pandas: replacing values in one DF by same-index values from another DF

I have two data frames with exactly the same index:

The first one:
           0         1         2
2   0.011765  0.490196  0.015686
2   0.011765  0.490196  0.015686
2   0.007843  0.494118  0.007843
2   0.007843  0.494118  0.007843
2   0.007843  0.501961  0.011765
..       ...       ...       ...

0   0.000000  0.031373  0.039216
0   0.031373  0.082353  0.105882
0   0.094118  0.149020  0.192157
0   0.094118  0.156863  0.215686

[337962 rows x 3 columns]

And the second one:

          0         1         2
0  0.055852  0.118138  0.052386
1  0.453661  0.665857  0.441551
2  0.096394  0.635641  0.068524
3  0.952545  0.827438  0.047632
4  0.787729  0.823494  0.795792
5  0.050284  0.549379  0.592593
6  0.608805  0.215458  0.068293
7  0.775640  0.091352  0.689224

The first DataFrame is very large. I need to replace the values in the large DataFrame with the values from the small DataFrame that have the same index label, and to do it as quickly as possible. How can I do this? Thanks for any help.

asked Oct 18 '25 09:10 by Stanislav Pogodin


2 Answers

Use the index of the second dataframe to slice the first one and then assign.

df1.loc[df2.index] = df2
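
The large frame in the question repeats its index labels, and the small one contains labels the large one lacks, so a label-based assignment may need some adjusting. Below is a minimal sketch of one way to handle that case, assuming the index of df2 is unique. The sample values and the unmatched label 9 are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
df1 = pd.DataFrame(
    [[0.01, 0.49, 0.02],
     [0.01, 0.49, 0.01],
     [0.00, 0.03, 0.04],
     [0.03, 0.08, 0.11],
     [0.50, 0.50, 0.50]],
    index=[2, 2, 0, 0, 9],   # repeated labels; 9 has no match in df2
)
df2 = pd.DataFrame(
    [[0.055852, 0.118138, 0.052386],
     [0.453661, 0.665857, 0.441551],
     [0.096394, 0.635641, 0.068524]],
    index=[0, 1, 2],         # assumed to be unique
)

# Boolean mask of the df1 rows whose label also exists in df2.
mask = df1.index.isin(df2.index)

# Look up the df2 row for each matching label, in df1's row order,
# and write the raw values back by position.
df1.loc[mask] = df2.reindex(df1.index[mask]).to_numpy()

print(df1)
```

Rows whose label is missing from df2 (label 9 here) keep their original values. Converting to a NumPy array before assigning sidesteps index alignment on the repeated labels.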
answered Oct 20 '25 09:10 by Stop harming Monica


You can merge a column-less copy of df1 (df1[[]], which keeps only the index) with df2 on their indexes:

print(pd.merge(df1[[]], df2, left_index=True, right_index=True))
          0         1         2
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524

Or join:

print(df1[[]].join(df2))
          0         1         2
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524

If you need to preserve the index ordering of df1, call reset_index on both frames, merge on the resulting index column, and then call set_index:

df = pd.merge(df1[[]].reset_index(), df2.reset_index(), on='index').set_index('index')
df.index.name = None
print(df)

          0         1         2
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
2  0.096394  0.635641  0.068524
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
0  0.055852  0.118138  0.052386
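
For anyone who wants to try the order-preserving variant end to end, here is a small self-contained sketch. The df1 values are invented placeholders, only the df2 rows are copied from the question, and the labels in df1 are grouped the same way as in the question.

```python
import pandas as pd

# Hypothetical small version of the large frame; only its index matters here.
df1 = pd.DataFrame(
    [[0.1, 0.1, 0.1],
     [0.2, 0.2, 0.2],
     [0.3, 0.3, 0.3],
     [0.4, 0.4, 0.4]],
    index=[2, 2, 0, 0],
)
df2 = pd.DataFrame(
    [[0.055852, 0.118138, 0.052386],
     [0.453661, 0.665857, 0.441551],
     [0.096394, 0.635641, 0.068524]],
    index=[0, 1, 2],
)

# df1[[]] drops every column but keeps the index; reset_index turns
# the unnamed index into a regular column called "index".
left = df1[[]].reset_index()
right = df2.reset_index()

# Inner merge on the label column, then restore it as the index.
df = pd.merge(left, right, on="index").set_index("index")
df.index.name = None

print(df)
```

The result has one row per row of df1 that has a match in df2, carrying the df2 values for that label. Labels that exist only in df2 (label 1 here) drop out of the inner merge.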
answered Oct 20 '25 11:10 by jezrael


