
Pandas elementwise conditional operation over multiple dataframes

I'd like to create a dataframe from a combination of elementwise conditional operations over multiple dataframes that share the same structure (same index, same columns).

Here I've created three dataframes with the same structure.

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(5,3), columns=['a','b','c'],index = pd.date_range(start='2000.01.01', periods=5, freq='D'))
df2 = pd.DataFrame(np.random.rand(5,3), columns=['a','b','c'],index = pd.date_range(start='2000.01.01', periods=5, freq='D'))
df3 = pd.DataFrame(np.random.rand(5,3), columns=['a','b','c'],index = pd.date_range(start='2000.01.01', periods=5, freq='D'))

df1
                   a         b         c
2000-01-01  0.457567  0.157506  0.185594
2000-01-02  0.709991  0.486635  0.839173
2000-01-03  0.503184  0.640214  0.895055
2000-01-04  0.940231  0.591708  0.019716
2000-01-05  0.246132  0.596872  0.437000

df2
                   a         b         c
2000-01-01  0.722588  0.696100  0.176172
2000-01-02  0.275177  0.162525  0.347674
2000-01-03  0.248735  0.887237  0.175126
2000-01-04  0.444136  0.337881  0.830616
2000-01-05  0.526365  0.803296  0.574811

df3 
                   a         b         c
2000-01-01  0.392965  0.107987  0.139133
2000-01-02  0.751523  0.658844  0.174854
2000-01-03  0.509276  0.380294  0.406262
2000-01-04  0.669822  0.079491  0.233737
2000-01-05  0.659077  0.094545  0.826730

Here goes my pseudocode:

df4 = if (df1 > 0.5 and df2 < 0.3 and df3 > 0.6, 1, 0)

What is the simplest and most efficient way to write this?

asked Mar 03 '26 10:03 by Wookeun Lee


1 Answer

pandas

(df1.gt(.5) & df2.lt(.3) & df3.gt(.6)).astype(int)

            a  b  c
2000-01-01  0  0  0
2000-01-02  1  0  0
2000-01-03  0  0  0
2000-01-04  0  0  0
2000-01-05  0  0  0
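
The pseudocode's `and` cannot be used here: it asks Python for a single True/False from a whole frame, which pandas refuses with a ValueError. A minimal sketch of the operator spelling of the same expression, assuming the sample values printed in the question (typed in by hand here, so the setup is illustrative rather than the asker's random data):

```python
import pandas as pd

idx = pd.date_range(start="2000-01-01", periods=5, freq="D")
cols = ["a", "b", "c"]

# Values copied from the frames printed in the question.
df1 = pd.DataFrame([[0.457567, 0.157506, 0.185594],
                    [0.709991, 0.486635, 0.839173],
                    [0.503184, 0.640214, 0.895055],
                    [0.940231, 0.591708, 0.019716],
                    [0.246132, 0.596872, 0.437000]], index=idx, columns=cols)
df2 = pd.DataFrame([[0.722588, 0.696100, 0.176172],
                    [0.275177, 0.162525, 0.347674],
                    [0.248735, 0.887237, 0.175126],
                    [0.444136, 0.337881, 0.830616],
                    [0.526365, 0.803296, 0.574811]], index=idx, columns=cols)
df3 = pd.DataFrame([[0.392965, 0.107987, 0.139133],
                    [0.751523, 0.658844, 0.174854],
                    [0.509276, 0.380294, 0.406262],
                    [0.669822, 0.079491, 0.233737],
                    [0.659077, 0.094545, 0.826730]], index=idx, columns=cols)

# Parentheses are required: & binds tighter than > and <.
df4 = ((df1 > 0.5) & (df2 < 0.3) & (df3 > 0.6)).astype(int)

# `and` needs one truth value per operand, so it fails on whole frames.
try:
    _ = (df1 > 0.5) and (df2 < 0.3)
    and_worked = True
except ValueError:
    and_worked = False
```

The `.gt()` / `.lt()` methods avoid the extra parentheses, which is mostly a matter of taste.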

with some numpy v1

pd.DataFrame(
    (
        (df1.values > .5) &
        (df2.values < .3) &
        (df3.values > .6)
    ).astype(int),
    df1.index, df1.columns
)

            a  b  c
2000-01-01  0  0  0
2000-01-02  1  0  0
2000-01-03  0  0  0
2000-01-04  0  0  0
2000-01-05  0  0  0
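
If the number of conditions grows, chaining `&` gets unwieldy. One possible variation, not part of the approach above, is to collect the boolean arrays in a list and combine them with `np.logical_and.reduce`. The frames below are small hypothetical stand-ins with identical labels, and `.to_numpy()` is used in place of `.values`:

```python
import numpy as np
import pandas as pd

# Small illustrative frames (hypothetical values, identical labels).
idx = pd.date_range("2000-01-01", periods=2, freq="D")
df1 = pd.DataFrame([[0.7, 0.2], [0.9, 0.6]], index=idx, columns=["a", "b"])
df2 = pd.DataFrame([[0.1, 0.1], [0.5, 0.2]], index=idx, columns=["a", "b"])
df3 = pd.DataFrame([[0.8, 0.9], [0.7, 0.3]], index=idx, columns=["a", "b"])

# One boolean array per condition; append more entries as needed.
masks = [
    df1.to_numpy() > 0.5,
    df2.to_numpy() < 0.3,
    df3.to_numpy() > 0.6,
]

# logical_and.reduce ANDs the stacked masks together along the first axis.
combined = np.logical_and.reduce(masks)

# Re-attach the labels from df1, as in the version above.
df4 = pd.DataFrame(combined.astype(int), index=df1.index, columns=df1.columns)
```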

with some numpy v2

pd.DataFrame(
    np.where(
        (df1.values > .5) &
        (df2.values < .3) &
        (df3.values > .6), 1, 0
    ),
    df1.index, df1.columns
)

            a  b  c
2000-01-01  0  0  0
2000-01-02  1  0  0
2000-01-03  0  0  0
2000-01-04  0  0  0
2000-01-05  0  0  0
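
One caveat with both numpy versions: they compare by position and take the labels from `df1`, so they rely on the question's premise that all frames share the same index and columns in the same order. A rough sketch of a guard for that, where `same_labels` is a hypothetical helper name and the frames are small made-up examples:

```python
import numpy as np
import pandas as pd

def same_labels(*frames):
    """Return True if every frame has the first frame's index and columns, in order."""
    first = frames[0]
    return all(
        f.index.equals(first.index) and f.columns.equals(first.columns)
        for f in frames[1:]
    )

# Small illustrative frames (hypothetical values, identical labels).
idx = pd.date_range("2000-01-01", periods=2, freq="D")
df1 = pd.DataFrame([[0.7, 0.2], [0.9, 0.6]], index=idx, columns=["a", "b"])
df2 = pd.DataFrame([[0.1, 0.1], [0.5, 0.2]], index=idx, columns=["a", "b"])
df3 = pd.DataFrame([[0.8, 0.9], [0.7, 0.3]], index=idx, columns=["a", "b"])

# Same labels as df3 but with the rows in reverse order.
shuffled = df3.iloc[::-1]

ok = same_labels(df1, df2, df3)
bad = same_labels(df1, df2, shuffled)

if ok:
    # Safe to drop to numpy: positions and labels line up.
    df4 = pd.DataFrame(
        np.where(
            (df1.to_numpy() > 0.5) &
            (df2.to_numpy() < 0.3) &
            (df3.to_numpy() > 0.6), 1, 0
        ),
        index=df1.index, columns=df1.columns
    )
```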
answered Mar 06 '26 00:03 by piRSquared