
Compare two columns using pandas 2

I'm comparing two columns in a dataframe (A & B). I have a method that works (C5). It came from this question: Compare two columns using pandas

I wondered why I couldn't get the other methods (C1 - C4) to give the correct answer:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

#df['C1'] = 1 [df['A'] == df['B']]

df['C2'] = df['A'].equals(df['B'])

df['C3'] = np.where((df['A'] == df['B']),0,1)

def fun(row):
    if ['A'] == ['B']:
        return 1
    else:
        return 0
df['C4'] = df.apply(fun, axis=1)

df['C5'] = df.apply(lambda x : 1 if x['A'] == x['B'] else 0, axis=1)

R. Cox asked Nov 17 '25 04:11



1 Answer

Use:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1,1,1,1,1,2,2,2,2,2],
                   'B': [1,1,1,1,1,1,0,0,0,0]})

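Series.equals tests whether the two Series are identical as a whole and returns a single boolean, so it cannot give a per-row result. For C1 and C2, compare the columns element-wise with == or eq to get a boolean mask, then convert it to integers so that True and False become 1 and 0: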

df['C1'] = (df['A'] == df['B']).astype(int)
df['C2'] = df['A'].eq(df['B']).astype(int)
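
To see the difference between equals and eq, here is a minimal sketch on two short Series. The names a, b, whole and per_row are made up for this illustration and are not part of the question:

```python
import pandas as pd

# Hypothetical toy data: the first two rows match, the last does not.
a = pd.Series([1, 1, 2])
b = pd.Series([1, 1, 0])

# equals() answers "are these two Series identical?" with one boolean.
# Assigning that to a column repeats the same value on every row.
whole = a.equals(b)

# eq() compares element by element and returns a boolean Series,
# which astype(int) turns into 1 for a match and 0 otherwise.
per_row = a.eq(b).astype(int)

print(whole)
print(per_row.tolist())
```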

For C3 the last two arguments must be swapped. np.where(condition, x, y) returns x where the condition is True, so a match needs 1 first and 0 second:

df['C3'] = np.where((df['A'] == df['B']),1,0)
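
A small sketch of how the argument order changes the result. The three-row frame and the variable names are assumptions chosen only for this example:

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame: only the first row has A equal to B.
df = pd.DataFrame({'A': [1, 2, 2], 'B': [1, 1, 0]})
match = df['A'] == df['B']

# Order used in the question: 0 where the rows match, 1 where they differ.
as_asked = np.where(match, 0, 1)

# Swapped order: 1 where the rows match, 0 where they differ.
swapped = np.where(match, 1, 0)

print(as_asked.tolist())
print(swapped.tolist())
```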

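In the function for C4, the values are never selected from the row. ['A'] == ['B'] compares two one-element lists of strings, which is always False, so index the row argument instead: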

def fun(row):
    if row['A'] == row['B']:
        return 1
    else:
        return 0
df['C4'] = df.apply(fun, axis=1)
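
As a rough side-by-side of the two versions, assuming a two-row frame and the made-up function names broken and fixed:

```python
import pandas as pd

def broken(row):
    # ['A'] and ['B'] are list literals holding the strings 'A' and 'B'.
    # They are never equal, and row is not used at all.
    return 1 if ['A'] == ['B'] else 0

def fixed(row):
    # row['A'] and row['B'] look up the values of the current row.
    return 1 if row['A'] == row['B'] else 0

# Hypothetical data: the first row matches, the second does not.
df = pd.DataFrame({'A': [1, 2], 'B': [1, 0]})

broken_result = df.apply(broken, axis=1).tolist()
fixed_result = df.apply(fixed, axis=1).tolist()

print(broken_result)
print(fixed_result)
```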

The C5 solution is already correct:

df['C5'] = df.apply(lambda x : 1 if x['A'] == x['B'] else 0, axis=1)
print (df)
   A  B  C1  C2  C3  C4  C5
0  1  1   1   1   1   1   1
1  1  1   1   1   1   1   1
2  1  1   1   1   1   1   1
3  1  1   1   1   1   1   1
4  1  1   1   1   1   1   1
5  2  1   0   0   0   0   0
6  2  0   0   0   0   0   0
7  2  0   0   0   0   0   0
8  2  0   0   0   0   0   0
9  2  0   0   0   0   0   0
jezrael answered Nov 19 '25 19:11
