DataFrame.equals() returns False when comparing data frames with the same content but initialized differently

The following code should create two identical, empty data frames, but the equality test returns False:

import pandas as pd

df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})
print(df1)
print(df2)
print(df1.equals(df2))

Here is the output produced by the code above:

Empty DataFrame
Columns: [A]
Index: []
Empty DataFrame
Columns: [A]
Index: []
False

Why does df1.equals(df2) return False?

jvisprime asked May 12 '26 04:05


1 Answer

pandas provides assert_frame_equal, which reports the first difference it finds instead of just returning False:

import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})

# Raises AssertionError describing the first mismatch it finds
assert_frame_equal(df1, df2)

Output

DataFrame.index classes are not equivalent
[left]:  Index([], dtype='object')
[right]: RangeIndex(start=0, stop=0, step=1)

Resetting the index on both frames removes that difference and exposes the next one:

assert_frame_equal(df1.reset_index(drop=True), df2.reset_index(drop=True))

Output

Attribute "dtype" are different
[left]:  object
[right]: float64
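
To see both differences at once without catching exceptions, the index class and column dtypes can be printed side by side. This is only a rough sketch: describe_structure is a hypothetical helper name, not part of pandas, and the two frames are built with explicit dtypes because what the constructors in the question produce for an empty frame may differ between pandas versions.

```python
import pandas as pd


def describe_structure(df):
    """Collect the metadata that assert_frame_equal complained about above."""
    return {
        "index_type": type(df.index).__name__,
        "index_dtype": str(df.index.dtype),
        "column_dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
    }


# Two empty frames with the same column name but explicitly different dtypes
left = pd.DataFrame({"A": pd.Series([], dtype="object")})
right = pd.DataFrame({"A": pd.Series([], dtype="float64")})

print(describe_structure(left))
print(describe_structure(right))

# equals() compares dtypes as well as values, so these do not count as equal
print(left.equals(right))
```

Calling describe_structure(df1) and describe_structure(df2) on the frames from the question should show the same kind of mismatch that the two error messages report.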

Finally, casting df2 to object so that the dtypes match makes the comparison return True:

df1.reset_index(drop=True).equals(df2.astype(object).reset_index(drop=True))
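
If this comparison is needed in more than one place, the same two steps can be wrapped in a small function. The sketch below assumes you really do want to ignore the index and the column dtypes; loosely_equal is a made-up name, and it casts both sides to object rather than only df2, which should give the same result for the frames in the question.

```python
import pandas as pd


def loosely_equal(left, right):
    """Compare two frames while ignoring the index and the column dtypes."""
    # Drop the existing index so both sides get a fresh default one
    left_norm = left.reset_index(drop=True).astype(object)
    right_norm = right.reset_index(drop=True).astype(object)
    return left_norm.equals(right_norm)


df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})
print(loosely_equal(df1, df2))
```

Keep in mind that casting to object discards dtype information on purpose, so this is looser than DataFrame.equals and not a drop-in replacement for it.
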
Chris answered May 13 '26 19:05
