
Why is every item in a pd.DataFrame column a float, but the dtype of the column is object?

results_table is a pd.DataFrame.

When I run

print(type(results_table.loc[0,'Mean recall score']))

it returns

<class 'numpy.float64'>

Every item is a float.

But when I run

print(results_table['Mean recall score'].dtype)

it returns

object

Why does this happen?

SIRIUS asked Dec 21 '25 00:12


1 Answer

First, note that df.loc[0, x] looks only at the value at row label 0 and column label x, not at your entire dataframe. Now consider an example:

import pandas as pd

df = pd.DataFrame({'A': [1.5, 'hello', 'test', 2]}, dtype=object)

print(type(df.loc[0, 'A']))  # type of single element in series

# <class 'float'>

print(df['A'].dtype)         # type of series

# object

As you can see, an object dtype series can hold arbitrary Python objects. You can even, if you wish, extract the type of each element of your series:

print(df['A'].map(type))

# 0    <class 'float'>
# 1      <class 'str'>
# 2      <class 'str'>
# 3      <class 'int'>
# Name: A, dtype: object
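For a longer column, printing every element's type is unwieldy. A minimal sketch of one way to summarise it, reusing the example df above (the names type_counts and all_floats are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1.5, 'hello', 'test', 2]}, dtype=object)

# Count the elements by the name of their Python type.
type_counts = df['A'].map(lambda value: type(value).__name__).value_counts()
print(type_counts)

# True only if every element is a float (numpy.float64 also passes this
# check, because it subclasses Python's float).
all_floats = df['A'].map(lambda value: isinstance(value, float)).all()
print(all_floats)
```

Even if all_floats came out True for your column, the dtype would still be object, because the dtype describes how the column is stored rather than what its elements happen to be.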

An object dtype series is simply a collection of pointers to arbitrary Python objects. The objects themselves are not held in one contiguous memory block, whereas the values of a numeric series may be. This is comparable to a Python list, and it explains why operations on an object series are slower than on a numeric series.
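This also covers the situation in the question: an object column can hold nothing but floats and still report object as its dtype. The sketch below is a guess at that situation, not the asker's actual code. The values in results_table are made up, and pd.to_numeric is just one of several ways to get a numeric column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's results_table: the column is
# created with dtype=object, but every element is a float.
results_table = pd.DataFrame(
    {'Mean recall score': [np.float64(0.91), np.float64(0.88), np.float64(0.93)]},
    dtype=object,
)

col = results_table['Mean recall score']
print(type(col.loc[0]))  # type of a single element
print(col.dtype)         # dtype of the series: still object

# Convert to a real numeric series; this raises if a value is not numeric.
numeric = pd.to_numeric(col)
print(numeric.dtype)     # float64
```

If the conversion succeeds, you can assign the result back to the column to get the faster numeric storage described above.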

See also this answer for a visual representation of the above.

jpp answered Dec 22 '25 15:12


