Check for datatypes of columns in pandas

I have an Excel file which I'm importing as a pandas DataFrame.

My dataframe df:

id    name    value
1      abc     22.3
2      asd     11.9
3      asw     2.4

I have a dictionary d in format:

{ 'name' : 'str',
  'value' : 'float64',
  'id' : 'int64'}

I want to check whether the data types of the columns in my dataframe are the same as those defined in the dictionary.

The output can be just a string. If all the columns have their respective data types:

print("Success")

Otherwise, something like:

print("Column id has a different data type. Please check your file.")
Shubham R asked Oct 17 '25 15:10


2 Answers

Take df.dtypes, convert it to a dictionary of dtype names, and compare.

d1 = df.dtypes.astype(str).to_dict()

d1
{'id': 'int64', 'name': 'object', 'value': 'float64'}

d1 == {'name' : 'str', 'value' : 'float64', 'id' : 'int64'}
False 

Unfortunately, name is reported as an object column, not str, hence the False. I recommend iterating over your dictionary d and changing every entry where str appears to object (this shouldn't hurt):

d2 = {k : 'object' if v == 'str' else v for k, v in d.items()}

d2
{'id': 'int64', 'name': 'object', 'value': 'float64'}

d1 == d2
True

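To check which column(s) are incorrect, the solution becomes a little more involved, but is still quite easy with a list comprehension. Compared against the original dictionary d (against the normalized d2 the result would be an empty list):

[k for k in d1 if d1[k] != d.get(k)]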
['name']
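
To produce the kind of message the question asks for, the comparison above can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the function name find_dtype_mismatches and the set STRING_DTYPE_NAMES are made up for illustration, and the sketch assumes that a string column may be reported as object or as a dedicated string dtype, depending on the pandas version and on how the column was created.

```python
import pandas as pd

# Dtype names a string column may report (assumption made for this sketch).
STRING_DTYPE_NAMES = {"object", "str", "string"}


def find_dtype_mismatches(df, expected):
    """Return {column: (expected, actual)} for every column that does not match."""
    actual = df.dtypes.astype(str).to_dict()
    mismatches = {}
    for column, want in expected.items():
        got = actual.get(column)  # None if the column is missing from df
        if want == "str":
            ok = got in STRING_DTYPE_NAMES
        else:
            ok = got == want
        if not ok:
            mismatches[column] = (want, got)
    return mismatches


df = pd.DataFrame({"id": [1, 2, 3],
                   "name": ["abc", "asd", "asw"],
                   "value": [22.3, 11.9, 2.4]})
d = {"name": "str", "value": "float64", "id": "int64"}

mismatches = find_dtype_mismatches(df, d)
if not mismatches:
    print("Success")
else:
    for column, (want, got) in mismatches.items():
        print(f"Column {column} has data type {got}, expected {want}. "
              "Please check your file.")
```

Returning the mismatches as a dictionary keeps the check separate from the reporting, so the same helper can feed a log message or an exception instead of print.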
cs95 answered Oct 20 '25 06:10


Compare df.dtypes with the dictionary converted to a Series:

In [5759]: s = df.dtypes == pd.Series(d)

In [5760]: ss = s[~s]

In [5761]: if ss.empty:
      ...:     print('Success')
      ...: else:
      ...:     print('columns %s have different data type' % ss.index.tolist())
      ...:
      ...:
columns ['name'] have different data type

Details

In [5763]: df
Out[5763]:
   id name  value
0   1  abc   22.3
1   2  asd   11.9
2   3  asw    2.4

In [5764]: d
Out[5764]: {'id': 'int64', 'name': 'str', 'value': 'float64'}
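
One caveat with this approach: == between two Series expects both to carry the same labels in the same order, so in recent pandas versions the comparison may raise a ValueError when the keys of d are ordered differently from the dataframe's columns (as they are in the question). A minimal sketch that sidesteps this by reindexing the expected dtypes to the column order first; the variable names expected, matches and bad_columns are made up for illustration, and whether name is flagged depends on how your pandas version reports string columns:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3],
                   "name": ["abc", "asd", "asw"],
                   "value": [22.3, 11.9, 2.4]})
# Key order deliberately differs from df.columns, as in the question.
d = {"name": "str", "value": "float64", "id": "int64"}

# Put the expected dtype names in the same order as the dataframe columns.
# Columns absent from d become NaN and therefore count as mismatches.
expected = pd.Series(d).reindex(df.columns)

# Compare dtype names as plain strings, column by column.
matches = df.dtypes.astype(str) == expected
bad_columns = matches[~matches].index.tolist()

if not bad_columns:
    print("Success")
else:
    print("columns %s have different data type" % bad_columns)
```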
Zero answered Oct 20 '25 06:10