
Get a list of features which contain empty values (python/pandas)

Tags:

python

pandas

I am trying to clean a dataset with pandas/Python by dropping every feature (column) that has 100 or more empty values. I am using the following command:

train.isnull().sum()>=100 

which gets me:

Id False
Feature 1 False
Feature 2 False 
Feature 3 True
Feature 4 False
Feature 5 True

I would like to return a new dataframe without features 3 and 5 (the ones flagged True above).

Thank you.

Liky asked Jan 21 '26 03:01


1 Answer

In your case, just run:

train[train.columns[train.isnull().sum()<100]]
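The same filter can also be written by passing the boolean mask straight to `.loc`. The sketch below is only an illustration: the `train` frame is a made-up stand-in (not the asker's data), built so that one column has exactly 100 missing values and another has 99, to show that the cutoff is inclusive.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `train` frame (150 rows).
train = pd.DataFrame({
    "Id": list(range(150)),
    "Feature 1": np.arange(150, dtype=float),
    "Feature 3": [np.nan] * 100 + [1.0] * 50,  # exactly 100 missing values
    "Feature 4": [np.nan] * 99 + [1.0] * 51,   # 99 missing values
})

# Count missing values per column, then keep columns with fewer than 100.
null_counts = train.isnull().sum()
cleaned = train.loc[:, null_counts < 100]

print(list(cleaned.columns))
```

Here `Feature 3` is dropped because it reaches the 100 threshold, while `Feature 4` stays. `cleaned` is a new DataFrame; `train` itself is not modified.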

Full example:

import pandas as pd
df = pd.DataFrame([[1,None,2],[3,4,None],[7,8,9]], columns = ['A','B','C'])

You'll get:

   A    B    C
0  1  NaN  2.0
1  3  4.0  NaN
2  7  8.0  9.0

then running:

df.isnull().sum()

will result in null count:

A    0
B    1
C    1

then just select the wanted columns:

df.columns[df.isnull().sum()<100]

and filter your data frame:

df[df.columns[df.isnull().sum()<100]]
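Note that with a threshold of 100 the three-row example frame keeps all of its columns, since no column has that many nulls. As a rough sketch, the version below uses a smaller, purely illustrative threshold (`max_nulls = 1`, a name introduced here) so the filtering is visible, and compares it with `DataFrame.dropna(axis=1, thresh=...)`, which expresses the same rule in terms of the minimum number of non-null values a column must have.

```python
import pandas as pd

df = pd.DataFrame([[1, None, 2], [3, 4, None], [7, 8, 9]], columns=["A", "B", "C"])

# Illustrative threshold: drop columns with at least this many nulls.
max_nulls = 1

# Approach from the answer: select column labels via the null-count mask.
by_mask = df[df.columns[df.isnull().sum() < max_nulls]]

# Alternative: dropna keeps columns with at least `thresh` non-null values.
# "fewer than max_nulls nulls" == "at least len(df) - max_nulls + 1 non-nulls".
by_dropna = df.dropna(axis=1, thresh=len(df) - max_nulls + 1)

print(list(by_mask.columns))
print(by_mask.equals(by_dropna))
```

Both approaches keep only column `A` here. For the original question the equivalent would be `max_nulls = 100`.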
Dimgold answered Jan 23 '26 17:01