I'm trying to find missing values and then drop them. I've searched online but can't seem to find an answer.
Extracted Dataframe:
In the df, the entries for 1981 and 1982 are '-', i.e. they should count as missing values. I would like to find these missing values and then drop them.
Exported Dataframe using isnull:
I used df.isnull(), but the 1981 and 1982 entries come back as 'False', which means pandas sees data there. Those cells contain '-', so they should be treated as missing values.
I have pasted my code below. What am I missing?
import pandas as pd
mydf = pd.read_excel('abc.xlsx', sep='\t')
df1 = mydf.set_index('Variables')
df = df1[0:10]
print(df)
print(df.isnull())
The question has two parts: finding which columns have missing values, and dropping those values.
To find the missing values in a dataframe df:
missing = df.isnull().sum()
print(missing)
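Note that isnull() only flags real NaN/None values, so a literal '-' string is counted as data, which matches what the question describes. Below is a minimal sketch of converting the placeholder first. It assumes the placeholder is exactly the string '-', and it uses a small made-up frame (the values and the names `before`, `cleaned` and `after` are illustrative) rather than the real abc.xlsx:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data; '-' marks a missing entry.
df = pd.DataFrame(
    {1980: [1.5, 2.0], 1981: ["-", "-"], 1982: ["-", 3.0]},
    index=pd.Index(["a", "b"], name="Variables"),
)

# isnull() treats '-' as an ordinary string, so nothing is flagged yet.
before = df.isnull().sum()

# Turn the placeholder into a real NaN, then count again per column.
cleaned = df.replace("-", np.nan)
after = cleaned.isnull().sum()
print(after)
```

Alternatively, passing na_values='-' to pd.read_excel should do the same conversion at load time, so the replace step would not be needed.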
To drop those missing values, in addition to @jezrael's suggestion (or if that doesn't help), I suggest using dropna:
Drop the rows where all elements are missing.
df.dropna(how='all')
Drop the columns where at least one element is missing.
df.dropna(axis='columns')
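As a rough illustration of how the two calls behave, here is a small sketch on a made-up frame (the column names and values are assumptions, not the asker's data). Keep in mind that dropna returns a new dataframe and leaves the original untouched, so the result has to be assigned:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is entirely missing, column 'y' has one extra gap.
df = pd.DataFrame(
    {
        "x": [1.0, np.nan, 3.0],
        "y": [np.nan, np.nan, 6.0],
        "z": [7.0, np.nan, 9.0],
    }
)

# Drop rows where every element is missing (removes row 1 only).
rows_kept = df.dropna(how="all")

# From what is left, drop columns that still contain any missing element.
result = rows_kept.dropna(axis="columns")

print(result)
```

Chaining the two in this order keeps columns 'x' and 'z' here; calling df.dropna(axis='columns') directly on the original frame would drop every column, because the all-missing row puts a NaN in each of them.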