
How can I replace missing boolean values using python?

In my dataset, one of the columns is boolean. There are missing values in that column and in several continuous columns. The continuous ones are successfully replaced with their column mean, but the mean cannot be used for a missing boolean. So how can I replace those values?

Note that the boolean is 1 or 0 in my dataset.

Below is the code for replacing continuous missing values:

import numpy as np
from sklearn.impute import SimpleImputer

# x holds the continuous columns; each NaN is replaced by its column mean
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer.fit(x)
x = imputer.transform(x)

Thank You

Usama Waseem asked Apr 30 '26 05:04



1 Answer

There are several ways to approach this issue:

  1. If you can afford it (that is, you have enough data), exclude the rows that have a missing boolean.
  2. Replace the missing values with the majority value of the column (the mode). This is the boolean counterpart of replacing a continuous value with the mean.
  3. For time series, replace the missing cell with the mean of the x cells before and after it, then apply a threshold: if the mean is above the threshold the value becomes 1, otherwise it becomes 0.
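
The three options above could be sketched roughly as follows. This is only an illustration under some assumptions: the toy DataFrame, the column names `value` and `flag`, the helper `fill_from_neighbors` and its `window` and `threshold` parameters are all made up for the example, and pandas is used for convenience even though the question only mentions scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data: 'flag' is the 0/1 boolean column with gaps.
df = pd.DataFrame({
    "value": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "flag":  [1.0, np.nan, 1.0, 0.0, np.nan, 1.0],
})

# Option 1: drop the rows whose boolean is missing.
dropped = df.dropna(subset=["flag"])

# Option 2: fill with the majority value (the mode) of the column.
mode_imputer = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
flag_mode = mode_imputer.fit_transform(df[["flag"]]).ravel()

# Option 3 (time series): mean of the neighbouring cells, then a threshold.
def fill_from_neighbors(series, window=1, threshold=0.5):
    """Fill NaNs with 1.0 if the mean of up to `window` cells on each
    side is above `threshold`, else 0.0 (hypothetical helper)."""
    # Centered rolling mean; NaNs inside the window are ignored.
    local_mean = series.rolling(2 * window + 1, center=True, min_periods=1).mean()
    guess = (local_mean > threshold).astype(float)
    # If every neighbour is missing too, leave the cell as NaN.
    guess = guess.where(local_mean.notna())
    return series.fillna(guess)

flag_ts = fill_from_neighbors(df["flag"])
```

Note that in this sketch a neighbourhood mean exactly equal to the threshold is mapped to 0. Whether ties should go to 0 or 1 is a choice that depends on the data.
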
Michael Fleicher Tal answered May 02 '26 19:05
