How to handle non-numeric entries in an integer valued column

Tags: python, pandas

I have a dataframe df, and one of its columns, count, contains strings. These strings are mostly convertible to integers (e.g. 0006), which is what I intend to do with them. However, some of the entries in count are blank strings consisting only of spaces. How can I either:

  • Drop all the rows where the count value is a blank string.
  • Substitute all the blank values in that column with some numeric value of my choice.

The dataframe is very large, so I would be interested in any particularly efficient ways of doing this.

graffe asked Dec 01 '25 17:12


1 Answer

It seems that you want two different things. In either case, first convert the column to numeric, coercing entries that cannot be parsed (such as the blank strings) to NaN:

df['count'] = pd.to_numeric(df['count'], errors='coerce')
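As a minimal sketch of what the coercion step does, here is a small made-up frame (the sample values are assumptions for illustration, not the asker's data). Zero-padded strings such as 0006 parse as numbers, while the whitespace-only entry becomes NaN, which is also why the resulting column is float rather than integer.

```python
import pandas as pd

# Hypothetical sample data: zero-padded digits plus one blank entry
df = pd.DataFrame({"count": ["0006", "   ", "12"]})

# errors="coerce" turns anything unparseable into NaN instead of raising
converted = pd.to_numeric(df["count"], errors="coerce")

print(converted)          # numeric values, with NaN where the blank was
print(converted.isna())   # True only for the blank entry
```

Because to_numeric works on the whole column at once, it should generally scale better to a large frame than a Python-level loop or apply over the rows.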

To drop those rows (use subset so that rows with NaN in other columns are not dropped), and assign the result back, since dropna returns a new dataframe:

df = df.dropna(subset=['count'])

To replace the blank entries with a default value instead:

df['count'] = df['count'].fillna(default_value)
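Putting the steps together, the following is a sketch of both options on a small made-up frame. The sample data, the name column, and default_value = 0 are assumptions for illustration. The final astype(int) is an optional extra step not in the answer above: coercion leaves the column as float because of the NaN values, so it has to be cast back once no NaN remains.

```python
import pandas as pd

# Hypothetical sample data; "name" is only there to show other columns survive
df = pd.DataFrame({"count": ["0006", "   ", "12"], "name": ["a", "b", "c"]})

# Shared first step: blanks (and anything else unparseable) become NaN
df["count"] = pd.to_numeric(df["count"], errors="coerce")

# Option 1: drop the rows whose count was blank, then cast to int
dropped = df.dropna(subset=["count"]).copy()
dropped["count"] = dropped["count"].astype(int)

# Option 2: substitute a value of your choice (0 is an arbitrary assumption)
default_value = 0
filled = df.copy()
filled["count"] = filled["count"].fillna(default_value).astype(int)

print(dropped)
print(filled)
```

Note that coercion also turns any other non-numeric text into NaN, not just the blank strings, so it may be worth checking how many NaN values appear before dropping or filling.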
IanS answered Dec 04 '25 08:12


