
Pandas boolean index with NaN

I have this toy example which captures my real problem:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['car there is','car not working', 'bus there is']})
df.iloc[1] = np.nan
idx = df['A'].str.contains('car')
df['IsCar'] = 0
df.loc[idx,'IsCar'] = 1

When I run this code, I get the following error message:

ValueError: cannot index with vector containing NA / NaN values

Why can't I do this? Is there a fix that doesn't require replacing the NaN with something else?

asked Oct 20 '25 13:10 by fossekall



1 Answer

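The error occurs because str.contains returns NaN for missing values, so idx is not a pure boolean mask and cannot be used for indexing. str.contains has an na parameter (see docs) that sets the value returned for missing entries. Setting it to False makes missing values count as non-matches. Simply use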

idx = df['A'].str.contains('car', na=False)
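
For completeness, here is a sketch of the question's toy example with that one change applied. It assumes a reasonably recent pandas and otherwise reuses the names from the question unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['car there is', 'car not working', 'bus there is']})
df.iloc[1] = np.nan  # row 1 is now missing

# na=False: missing entries are reported as False instead of NaN,
# so idx is a plain boolean mask that .loc accepts
idx = df['A'].str.contains('car', na=False)

df['IsCar'] = 0
df.loc[idx, 'IsCar'] = 1
print(df)
```

With this mask, only the first row is flagged: the NaN row and the 'bus' row both keep IsCar equal to 0, and the NaN in column A is left untouched.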
answered Oct 22 '25 03:10 by miradulo
