Convert column values to NaN using np.where

Question

I cannot figure out how to use the index results from np.where in a for loop. I want to use this for loop to ONLY change the values of a column given the np.where index results.

This is a hypothetical example for a situation where I want to find the indexed location of certain problems or anomalies in my dataset, grab their locations with np.where, and then run a loop on the dataframe to recode them as NaN, while leaving every other index untouched.

Here is my simple code attempt so far:

import pandas as pd
import numpy as np

# import iris
df = pd.read_csv('https://raw.githubusercontent.com/rocketfish88/democ/master/iris.csv')

# conditional np.where -- hypothetical problem data
find_error = np.where((df['petal_length'] == 1.6) & 
                  (df['petal_width'] == 0.2))

# loop over column to change error into NA
for i in enumerate(find_error):
    df = df['species'].replace({'setosa': np.nan})

# df[i] is a problem but I cannot figure out how to get around this or an alternative

cs95 · Accepted Answer

You can build a boolean mask and assign directly to the matching rows of the column with .loc:

m = (df['petal_length'] == 1.6) & (df['petal_width'] == 0.2)
df.loc[m, 'species'] = np.nan
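
As a minimal, self-contained sketch of the mask-and-assign approach, the snippet below uses a small hand-made frame instead of the iris CSV. The four rows are invented for illustration only (an assumption, not the real data); the column names match the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data (values invented for illustration)
df = pd.DataFrame({
    'petal_length': [1.4, 1.6, 1.6, 4.7],
    'petal_width': [0.2, 0.2, 0.4, 1.4],
    'species': ['setosa', 'setosa', 'setosa', 'versicolor'],
})

# Boolean mask: True only where BOTH conditions hold
m = (df['petal_length'] == 1.6) & (df['petal_width'] == 0.2)

# Assign NaN to 'species' in the masked rows; every other row is left untouched
df.loc[m, 'species'] = np.nan

print(df)
```

Only the second row satisfies both conditions here, so only its species value becomes NaN. No loop is needed because .loc applies the assignment to every row where the mask is True.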

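Or, fixing your code by giving np.where all three arguments (the condition, the value where it is True, the value where it is False):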

df['species'] = np.where(m, np.nan, df['species'])
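
If you do want to work with the index results from np.where, as in the question, here is one possible sketch. Called with only a condition, np.where returns a tuple of arrays, so the loop in the question iterates over that one-element tuple rather than over rows, and its body replaces every 'setosa' while overwriting df with a Series. Taking element [0] of the tuple gives the row positions, which can be passed to .iloc. The small frame and the names rows and col are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data (values invented for illustration)
df = pd.DataFrame({
    'petal_length': [1.4, 1.6, 1.6, 4.7],
    'petal_width': [0.2, 0.2, 0.4, 1.4],
    'species': ['setosa', 'setosa', 'setosa', 'versicolor'],
})

cond = (df['petal_length'] == 1.6) & (df['petal_width'] == 0.2)

# np.where(cond) returns a tuple with one array per dimension;
# element [0] holds the integer positions of the matching rows
rows = np.where(cond)[0]

# .iloc is purely positional, so look up the column's position as well
col = df.columns.get_loc('species')

# Recode only the located rows; no explicit for loop is required
df.iloc[rows, col] = np.nan

print(rows)
print(df)
```

Positions and .iloc are used together because np.where returns positions, not index labels; the two only coincide when the frame has a default RangeIndex.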

Or, using Series.mask:

df['species'] = df['species'].mask(m)
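
As a rough check that these approaches agree, the sketch below applies each one to a copy of the same small frame and compares the results. The data and the variable names by_loc, by_where and by_mask are hypothetical, chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data (values invented for illustration)
df = pd.DataFrame({
    'petal_length': [1.4, 1.6, 1.6, 4.7],
    'petal_width': [0.2, 0.2, 0.4, 1.4],
    'species': ['setosa', 'setosa', 'setosa', 'versicolor'],
})
m = (df['petal_length'] == 1.6) & (df['petal_width'] == 0.2)

# 1) .loc assignment on a copy of the column
by_loc = df['species'].copy()
by_loc.loc[m] = np.nan

# 2) three-argument np.where: NaN where m is True, original value elsewhere
by_where = pd.Series(np.where(m, np.nan, df['species']), index=df.index)

# 3) Series.mask: replaces values where m is True (NaN by default)
by_mask = df['species'].mask(m)

# All three should blank out exactly the rows selected by the mask
same_missing = (
    by_loc.isna().tolist() == by_where.isna().tolist() == by_mask.isna().tolist()
)
print(same_missing)
```

Series.mask(m) is the mirror image of Series.where(~m): mask replaces values where the condition is True, while where keeps them.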

Tags:

python

python-3.x

pandas

numpy

John Stud
