Let's say I have a dataframe:
   A  B    C    D    E    F
0  x  R    i    R  nan    h
1  z  g    j    x    a  nan
2  z  h  nan    y  nan  nan
3  x  g  nan  nan  nan  nan
4  x  x    h    x    s    f
I want to replace with np.nan all the cells that are in a column where df.loc[0] == 'R', are in row 2 or below, and are != 'x'.
Essentially I want to do:
df.loc[2:,df.loc[0]=='R']!='x' = np.nan
I get the error:
SyntaxError: can't assign to comparison
I just don't know how the syntax is supposed to be.
I've tried
df[df.loc[2:,df.loc[0]=='R']!='x']
but this doesn't list the values I want.
mask = df.ne('x') & df.iloc[0].eq('R')
mask.iloc[:2] = False
df.mask(mask)
   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
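For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the frame shown in the question, and the name `result` is just for illustration. Note that `df.mask` returns a new frame rather than changing `df` in place, so the result has to be kept or assigned back.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (np.nan marks the empty cells).
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

# True where the cell is not 'x' AND the column's first row is 'R'.
mask = df.ne("x") & df.iloc[0].eq("R")
# Leave the first two rows untouched.
mask.iloc[:2] = False

# DataFrame.mask returns a new frame with NaN wherever mask is True;
# df itself stays as it was unless you reassign (df = df.mask(mask)).
result = df.mask(mask)
print(result)
```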
Build the mask up
df.ne('x') gives
        A      B     C      D     E     F
 0  False   True  True   True  True  True
 1   True   True  True  False  True  True
 2   True   True  True   True  True  True
 3  False   True  True   True  True  True
 4  False  False  True  False  True  True
But we want that in conjunction with df.iloc[0].eq('R'), which is a Series. Turns out that if we just & those two together, pandas aligns the Series index with the columns of the df.ne('x') mask above, so each column gets combined with its own True/False flag.
 A    False
 B     True
 C    False
 D     True
 E    False
 F    False
 Name: 0, dtype: bool
 # &
        A      B     C      D     E     F
 0  False   True  True   True  True  True
 1   True   True  True  False  True  True
 2   True   True  True   True  True  True
 3  False   True  True   True  True  True
 4  False  False  True  False  True  True
 # GIVES YOU
        A      B      C      D      E      F
 0  False   True  False   True  False  False
 1  False   True  False  False  False  False
 2  False   True  False   True  False  False
 3  False   True  False   True  False  False
 4  False  False  False  False  False  False
Finally, we want to exclude the first two rows from these shenanigans so...
 mask.iloc[:2] = False
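To check the alignment claim for yourself, here is a small sketch that compares the `&` result against the same operation written with explicit NumPy broadcasting. The variable names (`not_x`, `header_is_r`, `aligned`, `broadcast`) are mine, and the frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

not_x = df.ne("x")                # DataFrame of booleans, one per cell
header_is_r = df.iloc[0].eq("R")  # Series of booleans, indexed by column label

# DataFrame & Series matches the Series index against the DataFrame columns,
# so every row of not_x is AND-ed with the same per-column flags.
aligned = not_x & header_is_r

# The same thing with plain arrays: a (5, 6) array AND-ed with a (6,) array
# broadcasts the flags across all rows.
broadcast = not_x.to_numpy(dtype=bool) & header_is_r.to_numpy(dtype=bool)

print((aligned.to_numpy() == broadcast).all())
```

The array version only works because both sides happen to be in the same column order. The pandas version matches by label, which is why the answer leans on it.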
Try with:
mask = df.iloc[0] == 'R'
df.loc[2:, mask] = df.loc[2:, mask].where(df.loc[2:, mask] == 'x')
Output:
   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
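If you need this more than once, the selection the question describes (columns whose first row is 'R', rows from position 2 on, cells that are not 'x') can be wrapped up as a function. This is only a sketch: `blank_unless` and its parameter names are hypothetical, not part of pandas, and the frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd


def blank_unless(df, header_value, keep_value, start):
    """Hypothetical helper, not a pandas API.

    Returns a copy of df in which, for columns whose first row equals
    header_value, every cell from row position `start` onward that is not
    keep_value is replaced with NaN.
    """
    out = df.copy()
    # Column labels whose first-row value matches header_value.
    cols = out.columns[out.iloc[0].eq(header_value).to_numpy()]
    # The sub-frame we are allowed to modify.
    block = out.iloc[start:][cols]
    # where() keeps cells that satisfy the condition and sets the rest to NaN.
    out.loc[block.index, cols] = block.where(block.eq(keep_value))
    return out


df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

result = blank_unless(df, header_value="R", keep_value="x", start=2)
print(result)
```

Working on a copy keeps the original frame intact, which makes it easier to compare before and after.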