In pandas I can search and replace all fields that contain the word fish, for example, using df.replace(r'.*fish.*', 'foo', regex=True).
But how do I search and replace all fields that don't contain the word fish?
That is, in my example, replace every field that doesn't contain the word fish with the word 'foo'.
For example, say the dataframe is
applefish pear
water afishfarm
I would like this to be transformed to
applefish foo
foo afishfarm
You can use a negative lookahead assertion, (?!...). The pattern ^(?!.*fish).*$ first asserts that the string doesn't contain the word fish, then matches everything to the end of the string so that it can be replaced with foo.
^ denotes the beginning of the string. Combined with (?!.*fish), it asserts at the beginning of the string that nothing matching .*fish follows, i.e. that fish does not occur anywhere in the string.
.*$ then matches the whole string, from the beginning to the end, and the match is replaced with foo.
If the assertion fails, the pattern doesn't match and nothing happens. So:
df.replace(r'^(?!.*fish).*$', 'foo', regex=True)
# 0 1
#0 applefish foo
#1 foo afishfarm
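To see the lookahead on its own, outside pandas, here is a minimal sketch using Python's standard re module on the four cell values from the question. The names NO_FISH, cells and replaced are only illustrative.

```python
import re

# Same pattern as in the answer: match the whole string only when
# "fish" does not occur anywhere in it.
NO_FISH = re.compile(r'^(?!.*fish).*$')

cells = ["applefish", "pear", "water", "afishfarm"]

# Cells containing "fish" are left alone because the lookahead fails
# and the pattern never matches; the others are replaced as a whole.
replaced = [NO_FISH.sub("foo", cell) for cell in cells]
print(replaced)  # ['applefish', 'foo', 'foo', 'afishfarm']
```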
If the string can contain multiple words:
df
# 0 1
#0 applefish pear pear
#1 water afishfarm
You can use the word boundary \b in place of ^ and the word-character class \w in place of . so that each word is matched and replaced separately:
df.replace(r'\b(?!.*fish)\w+', 'foo', regex=True)
# 0 1
#0 applefish foo foo
#1 foo afishfarm
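One caveat worth checking: the lookahead (?!.*fish) scans to the end of the cell, so a word that comes before a fish word in the same cell is left unchanged. The sketch below uses Python's re module to compare the pattern above with a variant that restricts the lookahead to the current word via \w*. That variant is a suggested adjustment, not part of the original answer, and it assumes each word should be judged on its own.

```python
import re

# Pattern from the answer: the lookahead inspects the rest of the cell.
original = re.compile(r'\b(?!.*fish)\w+')

# Suggested variant (an assumption, not from the answer): the lookahead
# only inspects the current word, because \w* cannot cross a space.
per_word = re.compile(r'\b(?!\w*fish)\w+')

text = "pear applefish pear"

rest_of_cell = original.sub("foo", text)
word_only = per_word.sub("foo", text)

print(rest_of_cell)  # pear applefish foo
print(word_only)     # foo applefish foo
```

The same per-word pattern can be passed to df.replace(..., regex=True) in place of the one above if that behavior is what you need.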
You can use apply with str.contains:
df.apply(lambda x: x.replace(x[~x.str.contains('fish')], 'foo'))
You get
0 1
0 applefish foo
1 foo afishfarm
Note: I wouldn't even recommend this as Psidom's solution is way more efficient.
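The same boolean-mask idea can also be written with DataFrame.where, which keeps values where the mask is True and substitutes the replacement elsewhere. This may be useful if your pandas version does not accept a Series as the first argument to replace. It is a minimal sketch, assuming the two-column frame from the question and that every cell is a string; the names has_fish and result are illustrative.

```python
import pandas as pd

# Frame from the question (assumed layout: two unnamed columns).
df = pd.DataFrame([["applefish", "pear"],
                   ["water", "afishfarm"]])

# True where a cell contains the literal text "fish".
# regex=False treats the pattern as plain text; na=False treats
# missing values as "does not contain".
has_fish = df.apply(lambda col: col.str.contains("fish", regex=False, na=False))

# Keep cells where has_fish is True, put 'foo' everywhere else.
result = df.where(has_fish, "foo")
print(result)
```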