I am trying to remove a trailing 'OF' from a column in a pandas DataFrame. I tried 'rstrip' and 'split', but 'rstrip' also removes a lone trailing 'O' or 'F', and I only want to remove the exact suffix 'OF'. How can I do that, and why does rstrip remove 'O' and 'F' when I specifically passed 'OF'? Sorry if this question was asked before, I just couldn't find it. Thanks.
Sample Data:
import pandas as pd
l1 = [1,2,3,4]
l2 = ['UNIVERSITY OF CONN. OF','ONTARIO','UNIV. OF TORONTO','ALASKA DEPT.OF']
df = pd.DataFrame({'some_id':l1,'org':l2})
df
some_id org
1 UNIVERSITY OF CONN. OF
2 ONTARIO
3 UNIV. OF TORONTO
4 ALASKA DEPT.OF
Tried:
df.org.str.rstrip('OF')
# df.org.str.split('OF')[0] # Not what I am looking for
Results:
0 UNIVERSITY OF CONN. # works
1 ONTARI # 'O' was removed
2 UNIV. OF TORONT # 'O' was removed
3 ALASKA DEPT. # works
Final output needed:
0 UNIVERSITY OF CONN.
1 ONTARIO
2 UNIV. OF TORONTO
3 ALASKA DEPT.
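You can try this regex (pass regex=True, since pandas 2.0 and later treat the pattern as a literal string by default):
df.org = df.org.str.replace('(OF)$', '', regex=True)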
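where $ indicates the end of string. Note that
df.org.str.rstrip('(OF)')
does not work: rstrip treats its argument as a set of characters to strip rather than as a suffix, so it still removes the trailing 'O' from 'ONTARIO' and 'TORONTO'.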
Output:
0 UNIVERSITY OF CONN.
1 ONTARIO
2 UNIV. OF TORONTO
3 ALASKA DEPT.
Name: org, dtype: object
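To make the difference concrete, here is a minimal sketch that rebuilds the sample data from the question and compares the approaches side by side. The variable names (stripped, by_regex, by_suffix, cleaned) are only illustrative, and the str.removesuffix line assumes pandas 1.4 or newer; on older versions the regex form is the one to use.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "some_id": [1, 2, 3, 4],
    "org": ["UNIVERSITY OF CONN. OF", "ONTARIO", "UNIV. OF TORONTO", "ALASKA DEPT.OF"],
})

# rstrip("OF") strips any trailing run of the characters "O" and "F",
# which is why "ONTARIO" loses its final "O".
stripped = df["org"].str.rstrip("OF")

# An anchored regex removes only a literal "OF" at the end of the string.
by_regex = df["org"].str.replace(r"OF$", "", regex=True)

# Assumes pandas >= 1.4: removesuffix removes the exact suffix, no regex needed.
by_suffix = df["org"].str.removesuffix("OF")

# Removing "OF" from the first row leaves a trailing space; strip whitespace if unwanted.
cleaned = by_regex.str.rstrip()

print(cleaned)
```

Both by_regex and by_suffix leave 'ONTARIO' and 'UNIV. OF TORONTO' untouched. The final rstrip() with no argument only removes trailing whitespace, so it is safe to chain.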