
Python/Pandas remove specific string from ending

Tags: python, pandas

I am trying to remove a trailing 'OF' from a column in a pandas DataFrame. I tried 'rstrip' and 'split', but rstrip also removes a lone trailing 'O' or 'F', and I only need to remove 'OF'. How can I do that? I'm not sure why rstrip removes 'O' and 'F' when I specifically passed 'OF'. Sorry if this has been asked before; I couldn't find it. Thanks.

Sample Data:

import pandas as pd

l1 = [1,2,3,4]
l2 = ['UNIVERSITY OF CONN. OF','ONTARIO','UNIV. OF TORONTO','ALASKA DEPT.OF']
df = pd.DataFrame({'some_id':l1,'org':l2})
df

some_id org
1       UNIVERSITY OF CONN. OF
2       ONTARIO
3       UNIV. OF TORONTO
4       ALASKA DEPT.OF

Tried:

df.org.str.rstrip('OF')
# df.org.str.split('OF')[0] # Not what I am looking for

Results:

0    UNIVERSITY OF CONN. # works
1                  ONTARI # 'O' was removed
2         UNIV. OF TORONT # 'O' was removed
3            ALASKA DEPT. # works

Final output needed:

0    UNIVERSITY OF CONN. 
1                  ONTARIO
2         UNIV. OF TORONTO
3            ALASKA DEPT.
asked Mar 17 '26 19:03 by sharp


1 Answer

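You can use a regex anchored at the end of the string:

df.org = df.org.str.replace(r'OF$', '', regex=True)

where $ matches the end of the string. Pass regex=True explicitly: since pandas 2.0 the pattern is treated as a literal string by default, so without it nothing would be replaced.

Note that

df.org.str.rstrip('(OF)')

does not work as a suffix remover. rstrip treats its argument as a set of characters, not a substring, so it strips any trailing run of '(', 'O', 'F' and ')'. That is why 'ONTARIO' loses its final 'O'.

Output of the regex replacement: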

0    UNIVERSITY OF CONN. 
1                 ONTARIO
2        UNIV. OF TORONTO
3            ALASKA DEPT.
Name: org, dtype: object
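
To make the difference concrete, here is a minimal sketch that rebuilds the sample data from the question and compares the approaches side by side. The variable names stripped, by_regex and by_suffix are only illustrative, and the str.removesuffix call assumes pandas 1.4 or later; on older versions the regex form is the one to use.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'some_id': [1, 2, 3, 4],
    'org': ['UNIVERSITY OF CONN. OF', 'ONTARIO',
            'UNIV. OF TORONTO', 'ALASKA DEPT.OF'],
})

# rstrip treats 'OF' as the character set {'O', 'F'} and strips any
# trailing run of those characters, not just the exact suffix 'OF'.
stripped = df['org'].str.rstrip('OF')

# Regex anchored with $ so only an 'OF' at the very end is removed.
# regex=True is passed explicitly because the default differs
# between pandas versions.
by_regex = df['org'].str.replace(r'OF$', '', regex=True)

# Literal suffix removal, no regex involved (assumes pandas >= 1.4).
by_suffix = df['org'].str.removesuffix('OF')

print(stripped.tolist())
print(by_regex.tolist())
print(by_suffix.tolist())
```

Both the regex and the suffix version leave the trailing space in 'UNIVERSITY OF CONN. '; chaining .str.rstrip() with no argument afterwards would remove it if that matters.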
answered Mar 19 '26 14:03 by Quang Hoang