
Using replace and str.startswith() in a pandas dataframe to rename values

I have a column called source which contains a couple hundred rows of text. Some of these values can be grouped together, and I'm struggling to do that in a pandas dataframe. Here's my code:

df.source.replace({
    df.source.str.startswith('share', na=False): 'sharePet',
    df.source.str.startswith('2012-01-08', na=False): 'shareDate'
})

Also, will this work for the second line, which matches values starting with a date? If not, I can keep this approach for the first line and for the other groupings that are text.

Would love some advice.

user8322222 asked Sep 07 '25 03:09



1 Answer

You can use a dictionary and iterate:

d = {'share': 'sharePet', '2012-01-08': 'shareDate'}

for k, v in d.items():
    df.loc[df['source'].str.startswith(k, na=False), 'source'] = v
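To see the loop in action, here is a minimal sketch on made-up data. The sample values are assumptions, since the original frame isn't shown in the question:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's 'source' column.
df = pd.DataFrame(
    {"source": ["shareCat", "shareDog", "2012-01-08 10:00", "other", None]}
)

# Prefix -> replacement label, as in the answer above.
d = {"share": "sharePet", "2012-01-08": "shareDate"}

for prefix, label in d.items():
    # Boolean mask of rows whose text starts with the prefix;
    # na=False makes missing values count as "no match".
    mask = df["source"].str.startswith(prefix, na=False)
    # Overwrite only the matching rows.
    df.loc[mask, "source"] = label

result = df["source"].tolist()
print(result)
```

The two share... rows become sharePet, the date-like string becomes shareDate, and 'other' and the missing value are left untouched. Note that the keys are applied in order, so a replacement label that itself starts with a later key would be rewritten again.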

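Pandas str.startswith works only on strings: non-string values (real date objects, NaN) never match, and na=False turns those into False. So the date key only works if your dates are stored as text. You can easily check which types exist in your series via set(map(type, df['source'])).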
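A small sketch of that type check, assuming a hypothetical column that mixes text, a real datetime.date object and a missing value:

```python
import datetime as dt

import pandas as pd

# Hypothetical mixed-type column: two strings, one date object, one missing value.
df = pd.DataFrame(
    {"source": ["share1", dt.date(2012, 1, 8), "2012-01-08", None]}
)

# Collect the Python types actually present in the column.
types_present = set(map(type, df["source"]))
print(types_present)

# Only the string '2012-01-08' matches; the date object and the
# missing value are treated as non-matches because of na=False.
mask = df["source"].str.startswith("2012-01-08", na=False)
matches = mask.tolist()
print(matches)
```

If the set contains anything other than str, those rows will be skipped by the loop; one option is to convert them to text first, for example with astype(str), if that suits your data.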

jpp answered Sep 09 '25 21:09
