
String column manipulation in a pandas DataFrame

I have a string column (Time) in a data frame like the one below. I want to replace the hyphen between the digits with an underscore and remove the word "months".

Time
2- 3 months          
1- 2 months          
10-11 months          
4- 5 months
Desired output:
2_3           
1_2           
10_11           
4_5 

Here is what I am trying, but it does not seem to work.

def func(string):
    a_new_string =string.replace('- ','_')
    a_new_string1 =a_new_string.replace('-','_')
    a_new_string2= a_new_string1.rstrip(' months')
    return a_new_string2

I then apply the function to the data frame:

df['Time'].apply(func)
asked Oct 15 '25 06:10 by Alph


1 Answer

One option is to use three str.replace calls:

In [18]:

df['Time'] = df['Time'].str.replace('- ', '_')
df['Time'] = df['Time'].str.replace('-', '_')
df['Time'] = df['Time'].str.replace(' months', '')
df
Out[18]:
    Time
0    2_3
1    1_2
2  10_11
3    4_5
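
For anyone who wants to reproduce this, here is a self-contained sketch. The sample frame is rebuilt from the values in the question, and the regular expression `\s*-\s*` is my own shortcut (not part of the answer above) that folds the first two replacements into one; the trailing `.str.strip()` is there only on the assumption that the real data may carry trailing spaces.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame(
    {"Time": ["2- 3 months", "1- 2 months", "10-11 months", "4- 5 months"]}
)

df["Time"] = (
    df["Time"]
    # Hyphen plus any whitespace around it becomes a single underscore.
    .str.replace(r"\s*-\s*", "_", regex=True)
    # Drop the literal " months" text; regex=False treats it as a plain string.
    .str.replace(" months", "", regex=False)
    # Remove any leftover leading or trailing whitespace.
    .str.strip()
)

print(df)
```

Passing `regex=` explicitly is a deliberate choice here, since the default for `Series.str.replace` has differed between pandas versions.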

I think your problem may be that you're not assigning the result of your apply back to the column:

In [21]:

def func(string):
    a_new_string =string.replace('- ','_')
    a_new_string1 =a_new_string.replace('-','_')
    a_new_string2= a_new_string1.rstrip(' months')
    return a_new_string2

df['Time'] = df['Time'].apply(func)
df
Out[21]:
    Time
0    2_3
1    1_2
2  10_11
3    4_5
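
To make the point about assignment concrete, here is a small sketch showing that `apply` returns a new Series and leaves the original column alone until you assign the result back. The function body is the same as above; the two-row frame and the names `result` and `before` are just for illustration.

```python
import pandas as pd

def func(string):
    # Same logic as the function in the question.
    return string.replace("- ", "_").replace("-", "_").rstrip(" months")

# Two-row sample taken from the question's data.
df = pd.DataFrame({"Time": ["2- 3 months", "10-11 months"]})

# apply() builds and returns a new Series; df is not modified here.
result = df["Time"].apply(func)
before = df["Time"].tolist()

# Only this assignment changes the column in the data frame.
df["Time"] = result
after = df["Time"].tolist()

print(before)
print(after)
```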

You could also make this a one-liner:

In [25]:

def func(string):
    return string.replace('- ','_').replace('-','_').rstrip(' months')

df['Time'] = df['Time'].apply(func)
df
Out[25]:
    Time
0    2_3
1    1_2
2  10_11
3    4_5
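
One caveat worth knowing: `rstrip(' months')` does not remove the suffix " months". It strips any trailing run of the characters space, m, o, n, t, h and s. That happens to give the right result for this data because each value ends in a digit before " months", but it can eat too much on other input. Below is a minimal sketch of a stricter alternative; `strip_months` and the "totals months" string are hypothetical names and data I made up to show the difference.

```python
def strip_months(text):
    # Remove the exact suffix " months" only when the string ends with it.
    suffix = " months"
    if text.endswith(suffix):
        return text[: -len(suffix)]
    return text

# On the question's data both approaches agree.
same_rstrip = "10_11 months".rstrip(" months")
same_suffix = strip_months("10_11 months")

# On made-up input, rstrip also removes the trailing "s" of "totals",
# because "s" is one of the characters in the strip set.
greedy = "totals months".rstrip(" months")
exact = strip_months("totals months")

print(same_rstrip, same_suffix)
print(greedy, exact)
```

On Python 3.9 or later, `str.removesuffix(' months')` does the same job as the helper.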
answered Oct 16 '25 20:10 by EdChum


