
python/pandas: using regular expressions to remove anything in square brackets from a string

Tags: python, pandas

I'm working with a pandas DataFrame and trying to sanitize a column, turning values like $12,342 into 12342 so the column can be converted to int or float. One row contains 736[4], though, so I also have to remove everything within the square brackets, brackets included.

Code so far

df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace('$','')
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(',','')
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(' ','')

The line below is supposed to remove the square brackets and, intentionally, their contents too.

df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'[[^]]*\)','')

To some devs this is trivial, but I haven't used regular expressions often enough to know how to do this. I've also searched around and put together the line above from one such Stack Overflow example.

Sam B. asked Oct 18 '25 22:10


1 Answer

I think you need to escape the square brackets and match lazily between them:

import pandas as pd

# Sample data with bracketed notes in different positions
df2 = pd.DataFrame({'Average Monthly Wage $': ['736[4]','7336[445]', '[4]345[5]']})
print(df2)
  Average Monthly Wage $
0                 736[4]
1              7336[445]
2              [4]345[5]

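# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'\[.*?\]', '', regex=True)
print(df2)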
  Average Monthly Wage $
0                    736
1                   7336
2                    345
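If the end goal is a numeric column, the separate cleanup steps from the question can probably be folded into one pass. The following is only a sketch under assumptions: the sample values are hypothetical (not the asker's real data), and it assumes every value is a whole number once the symbols are stripped.

```python
import pandas as pd

col = 'Average Monthly Wage $'

# Hypothetical sample values, not taken from the asker's data
df2 = pd.DataFrame({col: ['$12,342', '736[4]', ' 1,050 [12]']})

# \[.*?\]  matches a bracketed note such as [4], brackets included
# [$,\s]   matches a dollar sign, a comma, or whitespace
#          ($ is a literal character inside a character class)
cleaned = df2[col].str.replace(r'\[.*?\]|[$,\s]', '', regex=True)

# Convert the cleaned strings to numbers
df2[col] = pd.to_numeric(cleaned)
print(df2)
```

Passing regex=True explicitly keeps the behaviour the same across pandas versions. If some rows might still hold non-numeric text after cleaning, pd.to_numeric(cleaned, errors='coerce') would turn those into NaN instead of raising.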


jezrael answered Oct 20 '25 12:10


