I'm trying to use a regex to clean some data before I insert the items into the database. I haven't been able to solve the issue of removing trailing special characters at the end of my strings.
How do I write this regex to only remove trailing special characters?
import re
strings = ['string01_','str_ing02_^','string03_@_', 'string04_1', 'string05_a_']
for item in strings:
clean_this = (re.sub(r'([_+!@#$?^])', '', item))
print (clean_this)
outputs this:
string01 # correct
string02 # incorrect because it remove _ in the string
string03 # correct
string041 # incorrect because it remove _ in the string
string05a # incorrect because it remove _ in the string and not just the trailing _
You could also use the special purpose rstrip method of strings
[s.rstrip('_+!@#$?^') for s in strings]
# ['string01', 'str_ing02', 'string03', 'string04_1', 'string05_a']
You could repeat the character class 1+ times or else only 1 special character would be replaced. Then assert the end of the string $. Note that you don't need the capturing group around the character class:
[_+!@#$?^]+$
For example:
import re
strings = ['string01_','str_ing02_^','string03_@_', 'string04_1', 'string05_a_']
for item in strings:
clean_this = (re.sub(r'[_+!@#$?^]+$', '', item))
print (clean_this)
See the Regex demo | Python demo
If you also want to remove whitespace characters at the end you could add \s to the character class:
[_+!@#$?^\s]+$
Regex demo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With