Python - Split string into characters while excluding a certain substring

I'm trying to split up a string of characters into a list while excluding certain substrings.

For example:

>>> sentences = '<s>I like dogs.</s><s>It\'s Monday today.</s>'
>>> substring1 = '<s>'
>>> substring2 = '</s>'
>>> print(split_string(sentences))
['<s>', 'I', ' ', 'l', 'i', 'k', 'e', ' ', 'd', 'o', 'g', 's', 
'.', '</s>', '<s>', 'I', 't', "'", 's', ' ', 'M', 'o', 'n', 'd',
'a', 'y', ' ', 't', 'o', 'd', 'a', 'y', '.', '</s>']

As you can see, the string is split up into characters, except for the listed substrings. How can I do this in Python?

asked Oct 27 '25 06:10 by AlexT


1 Answer

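You could use re.findall for this. :) Put the tags first in the alternation so that each one is matched whole, then let . pick up any other single character. Note that . does not match a newline unless you pass re.DOTALL.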

import re

sentences = '<s>I like dogs.</s><s>It\'s Monday today.</s>'
# Tags first, then any single character; '/' needs no escaping in Python.
print(re.findall(r'</?s>|.', sentences))

OUTPUT

['<s>', 'I', ' ', 'l', 'i', 'k', 'e', ' ', 'd', 'o', 'g', 's', '.', '</s>', '<s>', 'I', 't', "'", 's', ' ', 'M', 'o', 'n', 'd', 'a', 'y', ' ', 't', 'o', 'd', 'a', 'y', '.', '</s>']
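
If the substrings to keep are not known in advance, the same idea can be generalized. The following is only a sketch: the `split_string` name comes from the question, while the `keep` parameter and its default value are assumptions made for illustration. It escapes each substring with `re.escape`, tries longer substrings first so that a shorter one cannot shadow them, and passes `re.DOTALL` so that newlines are also returned as characters.

```python
import re

def split_string(text, keep=("<s>", "</s>")):
    """Split text into single characters, keeping each substring in `keep` whole.

    `keep` is a hypothetical parameter introduced for this sketch.
    """
    # Drop empty strings (they would produce empty matches) and try longer
    # substrings first so a shorter prefix does not win the alternation.
    parts = sorted((s for s in keep if s), key=len, reverse=True)
    # re.escape makes regex metacharacters in the substrings literal.
    alternatives = [re.escape(s) for s in parts]
    # Fall back to "any single character" when no substring matches.
    alternatives.append(".")
    pattern = "|".join(alternatives)
    # re.DOTALL lets "." match newlines as well.
    return re.findall(pattern, text, flags=re.DOTALL)

sentences = "<s>I like dogs.</s><s>It's Monday today.</s>"
print(split_string(sentences))
```

With the default `keep`, this should produce the same list as the one-line pattern above; other markers can be passed in, e.g. `split_string(text, keep=("<p>", "</p>"))`.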
answered Oct 28 '25 22:10 by Albin Paul


