Suppose we have the following list in Python:
sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
How do I split this into sublists, each ending with a full stop? I want to get the following elements in my new list:
["I","am","good","."]
["I","like","you","."]
["we","are","not","friends","."]
My attempt so far:
cleaned_sentence = []
a = 0
while a < len(sentence):
    current_word = sentence[a]
    if current_word == "." and len(cleaned_sentence) == 0:
        cleaned_sentence.append(sentence[0:sentence.index(".")+1])
        a += 1
    elif current_word == "." and len(cleaned_sentence) > 0:
        sub_list = sentence[sentence.index(".")+1:-1]
        sub_list.append(sentence[-1])
        cleaned_sentence.append(sub_list[0:sentence.index(".")+1])
        a += 1
    else:
        a += 1
for each in cleaned_sentence:
    print(each)
Running this on sentence produces the following, with the second sentence repeated and the third missing:
['I', 'am', 'good', '.']
['I', 'like', 'you', '.']
['I', 'like', 'you', '.']
You can use itertools.groupby:
from itertools import groupby
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip(i, i)])
This outputs:
[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]
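To see why the pairing works, it may help to look at the intermediate runs that groupby produces. The sketch below is only an illustration (the name runs is not part of the answer's code): the key '.'.__ne__ is True for ordinary words and False for ".", so the list is cut into alternating runs of words and periods, and zip(i, i) then pulls two consecutive runs from the same generator and joins them.

```python
from itertools import groupby

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

# The key is True for ordinary words and False for ".", so consecutive
# words form one run and each "." forms a run of its own.
runs = [(is_word, list(group)) for is_word, group in groupby(sentence, key='.'.__ne__)]

for is_word, run in runs:
    print(is_word, run)
```

Note that this pairing assumes the list starts with a word and never has two periods in a row; adjacent periods would be merged into a single run such as ['.', '.'].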
If your list doesn't always end with '.' then you can use itertools.zip_longest instead:
from itertools import groupby, zip_longest
sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends"]
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip_longest(i, i, fillvalue=[])])
This outputs:
[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends']]
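If the groupby one-liner feels too dense, a plain loop is another option. The following is a minimal sketch rather than part of the answer above; the function name split_sentences and its delimiter parameter are hypothetical names chosen for illustration. It collects words into a buffer and closes the buffer each time it meets the delimiter, so it handles both a missing trailing period and consecutive periods.

```python
def split_sentences(words, delimiter="."):
    """Split a flat list of tokens into sublists that each end with delimiter."""
    sentences = []
    current = []
    for word in words:
        current.append(word)
        if word == delimiter:
            # The delimiter closes the current sentence.
            sentences.append(current)
            current = []
    if current:
        # Keep any trailing words that were not closed by a delimiter.
        sentences.append(current)
    return sentences


sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
result = split_sentences(sentence)
for each in result:
    print(each)

# Same input without the final period: the last sublist simply lacks one.
unterminated = split_sentences(sentence[:-1])
```

Unlike the zip-based pairing, this version treats a period that directly follows another period as its own one-element sublist.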