I have the following list, which contains only two characters, 'N' and 'C':
ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
I want to extract each run of consecutive 'C's and return their indices in the list.
This should yield something like:
chunk1 = [('C', 'C', 'C', 'C'), [3,4,5,6]]
chunk2 = [('C', 'C'), [8,9]]
# and when there is no 'C', it returns an empty list.
How can I achieve that in Python?
I tried this, but it didn't do what I hoped:
from itertools import groupby
from operator import itemgetter
tmp = (list(g) for k, g in groupby(enumerate(ls), itemgetter(1)) if k == 'C')
zip(*tmp)
Move the zip(*...) inside the list comprehension:
import itertools as IT
import operator
ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
[list(zip(*g))[::-1]
for k, g in IT.groupby(enumerate(ls), operator.itemgetter(1))
if k == 'C']
yields
[[('C', 'C', 'C', 'C'), (3, 4, 5, 6)], [('C', 'C'), (8, 9)]]
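To see why this works, here is a rough step-by-step sketch of what happens to the first run of 'C'. The names first_c_run, transposed, and chunk are only illustrative and are not part of the solution above.

```python
import itertools as IT
import operator

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

# Take the first run of 'C' as a list of (index, value) pairs.
first_c_run = next(
    list(g)
    for k, g in IT.groupby(enumerate(ls), operator.itemgetter(1))
    if k == 'C'
)
# first_c_run == [(3, 'C'), (4, 'C'), (5, 'C'), (6, 'C')]

# zip(*pairs) transposes the pairs into (indices, values).
transposed = list(zip(*first_c_run))
# transposed == [(3, 4, 5, 6), ('C', 'C', 'C', 'C')]

# [::-1] reverses the order so that the values come first.
chunk = transposed[::-1]
# chunk == [('C', 'C', 'C', 'C'), (3, 4, 5, 6)]
```

Because the transpose is applied to each group separately, every run keeps its own pair of tuples.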
In Python 2, zip returns a list, so list(zip(...)) could be replaced by zip(...). In Python 3, zip returns an iterator, which cannot be sliced with [::-1], so list(zip(...)) is required there. Using list(zip(...)) keeps the solution compatible with both Python 2 and Python 3.
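If this is needed in more than one place, the same idea could be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: the function name find_runs and its target parameter are hypothetical, and the indices are returned as a list to match the format requested in the question.

```python
from itertools import groupby
from operator import itemgetter


def find_runs(items, target='C'):
    """Return [values, indices] for each consecutive run of `target`.

    `find_runs` and `target` are illustrative names, not an existing API.
    """
    runs = []
    # Group (index, value) pairs by value; each group is one consecutive run.
    for key, group in groupby(enumerate(items), key=itemgetter(1)):
        if key == target:
            # A group is never empty, so it always unzips into two tuples.
            indices, values = zip(*group)
            runs.append([values, list(indices)])
    return runs


ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
result = find_runs(ls)
# result == [[('C', 'C', 'C', 'C'), [3, 4, 5, 6]], [('C', 'C'), [8, 9]]]

no_match = find_runs(['N', 'N', 'N'])
# no_match == []
```

When the list contains no 'C', no group passes the key check, so the helper returns an empty list.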