Let's say I have a large list of data that I want to perform some operation on, and I would like to have multiple iterators performing this operation independently.
data = [1,2,3,4,5]
generator = ((e, 2*e) for e in data)
it1 = iter(generator)
it2 = iter(generator)
I would expect these iterators to be different objects, but it1 is it2 returns True. More confusingly, the same holds for the following generators:
import copy
# copied data
gen = ((e, 2*e) for e in copy.deepcopy(data))
# temp object
gen = ((e, 2*e) for e in [1,2,3,4,5])
This means in practice that when I call next(it1), it2 is incremented as well, which is not the behavior I want.
What is going on here, and is there any way to do what I'm trying to do? I am using Python 2.7 on Ubuntu 14.04.
Edit:
I just tried out the following as well:
gen = (e for e in [1,2,3,4,5])
it = iter(gen)
next(it)
next(it)
for e in gen:
    print e
Which prints 3 4 5. Apparently generators are a more constrained concept than I had imagined.
Generators are iterators. All well-behaved iterators have an __iter__ method that should simply return self.
From the docs:
The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:
iterator.__iter__(): Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
iterator.__next__(): Return the next item from the container. If there are no further items, raise the StopIteration exception. This method corresponds to the tp_iternext slot of the type structure for Python objects in the Python/C API.
So, consider another example of an iterator:
>>> x = [1, 2, 3, 4, 5]
>>> it = iter(x)
>>> it2 = iter(it)
>>> next(it)
1
>>> next(it2)
2
>>> it is it2
True
So, again, a list is iterable because it has an __iter__ method that returns an iterator. That iterator also has an __iter__ method, which should always return the iterator itself, and in addition it has a __next__ method (spelled next in Python 2).
So, consider:
>>> x = [1, 2, 3, 4, 5]
>>> it = iter(x)
>>> hasattr(x, '__iter__')
True
>>> hasattr(x, '__next__')
False
>>> hasattr(it, '__iter__')
True
>>> hasattr(it, '__next__')
True
>>> next(it)
1
>>> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
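To make the protocol concrete, here is a minimal sketch of a hand-written iterator. The class name Countdown and its behavior are hypothetical and chosen only for illustration; the point is that __iter__ returns self, so calling iter() on the iterator hands back the same object and both names share one position.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so iter(it) is it.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks up next() instead of __next__(); alias it for both.
    next = __next__


it = Countdown(3)
same = iter(it)      # the very same object, not a fresh iterator
first = next(it)     # 3
second = next(same)  # 2, because both names share one position
rest = list(it)      # [1]
```

Because same is it, advancing one name advances the other, which is the behavior seen with it1 and it2 in the question.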
And for a generator:
>>> g = (x**2 for x in range(10))
>>> g
<generator object <genexpr> at 0x104104390>
>>> hasattr(g, '__iter__')
True
>>> hasattr(g, '__next__')
True
>>> next(g)
0
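As a small check of the same idea for generators, the sketch below shows that iter(g) returns g itself, so anything that iterates over g afterwards continues from wherever the previous next() call stopped. The variable names are only for illustration.

```python
g = (x**2 for x in range(10))

same_object = iter(g) is g  # True: a generator is its own iterator
first = next(g)             # 0
second = next(iter(g))      # 1, iter(g) did not restart anything
remaining = list(g)         # everything that has not been consumed yet
```

Here remaining is [4, 9, 16, 25, 36, 49, 64, 81], which mirrors the 3 4 5 output from the edit in the question.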
You are using generator expressions, which create a single generator object. You can use a generator function instead: every call to it creates a fresh generator. The most straightforward way to accomplish what you are doing is:
def paired(data):
    for e in data:
        yield (e, 2*e)
Then use:
it1 = paired(data)
it2 = paired(data)
In this case, it1 and it2 are two separate iterator objects.
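A quick sketch to confirm the independence, using the same paired function and data as above. As a possible alternative that is not part of the answer itself, itertools.tee from the standard library can split one existing iterator into several; it buffers items internally, and the original iterator should not be advanced once it has been passed to tee.

```python
import itertools


def paired(data):
    for e in data:
        yield (e, 2 * e)


data = [1, 2, 3, 4, 5]

# Each call to the generator function returns a new generator object.
it1 = paired(data)
it2 = paired(data)
a = next(it1)  # (1, 2)
b = next(it1)  # (2, 4)
c = next(it2)  # (1, 2), it2 was not advanced by the calls on it1

# Alternative: split a single generator into two independent iterators.
t1, t2 = itertools.tee(paired(data), 2)
d = next(t1)   # (1, 2)
e2 = next(t1)  # (2, 4)
f = next(t2)   # (1, 2)
```

Calling paired(data) twice walks the list twice, while tee walks it once and stores the items one branch has consumed until the other branch catches up, so which one fits better depends on how far apart the iterators drift.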