I'm seeing some odd behavior that I can't explain when dynamically nesting generator expressions in Python 3, where each generator expression calls a function that is referenced dynamically.
Here is a very simplified case reproducing the problem:
double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]
data = range(3)
for proc in processors:
    data = (proc(i) for i in data)
result = list(data)
print(result)  # prints [0, 9, 18]
assert result == [0, 6, 12]  # fails: triple is applied twice
In this case, I expected each number to be multiplied by 6 (triple(double(x))) but in reality triple(triple(x)) is called. It's more or less clear to me that proc points to triple when the generator expression is run, regardless of what it pointed to when the generator expression was created.
So: (1) Is this expected, and can someone point to relevant information in the Python docs or elsewhere that explains it?
(2) Can you recommend another way of nesting generator expressions, where each level calls a dynamically provided callable?
EDIT: I am seeing this on Python 3.8.x; I haven't tested other versions.
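This is a result of two things: generator expressions are evaluated lazily, so proc(i) is not called until the generator is consumed; and the name proc inside the expression is looked up each time it is used, not captured when the expression is created.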
So at the time you consume the generator with list(data), the name proc refers to the function triple, and both generators call whatever function is bound to the name proc, so triple is applied twice.
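The late lookup described above can be seen in isolation with a smaller sketch. The name scale is hypothetical and not part of the question; it only stands in for proc.

```python
# Minimal sketch of late name lookup in a lazy generator expression.
# "scale" is a hypothetical name used only for illustration.
scale = lambda x: x * 2

# Creating the generator computes nothing yet; only range(3) is evaluated now.
gen = (scale(i) for i in range(3))

# Rebind the name before the generator is consumed.
scale = lambda x: x * 3

# scale is looked up when each item is produced, so the x * 3 version runs.
late_result = list(gen)
print(late_result)  # [0, 3, 6]
```

The generator never sees the original x * 2 function, which is the same thing that happens to double in the question.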
map works because it is an ordinary function call: when you pass proc as an argument, map receives the value proc has at the time of the call, which is inside the loop, while proc still refers to double on the first iteration.
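For reference, here is a sketch of the question's loop rewritten with map. It reuses the names from the question; mapped_result is a new name introduced only so the output can be checked.

```python
double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]

data = range(3)
for proc in processors:
    # map receives the current function object as an argument, so
    # rebinding the name proc on the next iteration does not affect it.
    data = map(proc, data)

mapped_result = list(data)
print(mapped_result)  # [0, 6, 12]
```

The pipeline is still lazy: nothing is computed until list(data) consumes the outermost map object.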
Yes, it's expected, and you got the reason right.
As generators are lazy, proc(i) is evaluated only when a value is requested, and that means looking up both proc and i at that moment. By the time you finally request values, proc is already triple, so that's what gets used.
In this particular case, data = map(proc, data) does the job. It works because map captures the value of proc as it was when you called map.
You could do the same with a generator function. I tried with a generator expression like
data = (p(i) for p in [proc] for i in data)
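but it failed with ValueError: generator already executing, because only the iterable of the first for clause is evaluated immediately; the inner data is looked up lazily and by then refers to the generator itself. This worked, though: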
data = (lambda proc: (proc(i) for i in data))(proc)
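The generator function variant mentioned above might look like the following sketch. The helper name apply_lazily and the variable chained_result are hypothetical, chosen only for illustration. Calling a generator function binds its arguments immediately, even though its body does not run until the generator is iterated.

```python
def apply_lazily(func, items):
    """Yield func(item) for each item in items.

    func and items are bound when apply_lazily is called,
    not when the resulting generator is consumed.
    """
    for item in items:
        yield func(item)


double = lambda x: x * 2
triple = lambda x: x * 3
processors = [double, triple]

data = range(3)
for proc in processors:
    # Each call captures the current proc and the current data as arguments.
    data = apply_lazily(proc, data)

chained_result = list(data)
print(chained_result)  # [0, 6, 12]
```

This does the same job as the lambda wrapper, with the captured values made explicit as parameters.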