I noticed that on my machine, the following reaches the max recursion depth for n = 2960:
m = {0:0, 1:1}
def f(n):
if n not in m:
m[n] = f(n - 1) + f(n - 2)
return m[n]
while this version reaches it for n = 988:
m = {0:0, 1:1}
def f(n):
if n not in m:
m[n] = sum(f(n - i) for i in [1, 2])
return m[n]
Can anyone explain what happens "under the hood" that explains such difference?
More precisely, it would be very nice to understand why the factor is 3 in this example and be able to derive it for summations with more terms.
TL;DR: sum
is an extra function call, and the generator expression it is summing over is also implemented with a nested function scope (docs)
... the comprehension is executed in a separate implicitly nested scope. This ensures that names assigned to in the target list don’t “leak” into the enclosing scope.
Therefore the second way uses two extra stack frames during the recursion, as well as the recursive call itself, which is explaining the factor of 3 here.
Note that the default recursion limit is 1000, so you should really be seeing the stack overflow at exactly 1000 for the first case, and at 334 for the second case (on Python 3.10 or lower). To get 2960 and 988 here, you may have:
sys.setrecursionlimit(3000)
, andRunning this code within an IDE or a tricked-out interactive REPL, for example, will likely have done both of the above.
There is PEP 709 – Inlined comprehensions which is all about removing this hidden function from comprehensions. Generator expressions are currently not inlined in the reference implementation for the PEP, although it may be considered in future.
CPython 3.11 already has some optimizations to reduce the number of function calls in this code, which changes the point of stack overflow. It will happen at 501 instead of 334 in CPython 3.11. The reason is described in the changelog at What’s New In Python 3.11: Inlined Python function calls.
I'll add my debugging here as well. After running some tests, I found that the traceback
module was giving me a call stack that was significantly (~333) less than the recursion limit. I noticed that this difference was always close to the number of <genexpr>
invocations on the call stack:
import sys
import traceback
from collections import Counter
from pprint import pprint
sum_every_n = int(sys.argv[1])
def f(n=0):
try:
if n % sum_every_n == 0:
return sum(f(n+i) for i in [1,2])
else:
return f(n+1) + f(n+2)
except RecursionError:
stack = traceback.extract_stack()
counts = Counter([frame.name for frame in stack])
stack_size = len(stack)
stack_size_plus = stack_size + counts['<genexpr>']
pprint(counts)
print(f'rec limit: {sys.getrecursionlimit()}')
print(f'stack size: {stack_size}')
print(f'adj stack size: {stack_size_plus}')
sys.exit()
f()
Here are the results for some values of sum_every_n
:
$ ./rec3.py 1
Counter({'f': 333, '<genexpr>': 332, '<module>': 1})
rec limit: 1000
stack size: 666
adj stack size: 998
$ ./rec3.py 2
Counter({'f': 499, '<genexpr>': 249, '<module>': 1})
rec limit: 1000
stack size: 749
adj stack size: 998
$ ./rec3.py 3
Counter({'f': 599, '<genexpr>': 200, '<module>': 1})
rec limit: 1000
stack size: 800
adj stack size: 1000
$ ./rec3.py 4
Counter({'f': 665, '<genexpr>': 166, '<module>': 1})
rec limit: 1000
stack size: 832
adj stack size: 998
It seems that the generator is indeed adding a function call to the stack, but also that each generator expression counts as two functions on the call stack. This explains the ratio in your original question. Not sure why this is the case though, and I welcome any possible explanations!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With