How to determine the required maxsize of lru_cache?

Question

If we are creating a recursive function like one that would return the Fibonacci sequence, and using the lru_cache .. What is the real governor of the max size parameter?

It is obvious that we only need the last two items when calculating each term.. but setting the maxsize to 2 and running the first 1000 calculation will take ages to finish.

I tried to use a cache dictionary that only contains two elements:

fib_cache = {0: 1, 1: 1}

def fib(n):
    if n == 1:
        val = 1
    elif n == 2:
        val = 1
    elif n > 2:
        val = fib_cache[0] + fib_cache[1]

    fib_cache[0] = fib_cache[1]
    fib_cache[1] = val
    return val

Then, I run a similar function with lru_cache:

from functools import lru_cache

@lru_cache(maxsize=3)
def fib1(n):
    if n == 1:
        val = 1
    elif n == 2:
        val = 1
    elif n > 2:
        val = fib1(n - 1) + fib1(n - 2)
    return val

I called the first 1000 calculation of each and the results are identical in terms of performance. However, I am not sure how to specify the maxsize parameter. I just found that, for this specific function, 2 would take ages and 3 works just fine. My guess is that it will be storing the result, here fib1(n), together with the last two items that were used to calculate it, fib1(n - 1) and fib1(n - 2), but why won't the result replace the oldest item immediately? Does fib1(n) take place in the cache memory before even being calculated? Is there a way to view the lru_cache element? Maybe this would be helpful.

mbh86 · Accepted Answer

You are right, only caching 2 values is enough for Fibonacci computation.

Your function doesn't work properly because recursive calls are not set in the good order. Add some print statements to your function and you will understand its behaviour.

from functools import lru_cache

@lru_cache(maxsize=2)
def fib(n):
    print(f'calling fib({n})')
    if n == 1:
        val = 1
    elif n == 2:
        val = 1
    elif n > 2:
        val = fib(n - 1) + fib(n - 2)
    print(f'fib({n}) is being computed')
    return val

fib(5)

# calling fib(5)
# calling fib(4)
# calling fib(3)
# calling fib(2)
# fib(2) is being computed
# calling fib(1)
# fib(1) is being computed
# fib(3) is being computed
# calling fib(2)
# fib(2) is being computed
# fib(4) is being computed
# calling fib(3)
# calling fib(1)
# fib(1) is being computed
# fib(3) is being computed
# fib(5) is being computed

What's happening here is when you compute from fib(4), it needs fib(3) and fib(2). But fib(3) needed fib(2) and then fib(1), so the 2 last calls were fib(3) and fib(1) so you need again to recompute fib(2).

So you should switch fib(n - 1) and fib(n - 2) to make it work:

@lru_cache(maxsize=2)
def fib(n):
    if n == 1:
        val = 1
    elif n == 2:
        val = 1
    elif n > 2:
        val = fib(n - 2) + fib(n - 1)  
    return val

How to determine the required maxsize of lru_cache?

Tags:

python

caching

memoization

Aziz

1 Answers

mbh86

Recent Activity

Donate For Us

How to determine the required maxsize of lru_cache?

Tags:

python

caching

memoization

Aziz

1 Answers

mbh86

Related questions

Recent Activity

Donate For Us