
Why does creating a variable name for an exception raised in a Python function affect the reference count of an input variable to that function?

I've defined two simple Python functions that take a single argument, raise an exception, and handle the raised exception. One function uses a variable to refer to the exception before raising/handling, the other does not:

def refcount_unchanged(x):
    try:
        raise Exception()
    except:
        pass

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    except:
        pass

One of these functions increases Python's reference count for its input argument; the other does not:

import sys

a = []
print(sys.getrefcount(a))
for i in range(3):
    refcount_unchanged(a)
    print(sys.getrefcount(a))
# prints: 2, 2, 2, 2

b = []
print(sys.getrefcount(b))
for i in range(3):
    refcount_increases(b)
    print(sys.getrefcount(b))
# prints: 2, 3, 4, 5

Can anyone explain why this happens?

Andrew asked Oct 29 '25 13:10


2 Answers

It is a side effect of the "exception -> traceback -> stack frame -> exception" reference cycle created by the __traceback__ attribute on exception instances. That attribute was proposed in PEP 344 and adopted in Python 3.0 as PEP 3134. PEP 3110 (also Python 3.0) removes the cycle for names bound by an except clause.

In refcount_increases, the reference cycle can be observed by printing this:

except:
    print(e.__traceback__.tb_frame.f_locals)  # {'x': [], 'e': Exception()}

which shows that x is also referenced in the frame's locals.
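The same cycle can be checked without touching f_locals. The sketch below is only an illustration: the function name cycle_points_back_to_frame is hypothetical, and sys._getframe() is a CPython implementation detail. It compares the frame recorded on the traceback with the running frame and confirms that both x and e are locals of that frame.

```python
import sys

def cycle_points_back_to_frame(x):
    e = Exception()
    try:
        raise e
    except Exception:
        # The traceback attached to e records the frame that raised it,
        # which is this very frame, and this frame holds e and x as locals.
        same_frame = e.__traceback__.tb_frame is sys._getframe()
        names = sys._getframe().f_code.co_varnames
        return same_frame, "x" in names, "e" in names

result = cycle_points_back_to_frame([])
print(result)  # (True, True, True)
```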

The reference cycle is broken the next time the cyclic garbage collector runs, either automatically or through an explicit gc.collect() call.
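One practical consequence is that the argument stays alive until that collection happens, even after the caller drops its own reference. The following sketch assumes CPython; Payload is a hypothetical stand-in for the argument, used because plain lists cannot be weakly referenced.

```python
import gc
import weakref

class Payload:
    """Hypothetical argument type that supports weak references."""

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    except:
        pass

gc.disable()  # keep automatic collection from interfering with the demo
try:
    gc.collect()
    p = Payload()
    ref = weakref.ref(p)
    refcount_increases(p)
    del p
    alive_before = ref() is not None  # the leftover frame still holds the object
    gc.collect()
    alive_after = ref() is not None   # the cycle is gone, and the object with it
finally:
    gc.enable()

print(alive_before, alive_after)  # True False
```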

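In refcount_unchanged, the exception is never bound to a local variable, so the frame holds no reference back to it and no cycle forms; the exception is freed as soon as the handler exits. Python 3 gives the same guarantee to handlers that do bind a name: as per PEP-3110's Semantic Changes, it generates additional bytecode to delete the target when the handler finishes. A handler such as:

def with_target(x):
    try:
        raise Exception()
    except Exception as e:
        pass

gets translated to something like:

def with_target(x):
    try:
        raise Exception()
    except Exception as e:
        try:
            pass
        finally:
            e = None
            del e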
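
The implicit deletion can be observed from inside Python: once a handler with a target finishes, the name is no longer bound. A minimal sketch, with a hypothetical function name:

```python
def name_is_unbound_after_handler():
    try:
        raise ValueError("boom")
    except ValueError as e:
        pass
    try:
        e  # the implicit "del e" has already run at this point
    except UnboundLocalError:
        return True
    return False

print(name_is_unbound_after_handler())  # True
```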

Resolving the reference cycle in refcount_increases

While not necessary (since the garbage collector will do its job), you can do something similar in refcount_increases by manually deleting the variable reference:

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    except:
        pass
    finally:   # +
        del e  # +

Alternatively, you can overwrite the variable reference and let the implicit deletion work:

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    # except:               # -
    except Exception as e:  # +
        pass
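
Both variants above can be checked against the original function. The sketch below assumes CPython; the helper refcount_delta and the names fixed_with_del and fixed_with_rebinding are hypothetical. Automatic collection is disabled during the measurement so that only the functions themselves affect the count.

```python
import gc
import sys

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    except:
        pass

def fixed_with_del(x):
    e = Exception()
    try:
        raise e
    except:
        pass
    finally:
        del e

def fixed_with_rebinding(x):
    e = Exception()
    try:
        raise e
    except Exception as e:
        pass

def refcount_delta(func, calls=3):
    """Return how much the argument's refcount grew after `calls` calls."""
    obj = []
    gc.disable()
    try:
        before = sys.getrefcount(obj)
        for _ in range(calls):
            func(obj)
        return sys.getrefcount(obj) - before
    finally:
        gc.enable()

print(refcount_delta(refcount_increases))    # 3
print(refcount_delta(fixed_with_del))        # 0
print(refcount_delta(fixed_with_rebinding))  # 0
```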

A little more about the reference cycle

The exception e and other local variables are actually referenced directly by e.__traceback__.tb_frame, presumably in C code.

This can be observed by printing this:

import gc
import sys

print(sys.getrefcount(b))
print(gc.get_referrers(b)[0])  # <frame at ...>
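
Since the order of the list returned by gc.get_referrers is not something to rely on, a slightly more robust check filters the referrers by type. This is a sketch assuming CPython, where the leftover frame is tracked by the garbage collector; the variable name leftover_frames is hypothetical.

```python
import gc
import types

def refcount_increases(x):
    e = Exception()
    try:
        raise e
    except:
        pass

gc.disable()  # avoid an automatic collection between the call and the check
try:
    b = []
    refcount_increases(b)
    # Keep only referrers that are frames created for refcount_increases.
    leftover_frames = [
        r for r in gc.get_referrers(b)
        if isinstance(r, types.FrameType)
        and r.f_code is refcount_increases.__code__
    ]
finally:
    gc.enable()

print(len(leftover_frames))  # 1
```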

Accessing e.__traceback__.tb_frame.f_locals creates a dictionary cached on the frame (another reference cycle) and thwarts the proactive resolutions above.

print(sys.getrefcount(b))
print(gc.get_referrers(b)[0])  # {'x': [], 'e': Exception()}

However, this reference cycle will also be handled by the garbage collector.

aaron answered Oct 31 '25 03:10


It seems that writing out the question helped me realize part of the answer. If I garbage-collect after each call to refcount_increases, the refcount no longer increases. Interesting! I don't think this is a complete answer to my question, but it's certainly suggestive. Any further information would be welcome.

import gc
c = []
print(sys.getrefcount(c))
for i in range(3):
    refcount_increases(c)
    gc.collect()
    print(sys.getrefcount(c))
# prints: 2, 2, 2, 2

Andrew answered Oct 31 '25 04:10


