How does Python know somebody was looping over a dict?

Tags: python, list

If somebody tries:

my_dict = {1: 1}
for key in my_dict:
    my_dict.pop(key)

one will get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

Python raises an error because you changed the size of the dict while looping over it.

How does Python know that this has happened, and can this check be overridden programmatically so that the code runs?

And before somebody asks the inevitable question of "why I would want to do this": I don't. I'm asking a question. It's called curiosity.

FOR EXAMPLE:

Say I have a dict with 5 items. The above code should simply delete all items in the dict!

asked Mar 02 '26 19:03 by michael


2 Answers

If you search through the Python source code for "dictionary changed size during iteration", you'll find Objects/dictobject.c:

static PyObject*
dictiter_iternextkey(dictiterobject *di)
{
    /* ... omitted ... */

    if (di->di_used != d->ma_used) {
        PyErr_SetString(PyExc_RuntimeError,
                        "dictionary changed size during iteration");
        di->di_used = -1; /* Make this state sticky */
        return NULL;
    }

The ma_used field is simply the number of items in the dictionary, as documented in dictobject.h:

/* Number of items in the dictionary */
Py_ssize_t ma_used;

And di_used is simply a copy of that value from when the iterator was created.
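To see what this comparison means from the Python side, here is a small sketch. The dict contents and variable names are made up for illustration, and the behavior described is what follows from the C excerpt above in CPython. Updating the value under an existing key does not change the size, so it passes the check. Removing a key succeeds immediately, and the error only surfaces on the iterator's next step. Because di_used is set to -1, the iterator keeps failing even after the size is restored.

```python
d = {"a": 1, "b": 2, "c": 3}

# Updating values of existing keys leaves the size unchanged: no error.
for key in d:
    d[key] += 10

it = iter(d)          # the iterator records the current size (3)
first = next(it)

d.pop("c")            # the mutation itself is allowed

try:
    next(it)          # the size check happens here, on the next step
    raised = False
    message = ""
except RuntimeError as exc:
    raised = True
    message = str(exc)

d["c"] = 3            # the size is back to 3 ...

try:
    next(it)          # ... but the failed state is sticky
    raised_again = False
except RuntimeError:
    raised_again = True

print(message)
```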

You cannot change this programmatically, at least not in any reasonable way (let's not monkey-patch dict). You can create your own dictionary type, if you like, and define your own iterator that behaves differently.
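As a minimal sketch of such a custom type (SnapshotDict is a hypothetical name, not something from the standard library), one option is to have __iter__ walk a copy of the keys taken when the loop starts. The built-in iterator and its size check are never involved, so removing items inside the loop is fine. With this, the five-item example from the question empties the dict.

```python
class SnapshotDict(dict):
    """Hypothetical dict subclass whose for-loop iterates over a key snapshot."""

    def __iter__(self):
        # Copy the keys up front; later changes to the dict cannot
        # affect this list, so no size check can fail.
        return iter(list(super().keys()))


my_dict = SnapshotDict({i: i * i for i in range(5)})
for key in my_dict:
    my_dict.pop(key)

print(len(my_dict))  # 0
```

For a plain dict, the same effect is available without a subclass by writing `for key in list(my_dict):`. The trade-off in both cases is an extra copy of the keys, and keys added during the loop are not visited.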

Python does this because it is hard to decide what the "correct" behavior is when you iterate over a hash table that is changing.

Writing your own hash table implementation is a good exercise, and you'll quickly discover the problem: when you insert or remove entries in a hash table, the order of the other entries may change. Is it acceptable for the iterator to skip entries, or to return the same entry twice? Probably not. Can you create a data structure that provides the iteration behavior you want? Yes, but it's complicated, and a hash table that does this may perform worse in other scenarios.
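The skip-and-repeat problem can be reproduced with a toy table. The sketch below is purely illustrative: ToyTable, its linear probing, and its growth rule are assumptions made for the demo, not CPython's dict layout. The iterator remembers only a slot index and performs no size check, so a resize in the middle of the loop moves entries around underneath it.

```python
class ToyTable:
    """Toy open-addressing hash set of non-negative ints (illustrative only)."""

    def __init__(self, capacity=4):
        self.slots = [None] * capacity
        self.used = 0

    @staticmethod
    def _probe(slots, key):
        # Linear probing: start at key % capacity and walk forward
        # until a free slot or the key itself is found.
        i = key % len(slots)
        while slots[i] is not None and slots[i] != key:
            i = (i + 1) % len(slots)
        return i

    def add(self, key):
        # Double the table and rehash every key once it would be
        # more than half full.
        if (self.used + 1) * 2 > len(self.slots):
            bigger = [None] * (len(self.slots) * 2)
            for k in self.slots:
                if k is not None:
                    bigger[self._probe(bigger, k)] = k
            self.slots = bigger
        i = self._probe(self.slots, key)
        if self.slots[i] is None:
            self.slots[i] = key
            self.used += 1

    def unchecked_iter(self):
        # Naive iterator: keeps only a slot position, no size check.
        pos = 0
        while pos < len(self.slots):
            key = self.slots[pos]
            pos += 1
            if key is not None:
                yield key


table = ToyTable()
table.add(5)   # lands in slot 1 of 4
table.add(2)   # lands in slot 2 of 4

seen = []
for key in table.unchecked_iter():
    seen.append(key)
    if len(seen) == 1:
        table.add(9)  # forces a resize to 8 slots mid-iteration

print(seen)  # [5, 2, 5]
```

After the resize, 5 moves from slot 1 to slot 5, ahead of the iterator's position, so it is returned a second time. The new key 9 lands in slot 1, behind the iterator, so it is never returned. Raising an error on any size change avoids having to pick between these outcomes.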

answered Mar 05 '26 08:03 by Dietrich Epp


Python objects can return their own iterators. See https://wiki.python.org/moin/Iterator. So in principle, when __iter__() is called on a dict object, the dict could set a flag that it is being iterated over. That flag would be cleared once the internal iterator has consumed all the items in the dict. If a modification is attempted on the dict (e.g. using pop), the method would check the flag to see whether the modification can be made or whether the dict is still inside an iteration loop. Note that CPython's dict does not actually keep such a flag: its iterator compares the dict's current size with the size recorded when iteration began, and the modification itself is never blocked.
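A flag-based design like this is not what the built-in dict does (the source excerpt in the other answer shows a size comparison inside the iterator), but it can be sketched as a custom type. GuardedDict and its counter attribute are hypothetical names. A counter is used instead of a single flag so that nested loops over the same dict work, and only pop is guarded to keep the sketch short.

```python
class GuardedDict(dict):
    """Hypothetical dict that refuses pop() while a loop over it is running."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._active_iterations = 0

    def __iter__(self):
        # Generator-based iterator: the counter goes up when the loop
        # takes its first step and comes back down when the loop ends.
        self._active_iterations += 1
        try:
            yield from super().__iter__()
        finally:
            self._active_iterations -= 1

    def pop(self, key, *default):
        if self._active_iterations:
            raise RuntimeError("GuardedDict is being iterated over")
        return super().pop(key, *default)


d = GuardedDict(a=1, b=2)
blocked = 0
for key in d:
    try:
        d.pop(key)
    except RuntimeError:
        blocked += 1  # the pop is rejected up front, nothing is removed

popped = d.pop("a")  # allowed again once the loop has finished
```

Unlike the built-in behavior, the error here comes from the modifying call rather than from the iterator's next step. One weakness of this design is that it depends on the iterator being finished or cleaned up before the counter is released.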

answered Mar 05 '26 09:03 by tantalum


