If somebody tries:
my_dict = {1: 1}
for key in my_dict:
    my_dict.pop(key)
one will get:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
Python raises an error because the size of the dict changed while looping over it.
How does Python know that this has happened, and can this check be overridden programmatically so that the code runs?
And before somebody asks the inevitable question of "why I would want to do this": I don't. I'm asking a question. It's called curiosity.
FOR EXAMPLE:
Say I have a dict with 5 items. The above code should simply delete all items in the dict!
If you search through the Python source code for "dictionary changed size during iteration", you'll find Objects/dictobject.c:
static PyObject*
dictiter_iternextkey(dictiterobject *di)
{
    /* ... omitted ... */

    if (di->di_used != d->ma_used) {
        PyErr_SetString(PyExc_RuntimeError,
                        "dictionary changed size during iteration");
        di->di_used = -1; /* Make this state sticky */
        return NULL;
    }

    /* ... omitted ... */
}
The ma_used field is simply the number of items in the dictionary, as documented in dictobject.h:
/* Number of items in the dictionary */
Py_ssize_t ma_used;
And di_used is simply a copy of that value from when the iterator was created.
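To make the check concrete, here is a rough Python model of it. This is only a sketch: the class name `SizeCheckedKeyIterator` and its attributes are made up for illustration, and it walks a snapshot of the keys instead of the real hash table. The only part that mirrors the C code above is the size comparison and the "sticky" `-1`.

```python
class SizeCheckedKeyIterator:
    """Hypothetical Python model of the size check (not CPython's actual code)."""

    def __init__(self, d):
        self._d = d
        self._keys = list(d)   # simplification: the real iterator walks the table
        self._used = len(d)    # plays the role of di_used
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        # len(self._d) plays the role of d->ma_used
        if self._used != len(self._d):
            self._used = -1    # make this state sticky, as in the C code
            raise RuntimeError("dictionary changed size during iteration")
        if self._pos >= len(self._keys):
            raise StopIteration
        key = self._keys[self._pos]
        self._pos += 1
        return key


d = {1: 1, 2: 2}
it = SizeCheckedKeyIterator(d)
first = next(it)
d.pop(first)                   # the dict itself does not object

failures = 0
for _ in range(2):
    try:
        next(it)
    except RuntimeError:
        failures += 1          # raised on every later call, since -1 never matches
```

Note that the check only compares sizes, so in this model a modification that leaves the size unchanged is not noticed by it.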
You cannot change this programmatically, at least not in any reasonable way (let's not monkey-patch dict). You can create your own dictionary type, if you like, and define your own iterator that behaves differently.
Python does this because it is hard to figure out what the "correct" thing to do is when you are iterating over a hash table that is changing.
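As a minimal sketch of the "own dictionary type" idea, assuming that iterating over a snapshot of the keys is the behavior you want: the name `SnapshotDict` is hypothetical, and copying the keys on every loop costs extra time and memory.

```python
class SnapshotDict(dict):
    """Hypothetical dict subclass whose iterator walks a snapshot of the keys."""

    def __iter__(self):
        # Copy the keys up front, so later changes to the dict
        # cannot disturb the iteration.
        return iter(tuple(dict.__iter__(self)))


my_dict = SnapshotDict({n: n for n in range(5)})
for key in my_dict:
    my_dict.pop(key)           # no RuntimeError: we iterate over the copy
```

Only `for key in my_dict` is affected here; views such as `my_dict.keys()` still use the built-in iterator. With a plain dict, `for key in list(my_dict):` gives the same effect without a subclass.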
Writing your own hash table implementation is a good exercise, and you'll quickly discover the problem: when you insert or remove entries in a hash table, it may change the order of other entries. Is it acceptable for the iterator to skip entries, or return the same entry twice? Probably not. Can you create a data structure that provides the iteration behavior you want? Yes, but it's complicated, and the hash table that does this may perform worse in other scenarios.
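Here is a toy version of that exercise, to show the problem rather than solve it. It is a sketch under stated assumptions: `ToyTable` is a made-up open-addressing set with linear probing, it is not how CPython lays out a dict, and it relies on small ints hashing to themselves.

```python
class ToyTable:
    """Toy open-addressing hash set with linear probing (illustration only)."""

    def __init__(self, size=4):
        self.slots = [None] * size
        self.used = 0

    @staticmethod
    def _insert(slots, key):
        i = hash(key) % len(slots)
        while slots[i] is not None and slots[i] != key:
            i = (i + 1) % len(slots)          # linear probing
        is_new = slots[i] is None
        slots[i] = key
        return is_new

    def add(self, key):
        # Grow (and rehash everything) before the table gets more than half full.
        if (self.used + 1) * 2 > len(self.slots):
            bigger = [None] * (len(self.slots) * 2)
            for old in self.slots:
                if old is not None:
                    self._insert(bigger, old)
            self.slots = bigger
        if self._insert(self.slots, key):
            self.used += 1

    def keys_in_slot_order(self):
        return [k for k in self.slots if k is not None]


table = ToyTable()
table.add(4)
table.add(3)
before = table.keys_in_slot_order()           # [4, 3]

# Naive iteration by slot index, inserting one key part-way through.
seen = []
mutated = False
i = 0
while i < len(table.slots):
    key = table.slots[i]
    if key is not None:
        seen.append(key)
        if not mutated:
            table.add(9)                      # triggers a resize and rehash
            mutated = True
    i += 1

after = table.keys_in_slot_order()            # [9, 3, 4]
# seen is now [4, 9, 3, 4]: the key 4 was returned twice
```

The resize moved 4 from slot 0 to slot 4, so an iterator that only remembers a slot index meets it again. Refusing to continue, as Python does, avoids having to pick an answer.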
Python objects can return their own iterators; see https://wiki.python.org/moin/Iterator. In principle, a dict could set a flag when __iter__() is called, clear it once the iterator has consumed all the items, and have modifying methods such as pop check that flag before making a change. CPython's dict does not work that way, though: the modification itself succeeds, and it is the iterator that notices, on its next step, that the dict's size no longer matches the size recorded when the iterator was created.
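One way to see where the error actually comes from is to step the iterator by hand. This is a small experiment, assuming CPython 3.7 or later (so that keys come back in insertion order); the variable names are just for illustration.

```python
my_dict = {1: 1, 2: 2}
it = iter(my_dict)

first = next(it)               # first key, in insertion order
removed = my_dict.pop(first)   # the pop itself succeeds

errors = []
for _ in range(2):
    try:
        next(it)               # the iterator is what complains
    except RuntimeError as exc:
        errors.append(str(exc))
```

After this runs, `my_dict` holds only the key 2, and both later `next(it)` calls raised `RuntimeError`. That also explains the loop in the question: the single item really is removed, and the error shows up when the `for` loop asks the iterator for the next key.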