Modifying a dictionary while iterating over it. Bug in Python dict?

Question

The output of

d = {1: 1} for k in d.keys(): d['{}'.format(k)] = d.pop(k) print(d)

is {'1': 1}. The output of

d = {1: 1} for k in d.keys(): d['i{}'.format(k)] = d.pop(k) print(d)

is {'iiiii1': 1}. Is this a bug? I am running Python 3.6.1 :: Anaconda 4.4.0 (x86_64).

Why would that be a bug? You are removing and adding keys as you iterate. You are lucky you didn't end up in an endless loop. — Martijn Pieters, CommentedJun 26, 2017 at 16:00
Never alter a collection while iterating over it. Especially not dictionaries, since it can lead to weird behavior. — willeM_ Van Onsem, CommentedJun 26, 2017 at 16:01
@user2725109: how do you know that that first loop only ran once? For all you know it ran 1000 times. — Martijn Pieters, CommentedJun 26, 2017 at 16:06
@user2725109: the second can have run 1000s of times: simply replacing the key for an identical one until finally the iteration ends. — willeM_ Van Onsem, CommentedJun 26, 2017 at 16:09
Please don't change the question when there are answers. It invalidates the answer — Baldrickk, CommentedJun 27, 2017 at 8:15

Martijn Pieters · Accepted Answer · 2017-06-26 17:43:39Z

No, this is not a bug. This is in fact explicitly documented:

Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond.
[...]
Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.

Bold emphasis mine.

You are iterating over the keys, while at the same time both adding and deleting entries in the dictionary. That worked for a few iterations, and then you hit a fail to iterate over all entries case and iteration stopped.

What happens is that you trigger a re-size at 6 additions, and that causes iteration to fail at that point; the 'next' key is now slotted in an 'earlier' slot. This happens for both tests, you just don't realise it iterates 5 times in both cases:

>>> d = {1: 1} >>> for i, k in enumerate(d): ... print(i) ... d['{}'.format(k)] = d.pop(k) ... 0 1 2 3 4 >>> d = {1: 1} >>> for i, k in enumerate(d): ... print(i) ... d['i{}'.format(k)] = d.pop(k) ... 0 1 2 3 4

It is run 5 times because the current dict implementation starts with a hash table of size 8, and a resize is triggered when the table is 2/3s full (your initial dict has 1 entry, 5 insertions make it > (8 * 2/3 == 5.333333). The table is getting filled with DKIX_DUMMY entities, entered when you delete a key (needed to handle hash collisions correctly).

Note that this is all highly implementation dependent. In Python 3.5 and before, both snippets iterate just once (even if you use for k in d: to avoid creating a list object for the keys); iteration continues in 3.6 because the implementation changed and iteration now follows insertion order. Future Python versions are free to alter the implementation again.

@SorousHBakhtiary: that's not new. I explicitly mention that this happens in my answer, and why. Note that the new implementation (which preserves insertion order) was new in 3.6, covered at the end of my answer here. — Martijn Pieters, CommentedFeb 14, 2022 at 18:40
@SorousHBakhtiary: Something else changed in Python 3.8, however, and the code runs just once in that version and up. It was probably adding support for __reversed__ that did that. — Martijn Pieters, CommentedFeb 14, 2022 at 18:46

Collectives™ on Stack Overflow

Modifying a dictionary while iterating over it. Bug in Python dict?

1 Answer 1

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Linked

Related