Can anybody tell me how the internal implementation of set and dict is different in python? Do they use the same data structure in the background?
++ In theory, one can use dict to achieve set functionality.
In CPython, sets and dicts use the same basic data structure. Sets tune it slightly differently, but it is basically a hash table just like dictionaries.
You can take a look at the implementation details in the C code: setobject.c and dictobject.c; the implementations are very close; the setobject.c implementation was started as a copy of dictobject.c originally. dictobject.c has more implementation notes and tracing calls, but the actual implementations of the core functions differs only in details.
The most obvious difference is that the keys in the hash table are not used to reference values, like in dictionaries, so a setentry struct only has a cached hash and key, the dictentry struct adds a value pointer.
Before we had the built-in set, we had the sets module, a pure-Python implementation that used dict objects to track the set values as keys. And in Python versions before the sets module was available, we did just that: use dict objects with keys as the set values, to track unique, unordered values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With