I'm trying to understand how new instances of a Python class should be created when the creation process can be either via the constructor or via the __new__ method. In particular, I notice that when using the constructor, the __init__ method will be automatically called after __new__, while when invoking __new__ directly the __init__ method will not automatically be called. I can force __init__ to be called when __new__ is explicitly called by embedding a call to __init__ within __new__, but then __init__ will end up getting called twice when the class is created via the constructor.
For example, consider the following toy class, which stores one internal property, namely a list object called data: it is useful to think of this as the start of a vector class.
class MyClass(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls, *args, **kwargs)
        obj.__init__(*args, **kwargs)
        return obj

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return self.__new__(type(self), self.data[index])

    def __repr__(self):
        return repr(self.data)
A new instance of the class can be created either using the constructor (not actually sure if that is the right terminology in Python), something like
x = MyClass(range(10))
or via slicing, which you can see invokes a call to __new__ in the __getitem__ method.
x2 = x[0:2]
In the first instance, __init__ will be called twice (both via the explicit call within __new__ and then again automatically), and once in the second instance. Obviously I would only like __init__ to be invoked once in any case. Is there a standard way to do this in Python?
Note that in my example I could get rid of the __new__ method and redefine __getitem__ as
    def __getitem__(self, index):
        return MyClass(self.data[index])
but then this would cause a problem if I later want to inherit from MyClass, because if I make a call like child_instance[0:2] I will get back an instance of MyClass, not the child class.
First, some basic facts about __new__ and __init__:
- __new__ is a constructor.
- __new__ typically returns an instance of cls, its first argument.
- By returning an instance of cls, __new__ causes Python to call __init__.
- __init__ is an initializer. It modifies the instance (self) returned by __new__. It does not need to return self (it must in fact return None).

When MyClass defines:
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls, *args, **kwargs)
        obj.__init__(*args, **kwargs)
        return obj
MyClass.__init__ gets called twice. Once from calling obj.__init__ explicitly, and a second time because __new__ returned obj, an instance of cls. (Since the first argument to object.__new__ is cls, the instance returned is an instance of MyClass so obj.__init__ calls MyClass.__init__, not object.__init__.)
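The double call can be observed directly with a small counter. This is only a sketch: the class name Counted and the init_calls attribute are made up for illustration, and it assumes Python 3, where object.__new__ must be called with cls alone once __new__ is overridden.

```python
class Counted(object):
    """Hypothetical class that mirrors the __new__ from the question."""
    init_calls = 0  # illustrative counter, not part of the original code

    def __new__(cls, *args, **kwargs):
        # In Python 3, object.__new__ rejects extra arguments here,
        # so only cls is passed on.
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)  # explicit call
        return obj  # returning an instance of cls triggers a second call

    def __init__(self, data):
        type(self).init_calls += 1
        self.data = data


# Going through the class call: __init__ runs twice.
x = Counted([1, 2, 3])
calls_after_constructor = Counted.init_calls

# Calling __new__ directly: only the explicit call inside __new__ runs.
y = Counted.__new__(Counted, [4, 5])
calls_after_direct_new = Counted.init_calls - calls_after_constructor

print(calls_after_constructor, calls_after_direct_new)
```

The first number counts the calls made by Counted(...), the second those made by the direct Counted.__new__(...) call.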
The Python 2.2.3 release notes have an interesting comment, which sheds light on when to use __new__ and when to use __init__:
"The __new__ method is called with the class as its first argument; its responsibility is to return a new instance of that class.

"Compare this to __init__: __init__ is called with an instance as its first argument, and it doesn't return anything; its responsibility is to initialize the instance.

"All this is done so that immutable types can preserve their immutability while allowing subclassing.

"The immutable types (int, long, float, complex, str, unicode, and tuple) have a dummy __init__, while the mutable types (dict, list, file, and also super, classmethod, staticmethod, and property) have a dummy __new__."
So, use __new__ to define immutable types, and use __init__ to define mutable types. While it is possible to define both, you should not need to do so.
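For contrast, here is roughly what the immutable case looks like. The Point class below is a hypothetical example, not something from the question; it subclasses tuple and therefore has to set its contents in __new__, since a tuple cannot be changed once it exists.

```python
class Point(tuple):
    """Hypothetical immutable 2-D point built on tuple."""

    def __new__(cls, x, y):
        # The tuple's contents are fixed at creation time, so they
        # have to be supplied here rather than in __init__.
        return super().__new__(cls, (x, y))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]


p = Point(1, 2)

# Item assignment fails, as it does for any tuple.
try:
    p[0] = 5
    mutated = True
except TypeError:
    mutated = False
```

No __init__ is defined here at all; by the time it would run, the value is already fixed.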
Thus, since MyClass is mutable, you should only define __init__:
class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)

x = MyClass(range(10))
x2 = x[0:2]
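To check that this also handles the inheritance concern from the question, here is a short sketch with a hypothetical subclass called Child (the name and its total method are invented for illustration). MyClass is repeated so the snippet runs on its own, and a list is used instead of range so that it behaves the same under Python 2 and 3.

```python
class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # type(self) is the runtime class, so subclasses get
        # instances of themselves back.
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)


class Child(MyClass):
    """Hypothetical subclass used only to test slicing."""

    def total(self):
        return sum(self.data)


child_instance = Child(list(range(10)))
part = child_instance[0:2]

print(type(part).__name__, part, part.total())
```

One caveat: type(self)(...) assumes every subclass keeps an __init__ that accepts a single data argument. A subclass with a different __init__ signature would need to override __getitem__ as well.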