Ensuring __init__ is only called once when class instance is created by constructor or __new__

Question

I'm trying to understand how new instances of a Python class should be created when the creation process can go either through the constructor or through the __new__ method. In particular, I notice that when using the constructor, the __init__ method is automatically called after __new__, while when invoking __new__ directly the __init__ method is not automatically called. I can force __init__ to be called when __new__ is explicitly called by embedding a call to __init__ within __new__, but then __init__ ends up getting called twice when the instance is created via the constructor.

For example, consider the following toy class, which stores one internal property, namely a list object called data: it is useful to think of this as the start of a vector class.

class MyClass(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls, *args, **kwargs)
        obj.__init__(*args, **kwargs)
        return obj

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return self.__new__(type(self), self.data[index])

    def __repr__(self):
        return repr(self.data)

A new instance of the class can be created either using the constructor (not actually sure if that is the right terminology in Python), something like

x = MyClass(range(10))

or via slicing, which you can see invokes a call to __new__ in the __getitem__ method.

x2 = x[0:2]

In the first case, __init__ will be called twice (once via the explicit call within __new__ and then again automatically), and in the second case it will be called once. Obviously I would like __init__ to be invoked only once in either case. Is there a standard way to do this in Python?

Note that in my example I could get rid of the __new__ method and redefine __getitem__ as

def __getitem__(self, index):
    return MyClass(self.data[index])

but then this would cause a problem if I later want to inherit from MyClass, because if I make a call like child_instance[0:2] I will get back an instance of MyClass, not the child class.

unutbu · Accepted Answer

First, some basic facts about __new__ and __init__:

__new__ is a constructor.
__new__ typically returns an instance of cls, its first argument.
When __new__ returns an instance of cls, Python then calls __init__ on that instance.
__init__ is an initializer. It modifies the instance (self) returned by __new__. It does not need to return self.
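These facts can be checked with a small sketch. The class Tracker and its calls log are hypothetical names introduced only for illustration; they are not part of the question's code.

```python
class Tracker(object):
    """Hypothetical class that logs which special methods run."""
    calls = []  # class-level log shared by all instances

    def __new__(cls, *args, **kwargs):
        Tracker.calls.append("__new__")
        # object.__new__ only needs the class; extra arguments are not forwarded.
        return object.__new__(cls)

    def __init__(self, value=None):
        Tracker.calls.append("__init__")
        self.value = value


# Calling the class runs __new__ and then __init__.
a = Tracker(1)
via_call = list(Tracker.calls)

# Calling __new__ directly creates the instance but skips __init__.
del Tracker.calls[:]
b = Tracker.__new__(Tracker)
via_new = list(Tracker.calls)
```

After this runs, via_call holds both method names in order, via_new holds only "__new__", and b has no value attribute because its __init__ never ran.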

When MyClass defines:

def __new__(cls, *args, **kwargs):
    obj = object.__new__(cls, *args, **kwargs)
    obj.__init__(*args, **kwargs)
    return obj

MyClass.__init__ gets called twice: once from calling obj.__init__ explicitly, and a second time because __new__ returned obj, an instance of cls. (Since the first argument to object.__new__ is cls, the instance returned is an instance of MyClass, so obj.__init__ calls MyClass.__init__, not object.__init__.)
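The double call can be made visible with a counter. This is a sketch under assumptions: Doubled and init_calls are hypothetical names, and object.__new__ is called with cls only, because Python 3 rejects extra arguments there when __new__ is overridden.

```python
class Doubled(object):
    """Hypothetical class mirroring the question's __new__, with a call counter."""
    init_calls = 0

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)      # pass cls only (Python 3 rejects extras)
        obj.__init__(*args, **kwargs)  # explicit call: first run of __init__
        return obj                     # instance of cls, so __init__ runs again

    def __init__(self, data):
        type(self).init_calls += 1
        self.data = data


# Through the constructor: __init__ runs twice.
x = Doubled([1, 2, 3])
calls_after_constructor = Doubled.init_calls

# Through a direct __new__ call: only the explicit call runs.
y = Doubled.__new__(Doubled, [4, 5])
calls_after_direct_new = Doubled.init_calls
```

Here calls_after_constructor is 2, and the direct __new__ call adds just one more, bringing the total to 3.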

The Python 2.2.3 release notes have an interesting comment, which sheds light on when to use __new__ and when to use __init__:

The __new__ method is called with the class as its first argument; its responsibility is to return a new instance of that class.

Compare this to __init__: __init__ is called with an instance as its first argument, and it doesn't return anything; its responsibility is to initialize the instance.

All this is done so that immutable types can preserve their immutability while allowing subclassing.

The immutable types (int, long, float, complex, str, unicode, and tuple) have a dummy __init__, while the mutable types (dict, list, file, and also super, classmethod, staticmethod, and property) have a dummy __new__.

So, use __new__ to define immutable types, and use __init__ to define mutable types. While it is possible to define both, you should not need to do so.
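For contrast, here is a minimal sketch of the immutable case, where __new__ is the right hook. Point is a hypothetical tuple subclass, not something from the question.

```python
class Point(tuple):
    """Hypothetical immutable 2-D point built on tuple."""

    def __new__(cls, x, y):
        # The tuple's contents must be fixed here: by the time __init__
        # runs, the tuple already exists and cannot be changed.
        return tuple.__new__(cls, (float(x), float(y)))


p = Point(1, 2)
```

The values are converted to float before the tuple is created, which would not be possible from __init__ alone.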

Thus, since MyClass is mutable, you should only define __init__:

class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)

x = MyClass(range(10))
x2 = x[0:2]
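Because __getitem__ uses type(self) rather than naming MyClass, slicing a subclass instance should give back the subclass. A short check, assuming a hypothetical subclass Child whose __init__ accepts the same single data argument:

```python
class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # type(self) is the runtime class, so subclasses get their own type back.
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)


class Child(MyClass):
    """Hypothetical subclass used only to check the slicing behaviour."""
    pass


c = Child(list(range(10)))
part = c[0:2]
```

Here part is a Child, not a plain MyClass, and __init__ runs exactly once per instance. If a subclass changes the signature of __init__, it would need its own __getitem__.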

Tags: python

Abiel
