I'm trying to understand how new instances of a Python class should be created when an instance can be created either via the constructor or via the __new__ method. In particular, I notice that when using the constructor, the __init__ method is called automatically after __new__, whereas when __new__ is invoked directly, __init__ is not called automatically. I can force __init__ to be called when __new__ is called explicitly by embedding a call to __init__ within __new__, but then __init__ ends up being called twice when the instance is created via the constructor.
For example, consider the following toy class, which stores one internal property, namely a list object called data; it is useful to think of this as the start of a vector class.
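
    class MyClass(object):
        def __new__(cls, *args, **kwargs):
            # object.__new__ takes no extra arguments when __new__ is
            # overridden (passing them is a TypeError in Python 3)
            obj = object.__new__(cls)
            obj.__init__(*args, **kwargs)  # explicit call to __init__
            return obj

        def __init__(self, data):
            self.data = data

        def __getitem__(self, index):
            # calls __new__ directly, so __init__ is not run automatically
            return self.__new__(type(self), self.data[index])

        def __repr__(self):
            return repr(self.data)
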
A new instance of the class can be created either using the constructor (not actually sure if that is the right terminology in Python), something like

    x = MyClass(range(10))

or via slicing, which you can see invokes a call to __new__ in the __getitem__ method:

    x2 = x[0:2]

In the first instance, __init__ will be called twice (both via the explicit call within __new__ and then again automatically), and once in the second instance. Obviously I would only like __init__ to be invoked once in any case. Is there a standard way to do this in Python?
Note that in my example I could get rid of the __new__ method and redefine __getitem__ as

    def __getitem__(self, index):
        return MyClass(self.data[index])

but this would cause a problem if I later want to inherit from MyClass, because a call like child_instance[0:2] would give me back an instance of MyClass, not of the child class.
First, some basic facts about __new__ and __init__:

- __new__ is a constructor.
- __new__ typically returns an instance of cls, its first argument.
- By returning an instance of cls, __new__ causes Python to call __init__.
- __init__ is an initializer. It modifies the instance (self) returned by __new__. It does not need to return self.
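
The third fact can be checked with a minimal sketch. The class names Tracked and Skipped are hypothetical and used only for illustration: Python runs __init__ automatically only when __new__ returns an instance of cls.

```python
class Tracked(object):
    """__new__ returns an instance of cls, so __init__ runs automatically."""

    def __new__(cls, value):
        return object.__new__(cls)

    def __init__(self, value):
        self.value = value


class Skipped(object):
    """__new__ returns an object that is not an instance of cls."""

    init_ran = False

    def __new__(cls, value):
        # A plain int is returned, so Python does not call Skipped.__init__.
        return int(value)

    def __init__(self, value):
        Skipped.init_ran = True


t = Tracked(5)
s = Skipped("7")
print(t.value)           # 5
print(type(s).__name__)  # int
print(Skipped.init_ran)  # False
```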
When MyClass defines

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)
        return obj

MyClass.__init__ gets called twice: once from calling obj.__init__ explicitly, and a second time because __new__ returned obj, an instance of cls. (Since the first argument to object.__new__ is cls, the instance returned is an instance of MyClass, so obj.__init__ resolves to MyClass.__init__.)
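
As a rough check of the double call, here is a sketch with a hypothetical Counted class that mirrors the question's __new__ and adds a call counter (the counter is an addition for illustration, not part of the original class):

```python
class Counted(object):
    """Stand-in for the question's MyClass, with an __init__ call counter."""

    init_calls = 0

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)  # explicit call
        return obj

    def __init__(self, data):
        type(self).init_calls += 1
        self.data = data


# Via the constructor: the explicit call, then the automatic one.
Counted([1, 2, 3])
calls_via_constructor = Counted.init_calls

# Via __new__ directly: only the explicit call.
Counted.init_calls = 0
Counted.__new__(Counted, [1, 2, 3])
calls_via_new = Counted.init_calls

print(calls_via_constructor)  # 2
print(calls_via_new)          # 1
```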
The Python 2.2.3 release notes have an interesting comment, which sheds light on when to use __new__ and when to use __init__:

> The __new__ method is called with the class as its first argument; its responsibility is to return a new instance of that class.
>
> Compare this to __init__: __init__ is called with an instance as its first argument, and it doesn't return anything; its responsibility is to initialize the instance.
>
> All this is done so that immutable types can preserve their immutability while allowing subclassing.
>
> The immutable types (int, long, float, complex, str, unicode, and tuple) have a dummy __init__, while the mutable types (dict, list, file, and also super, classmethod, staticmethod, and property) have a dummy __new__.

So, use __new__ to define immutable types, and use __init__ to define mutable types. While it is possible to define both, you should not need to do so.
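
For contrast, here is a minimal sketch of the immutable case, assuming a hypothetical Point class built on tuple (it is not part of the question). The contents have to be fixed in __new__, because by the time __init__ runs the tuple already exists and cannot be modified.

```python
class Point(tuple):
    """Hypothetical immutable 2-D point built on tuple."""

    def __new__(cls, x, y):
        # The tuple's contents are set here, at creation time.
        return tuple.__new__(cls, (float(x), float(y)))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]


p = Point(1, 2)
print(p)         # (1.0, 2.0)
print(p.x, p.y)  # 1.0 2.0
```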
Thus, since MyClass is mutable, you should only define __init__:

    class MyClass(object):
        def __init__(self, data):
            self.data = data

        def __getitem__(self, index):
            # type(self) is the actual class of the instance, so a
            # subclass instance returns a subclass instance
            return type(self)(self.data[index])

        def __repr__(self):
            return repr(self.data)

    x = MyClass(range(10))
    x2 = x[0:2]
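
To check the inheritance concern from the question, here is a short sketch with a hypothetical subclass named Child. MyClass is repeated so the example runs on its own.

```python
class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)


class Child(MyClass):
    """Hypothetical subclass, used only to check the type of a slice."""


child_instance = Child(list(range(10)))
piece = child_instance[0:2]
print(type(piece).__name__)  # Child
print(piece)                 # [0, 1]
```

Because __getitem__ goes through the ordinary constructor call, __init__ runs exactly once for each new instance.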