What's the difference?
SomeBaseClass.__init__(self) means to call SomeBaseClass's __init__, while super().__init__() means to call a bound __init__ from the parent class that follows SomeBaseClass's child class (the one that defines this method) in the instance's Method Resolution Order (MRO).
If the instance is a subclass of this child class, there may be a different parent that comes next in the MRO.
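To make that concrete, here is a minimal sketch. The class names (Base, Child, Other, GrandChild) and the helper next_init_after_child are hypothetical, chosen only for illustration. It uses the explicit two-argument form of super to ask which __init__ a super() call written inside Child would reach for a given instance.

```python
class Base:
    def __init__(self):
        pass

class Child(Base):
    def __init__(self):
        super().__init__()

class Other(Base):
    def __init__(self):
        super().__init__()

class GrandChild(Child, Other):
    pass

def next_init_after_child(instance):
    """Return the function that super().__init__ inside Child would call."""
    # super(Child, instance) searches type(instance).__mro__ after Child.
    return super(Child, instance).__init__.__func__

# For a plain Child instance, the next class after Child is Base.
for_child = next_init_after_child(Child())

# For a GrandChild instance, the MRO is GrandChild, Child, Other, Base, object,
# so the class after Child is Other, not Base.
for_grandchild = next_init_after_child(GrandChild())
```

The same line of code in Child therefore reaches a different parent depending on the type of the instance it runs on.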
Explained simply
When you write a class, you want other classes to be able to use it. super() makes it easier for other classes to use the class you're writing.
As Bob Martin says, a good architecture allows you to postpone decision making as long as possible.
super() can enable that sort of architecture.
When another class subclasses the class you wrote, it could also be inheriting from other classes. And those classes could have an __init__ that comes after this __init__ based on the ordering of the classes for method resolution.
Without super you would likely hard-code the parent of the class you're writing (like the example does). This would mean that you would not call the next __init__ in the MRO, and you would thus not get to reuse the code in it.
If you're writing your own code for personal use, you may not care about this distinction. But if you want others to use your code, using super is one thing that allows greater flexibility for users of the code.
Python 2 versus 3
This works in Python 2 and 3:
super(Child, self).__init__()
This only works in Python 3:
super().__init__()
It works with no arguments by moving up in the stack frame and getting the first argument to the method (usually self for an instance method or cls for a class method, but it could have another name) and finding the class (e.g. Child) in the free variables (it is looked up with the name __class__ as a free closure variable in the method).
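The __class__ closure variable can be inspected directly. The following is a small sketch with hypothetical classes (Base, Child, ExplicitChild) that shows both call forms side by side and then looks at the free variables of the method that uses the zero-argument form.

```python
class Base(object):
    def __init__(self):
        self.initialized_by = ['Base']

class Child(Base):
    def __init__(self):
        # Zero-argument form (Python 3 only).
        super().__init__()
        self.initialized_by.append('Child')

class ExplicitChild(Base):
    def __init__(self):
        # Explicit form (works in Python 2 and 3).
        super(ExplicitChild, self).__init__()
        self.initialized_by.append('ExplicitChild')

# The compiler gives a method that uses super() a free variable named
# __class__, and the class statement fills that cell in with the class.
freevars = Child.__init__.__code__.co_freevars
cell_contents = Child.__init__.__closure__[0].cell_contents

child_order = Child().initialized_by
explicit_order = ExplicitChild().initialized_by
```

Here freevars contains '__class__' and cell_contents is the Child class itself, which is what lets super() work without being told the class.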
I used to prefer to demonstrate the cross-compatible way of using super, but now that Python 2 has reached end of life, I will demonstrate the Python 3 way of doing things, that is, calling super with no arguments.
Indirection with Forward Compatibility
What does it give you? For single inheritance, the examples from the question are practically identical from a static analysis point of view. However, using super gives you a layer of indirection with forward compatibility.
Forward compatibility is very important to seasoned developers. You want your code to keep working with minimal changes as you change it. When you look at your revision history, you want to see precisely what changed when.
You may start off with single inheritance, but if you decide to add another base class, you only have to change the line with the bases. If the bases change in a class you inherit from (say a mixin is added), you change nothing in this class.
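As a rough illustration of that point, consider the sketch below. Storage, AuditMixin, DocumentV1 and DocumentV2 are made-up names, not part of the question. The two Document versions differ only in their bases line, and the method body with the super call is identical in both.

```python
class Storage:
    def save(self, log):
        log.append('Storage.save')

class AuditMixin:
    def save(self, log):
        log.append('AuditMixin.save')
        super().save(log)  # continues to whatever comes next in the MRO

# Version 1: single inheritance.
class DocumentV1(Storage):
    def save(self, log):
        log.append('Document.save')
        super().save(log)

# Version 2: only the bases line changed; the method body is untouched.
class DocumentV2(AuditMixin, Storage):
    def save(self, log):
        log.append('Document.save')
        super().save(log)

v1_log = []
DocumentV1().save(v1_log)

v2_log = []
DocumentV2().save(v2_log)
```

With a hard-coded Storage.save(self, log) call in the method body, version 2 would silently skip AuditMixin.save unless the body was edited as well.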
In Python 2, getting the arguments to super and the correct method arguments right can be a little confusing, so I suggest using the Python 3 only method of calling it.
If you know you're using super correctly with single inheritance, that makes debugging less difficult going forward.
Dependency Injection
Other people can use your code and inject parents into the method resolution:
class SomeBaseClass(object):
    def __init__(self):
        print('SomeBaseClass.__init__(self) called')

class UnsuperChild(SomeBaseClass):
    def __init__(self):
        print('UnsuperChild.__init__(self) called')
        SomeBaseClass.__init__(self)

class SuperChild(SomeBaseClass):
    def __init__(self):
        print('SuperChild.__init__(self) called')
        super().__init__()
Say you want to inject another class between the child class and SomeBaseClass in the call chain (for testing or some other reason):
class InjectMe(SomeBaseClass):
    def __init__(self):
        print('InjectMe.__init__(self) called')
        super().__init__()

class UnsuperInjector(UnsuperChild, InjectMe): pass

class SuperInjector(SuperChild, InjectMe): pass
Using the un-super child fails to inject the dependency because the child you're using has hard-coded the method to be called after its own:
>>> o = UnsuperInjector()
UnsuperChild.__init__(self) called
SomeBaseClass.__init__(self) called
However, the class with the child that uses super can correctly inject the dependency:
>>> o2 = SuperInjector()
SuperChild.__init__(self) called
InjectMe.__init__(self) called
SomeBaseClass.__init__(self) called
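It can help to compare the two MROs with what actually ran. The sketch below repeats the classes above, with the print calls swapped for a module-level called list so the result can be inspected. The called list and the names and run helpers are additions for illustration only.

```python
called = []  # records which __init__ methods run, in order

class SomeBaseClass(object):
    def __init__(self):
        called.append('SomeBaseClass')

class UnsuperChild(SomeBaseClass):
    def __init__(self):
        called.append('UnsuperChild')
        SomeBaseClass.__init__(self)  # hard-coded parent

class SuperChild(SomeBaseClass):
    def __init__(self):
        called.append('SuperChild')
        super().__init__()  # next class in the instance's MRO

class InjectMe(SomeBaseClass):
    def __init__(self):
        called.append('InjectMe')
        super().__init__()

class UnsuperInjector(UnsuperChild, InjectMe): pass
class SuperInjector(SuperChild, InjectMe): pass

def names(cls):
    """Return the class names in cls's MRO, in order."""
    return [c.__name__ for c in cls.__mro__]

def run(cls):
    """Instantiate cls and return the __init__ methods that ran."""
    called.clear()
    cls()
    return list(called)

unsuper_mro = names(UnsuperInjector)
super_mro = names(SuperInjector)
unsuper_calls = run(UnsuperInjector)
super_calls = run(SuperInjector)
```

InjectMe sits between the child and SomeBaseClass in both MROs. Only the super-based chain visits it; the hard-coded call jumps straight past it.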
Addressing a comment
Why in the world would this be useful?
Python linearizes a complicated inheritance tree via the C3 linearization algorithm to create a Method Resolution Order (MRO).
We want methods to be looked up in that order.
For a method defined in a parent to find the next one in that order without super, it would have to
- get the mro from the instance's type
- look for the type that defines the method
- find the next type with the method
- bind that method and call it with the expected arguments
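The steps above can be sketched by hand as follows. This is a simplified illustration, not how super is actually implemented: call_next_method and the example classes are hypothetical, the defining class is passed in explicitly, and only plain functions stored directly in each class's namespace are handled.

```python
def call_next_method(defining_class, instance, name, *args, **kwargs):
    """Call the next implementation of `name` after defining_class in the MRO."""
    # 1. Get the MRO from the instance's type.
    mro = type(instance).__mro__
    # 2. Locate the type that defines the calling method.
    start = mro.index(defining_class) + 1
    # 3. Find the next type that has the method in its own namespace.
    for cls in mro[start:]:
        if name in vars(cls):
            # 4. Bind that method to the instance and call it.
            bound = vars(cls)[name].__get__(instance, type(instance))
            return bound(*args, **kwargs)
    raise AttributeError(name)

class Base:
    def describe(self):
        return ['Base']

class ManualChild(Base):
    def describe(self):
        # Hand-rolled equivalent of super().describe()
        return ['ManualChild'] + call_next_method(ManualChild, self, 'describe')

class Injected(Base):
    def describe(self):
        return ['Injected'] + super().describe()

class Combined(ManualChild, Injected):
    pass

plain_result = ManualChild().describe()
combined_result = Combined().describe()
```

Every method that wants cooperative behavior would need this boilerplate, which is the work super does for you.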
The UnsuperChild should not have access to InjectMe. Why isn't the conclusion "Always avoid using super"? What am I missing here?
The UnsuperChild does not have access to InjectMe. It is the UnsuperInjector that has access to InjectMe - and yet cannot call that class's method from the method it inherits from UnsuperChild.
Both Child classes intend to call a method by the same name that comes next in the MRO, which might belong to another class they were not aware of when they were created.
The one without super hard-codes its parent's method - thus it has restricted the behavior of its method, and subclasses cannot inject functionality in the call chain.
The one with super has greater flexibility. The call chain for the methods can be intercepted and functionality injected.
You may not need that functionality, but subclassers of your code may.
Conclusion
Always use super to reference the parent class instead of hard-coding it.
What you intend is to reference the parent class that is next-in-line, not specifically the one you see the child inheriting from.
Not using super can put unnecessary constraints on users of your code.
I just read about this topic in The Well-Grounded Rubyist (great book, by the way). The author does a better job of explaining than I would so I'll quote him:
No single rule or formula always results in the right design. But it’s useful to keep a
couple of considerations in mind when you’re making class-versus-module decisions:
Modules don’t have instances. It follows that entities or things are generally best
modeled in classes, and characteristics or properties of entities or things are
best encapsulated in modules. Correspondingly, as noted in section 4.1.1, class
names tend to be nouns, whereas module names are often adjectives (Stack
versus Stacklike).
A class can have only one superclass, but it can mix in as many modules as it wants. If
you’re using inheritance, give priority to creating a sensible superclass/subclass
relationship. Don’t use up a class’s one and only superclass relationship to
endow the class with what might turn out to be just one of several sets of characteristics.
Summing up these rules in one example, here is what you should not do:
module Vehicle
...
class SelfPropelling
...
class Truck < SelfPropelling
  include Vehicle
...
Rather, you should do this:
module SelfPropelling
...
class Vehicle
  include SelfPropelling
...
class Truck < Vehicle
...
The second version models the entities and properties much more neatly. Truck descends from Vehicle (which makes sense), whereas SelfPropelling is a characteristic of vehicles (at least, all those we care about in this model of the world), a characteristic that is passed on to trucks by virtue of Truck being a descendant, or specialized form, of Vehicle.
Best Solution
There's a fundamental difference between those two methods that all the other answers are missing, and that's Rails' implementation of STI (Single Table Inheritance):
http://api.rubyonrails.org/classes/ActiveRecord/Base.html (Find the "Single Table Inheritance" section)
Basically, if you refactor your Base class like this:
Then, you are supposed to have a database table called "bases", with a column called "type", which should have a value of "A" or "B". The columns on this table will be the same across all your models, and if you have a column that belongs to only one of the models, your "bases" table will be denormalized.
Whereas, if you refactor your Base class like this:
Then there will be no table "bases". Instead, there will be a table "as" and a table "bs". If they have the same attributes, the columns will have to be duplicated across both tables, but if there are differences, the tables won't be denormalized.
So yes, one can be preferable over the other, but which one depends on your application. As a rule of thumb, if the models have exactly the same properties or a big overlap, use STI (1st example); otherwise, use Modules (2nd example).