5.1. Classes and Instances

If you're already familiar with object-oriented programming in other languages such as C++ or Java, then you probably have a good intuitive grasp of classes and instances: a class is a user-defined type, which you can instantiate to obtain instances, meaning objects of that type. Python supports these concepts through its class and instance objects.

5.1.1. Python Classes

A class is a Python object with several characteristics:

  • You can call a class object as if it were a function. The call returns another object, known as an instance of the class; the class is also known as the type of the instance.

  • A class has arbitrarily named attributes that you can bind and reference.

  • The values of class attributes can be descriptors (including functions), covered in "Descriptors" on page 85, or normal data objects.

  • Class attributes bound to functions are also known as methods of the class.

  • A method can have a special Python-defined name with two leading and two trailing underscores. Python implicitly invokes such special methods, if a class supplies them, when various kinds of operations take place on instances of that class.

  • A class can inherit from other classes, meaning it delegates to other class objects the lookup of attributes that are not found in the class itself.

An instance of a class is a Python object with arbitrarily named attributes that you can bind and reference. An instance object implicitly delegates to its class the lookup of attributes not found in the instance itself. The class, in turn, may delegate the lookup to the classes from which it inherits, if any.

In Python, classes are objects (values) and are handled like other objects. Thus, you can pass a class as an argument in a call to a function. Similarly, a function can return a class as the result of a call. A class, just like any other object, can be bound to a variable (local or global), an item in a container, or an attribute of an object. Classes can also be keys into a dictionary. The fact that classes are ordinary objects in Python is often expressed by saying that classes are first-class objects.
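As a minimal sketch of what "first-class" means in practice, the following example passes a class as an argument, returns a class from a function, and uses classes as dictionary values and keys. The names Dog, Cat, make, pick, and sounds are hypothetical, chosen only for illustration; the code uses print with a single parenthesized argument so that it should run unchanged under both Python 2 and Python 3.

```python
class Dog(object):
    def speak(self):
        return "Woof"

class Cat(object):
    def speak(self):
        return "Meow"

def make(cls):
    # A class received as an argument is called like any other callable.
    return cls()

def pick(name):
    # A function can return a class; here the classes are dictionary values.
    registry = {"dog": Dog, "cat": Cat}
    return registry[name]

sounds = {Dog: "barks", Cat: "purrs"}    # classes used as dictionary keys

pet = make(pick("cat"))
print(pet.speak())                       # prints: Meow
print(sounds[type(pet)])                 # prints: purrs
```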

5.1.2. The class Statement

The class statement is the most common way to create a class object. class is a single-clause compound statement with the following syntax:

class classname(base-classes):
    statement(s)

classname is an identifier. It is a variable that gets bound (or rebound) to the class object after the class statement finishes executing.

base-classes is a comma-delimited series of expressions whose values must be class objects. These classes are known by different names in different programming languages; you can, depending on your choice, call them the bases, superclasses, or parents of the class being created. The class being created can be said to inherit from, derive from, extend, or subclass its base classes, depending on what programming language you are familiar with. This class is also known as a direct subclass or descendant of its base classes.

Syntactically, base-classes is optional: to indicate that you're creating a class without bases, you can omit base-classes (and the parentheses around it), placing the colon right after the classname (in Python 2.5, you may also use empty parentheses between the classname and the colon, with the same meaning). However, a class without bases, for reasons of backward compatibility, is an old-style one (unless you define the __metaclass__ attribute, covered in "How Python Determines a Class's Metaclass" on page 117). To create a new-style class C without any "true" bases, code class C(object):; since every type subclasses the built-in object, specifying object as the value of base-classes just means that class C is new-style rather than old-style. If your class has ancestors that are all old-style and does not define the __metaclass__ attribute, then your class is old-style; otherwise, a class with bases is always new-style (even if some bases are new-style and some are old-style).

The subclass relationship between classes is transitive: if C1 subclasses C2, and C2 subclasses C3, then C1 subclasses C3. Built-in function issubclass(C1, C2) accepts two arguments that are class objects: it returns True if C1 subclasses C2; otherwise, it returns False. Any class is considered a subclass of itself; therefore, issubclass(C, C) returns True for any class C. The way in which the base classes of a class affect the functionality of the class is covered in "Inheritance" on page 94.
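A short sketch of these issubclass rules, using three hypothetical classes (Shape, Polygon, and Square) that are not part of the running examples:

```python
class Shape(object): pass
class Polygon(Shape): pass
class Square(Polygon): pass

print(issubclass(Square, Polygon))   # prints: True (direct subclass)
print(issubclass(Square, Shape))     # prints: True (the relationship is transitive)
print(issubclass(Square, Square))    # prints: True (a class subclasses itself)
print(issubclass(Shape, Square))     # prints: False
```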

The nonempty sequence of statements that follows the class statement is known as the class body. A class body executes immediately as part of the class statement's execution. Until the body finishes executing, the new class object does not yet exist and the classname identifier is not yet bound (or rebound). "How a Metaclass Creates a Class" on page 118 provides more details about what happens when a class statement executes.
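The following sketch illustrates both points: the body runs as soon as the class statement executes, and the class name is not yet bound while the body runs. The names events, Demo, and bound_in_body are hypothetical, and the example assumes that no other object named Demo exists when it runs.

```python
events = []

class Demo(object):
    events.append("class body runs")     # executes immediately
    try:
        Demo                             # the name is not bound yet
        bound_in_body = True
    except NameError:
        bound_in_body = False

events.append("class statement finished")

print(events)               # prints: ['class body runs', 'class statement finished']
print(Demo.bound_in_body)   # prints: False
```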

Finally, note that the class statement does not immediately create any instance of the new class but rather defines the set of attributes that will be shared by all instances when you later create instances by calling the class.

5.1.3. The Class Body

The body of a class is where you normally specify the attributes of the class; these attributes can be descriptor objects (including functions) or normal data objects of any type (an attribute of a class can also be another class; so, for example, you can have a class statement "nested" inside another class statement).

5.1.3.1. Attributes of class objects

You normally specify an attribute of a class object by binding a value to an identifier within the class body. For example:

class C1(object):
    x = 23
print C1.x                               # prints: 23

Class object C1 has an attribute named x, bound to the value 23, and C1.x refers to that attribute.

You can also bind or unbind class attributes outside the class body. For example:

class C2(object): pass
C2.x = 23
print C2.x                               # prints: 23

However, your program is more readable if you bind, and thus create, class attributes with statements inside the class body. Any class attributes are implicitly shared by all instances of the class when those instances are created, as we'll discuss shortly.

The class statement implicitly sets some class attributes. Attribute __name__ is the classname identifier string used in the class statement. Attribute __bases__ is the tuple of class objects given as the base classes in the class statement. For example, using the class C1 we just created:

print C1.__name__, C1.__bases__         # prints: C1, (<type 'object'>,)

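A class also has an attribute __dict__, which is the mapping that the class uses to hold all of its other attributes. For any class object C, any object x, and any identifier S (except __name__, __bases__, and __dict__), C.S=x binds the entry C.__dict__['S'] to x. For an old-style class, C.__dict__ is a plain dictionary, so you can also bind the entry directly with C.__dict__['S']=x. For a new-style class such as C1, however, C.__dict__ is a read-only proxy: you can read entries from it, but you must bind class attributes by attribute assignment (or with the built-in setattr). For example, again referring to the class C1 we just created:

C1.y = 45
setattr(C1, 'z', 67)
print C1.x, C1.y, C1.z                   # prints: 23, 45, 67
print C1.__dict__['z']                   # prints: 67

There is no difference between class attributes created in the class body and those created outside the body by assigning an attribute: either way, the binding ends up as an entry in C.__dict__.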

In statements that are directly in a class's body, references to attributes of the class must use a simple name, not a fully qualified name. For example:

class C3(object):
    x = 23
    y = x + 22                         # must use just x, not C3.x

However, in statements that are in methods defined in a class body, references to attributes of the class must use a fully qualified name, not a simple name. For example:

class C4(object):
    x = 23
    def amethod(self):
        print C4.x                     # must use C4.x, not just x
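The two naming rules can be seen side by side in the following sketch. Pricing and its attribute names are hypothetical, and the example assumes that no global variable named base_rate exists; the unqualified reference inside a method then fails with NameError when the method is called, because the class body's scope is not searched from within methods.

```python
class Pricing(object):
    base_rate = 23
    premium_rate = base_rate + 22      # class body: the simple name works

    def qualified(self):
        return Pricing.base_rate       # method body: qualify with the class

    def unqualified(self):
        return base_rate               # not found: class scope is skipped

p = Pricing()
print(Pricing.premium_rate)            # prints: 45
print(p.qualified())                   # prints: 23
try:
    p.unqualified()
    result = "found"
except NameError:
    result = "NameError"
print(result)                          # prints: NameError
```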

Note that attribute references (i.e., an expression like C.S) have semantics richer than those of attribute bindings. I cover these references in detail in "Attribute Reference Basics" on page 89.

5.1.3.2. Function definitions in a class body

Most class bodies include def statements, since functions (called methods in this context) are important attributes for most class objects. A def statement in a class body obeys the rules presented in "Functions" on page 70. In addition, a method defined in a class body always has a mandatory first parameter, conventionally named self, that refers to the instance on which you call the method. The self parameter plays a special role in method calls, as covered in "Bound and Unbound Methods" on page 91.

Here's an example of a class that includes a method definition:

class C5(object):
    def hello(self):
        print "Hello"

A class can define a variety of special methods (methods with names that have two leading and two trailing underscores) relating to specific operations on its instances. I discuss special methods in detail in "Special Methods" on page 104.

5.1.3.3. Class-private variables

When a statement in a class body (or in a method in the body) uses an identifier starting with two underscores (but not ending with underscores), such as __ident, the Python compiler implicitly changes the identifier into _classname__ident, where classname is the name of the class. This lets a class use "private" names for attributes, methods, global variables, and other purposes, reducing the risk of accidentally duplicating names used elsewhere.

By convention, all identifiers starting with a single underscore are meant to be private to the scope that binds them, whether that scope is or isn't a class. The Python compiler does not enforce this privacy convention; it's up to Python programmers to respect it.
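The compiler's renaming of double-underscore identifiers is easy to observe. In this sketch (Account and its attribute are hypothetical names), the instance attribute written as self.__balance inside the class is actually stored under the mangled name _Account__balance:

```python
class Account(object):
    def __init__(self):
        self.__balance = 0              # stored as _Account__balance

    def deposit(self, amount):
        self.__balance += amount        # mangled the same way here
        return self.__balance

acct = Account()
acct.deposit(10)
print(sorted(vars(acct)))               # prints: ['_Account__balance']
print(acct._Account__balance)           # prints: 10
print(hasattr(acct, "__balance"))       # prints: False
```

The last line prints False because mangling applies only to identifiers in code compiled inside the class body, not to strings passed to hasattr or getattr.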

5.1.3.4. Class documentation strings

If the first statement in the class body is a string literal, the compiler binds that string as the documentation string attribute for the class. This attribute is named __doc__ and is known as the docstring of the class. See "Docstrings" on page 72 for more information on docstrings.

5.1.4. Descriptors

A descriptor is any new-style object whose class supplies a special method named __get__. Descriptors that are class attributes control the semantics of accessing and setting attributes on instances of that class. Roughly speaking, when you access an instance attribute, Python obtains the attribute's value by calling __get__ on the corresponding descriptor, if any. For more details, see "Attribute Reference Basics" on page 89.

5.1.4.1. Overriding and nonoverriding descriptors

If a descriptor's class also supplies a special method named __set__, then the descriptor is known as an overriding descriptor (or, by an older and slightly confusing terminology, a data descriptor); if the descriptor's class supplies only __get__, and not __set__, then the descriptor is known as a nonoverriding (or nondata) descriptor. For example, the class of function objects supplies __get__, but not __set__; therefore, function objects are nonoverriding descriptors. Roughly speaking, when you assign a value to an instance attribute with a corresponding descriptor that is overriding, Python sets the attribute value by calling __set__ on the descriptor. For more details, see "Attributes of instance objects" on page 87.

Old-style classes can have descriptors, but descriptors in old-style classes always work as if they were nonoverriding ones (their __set__ method, if any, is ignored).
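The difference between the two kinds of descriptor shows up when you bind an instance attribute of the same name. The following is a minimal sketch on a new-style class; NonOverriding, Overriding, Host, and the _seen entry are hypothetical names introduced only for this illustration.

```python
class NonOverriding(object):
    # Supplies only __get__: a nonoverriding (nondata) descriptor.
    def __get__(self, obj, cls):
        return "from NonOverriding.__get__"

class Overriding(object):
    # Supplies __get__ and __set__: an overriding (data) descriptor.
    def __get__(self, obj, cls):
        return "from Overriding.__get__"
    def __set__(self, obj, value):
        obj.__dict__["_seen"] = value   # record the value under another key

class Host(object):
    a = NonOverriding()
    b = Overriding()

h = Host()
print(h.a)                          # prints: from NonOverriding.__get__
h.a = "plain"                       # binds h.__dict__['a'], shadowing Host.a
h.b = "plain"                       # intercepted by Overriding.__set__
print(h.a)                          # prints: plain
print(h.b)                          # prints: from Overriding.__get__
print(sorted(h.__dict__.items()))   # prints: [('_seen', 'plain'), ('a', 'plain')]
```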

5.1.5. Instances

To create an instance of a class, call the class object as if it were a function. Each call returns a new instance whose type is that class:

anInstance = C5( )

You can call built-in function isinstance(I, C) with a class object as argument C. isinstance returns True if object I is an instance of class C or any subclass of C. Otherwise, isinstance returns False.

5.1.5.1. __init__

When a class defines or inherits a method named __init__, calling the class object implicitly executes __init__ on the new instance to perform any needed instance-specific initialization. Arguments passed in the call must correspond to the parameters of __init__, except for parameter self. For example, consider the following class:

class C6(object):
    def __init__(self, n):
        self.x = n

Here's how you can create an instance of the C6 class:

anotherInstance = C6(42)

As shown in the C6 class, the __init__ method typically contains statements that bind instance attributes. An __init__ method must not return a value other than None; otherwise, Python raises a TypeError exception.
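A brief sketch of the rule about return values follows. The classes Good and Bad are hypothetical; the bare except TypeError: form is used so the snippet stays compatible with older Python versions.

```python
class Good(object):
    def __init__(self, n):
        self.x = n            # binds an instance attribute; returns None implicitly

class Bad(object):
    def __init__(self, n):
        self.x = n
        return n              # returning a non-None value is an error

print(Good(42).x)             # prints: 42
try:
    Bad(42)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"
print(outcome)                # prints: TypeError
```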

The main purpose of __init__ is to bind, and thus create, the attributes of a newly created instance. You may also bind or unbind instance attributes outside __init__, as you'll see shortly. However, your code will be more readable if you initially bind all attributes of a class instance with statements in the __init__ method.

When __init__ is absent, you must call the class without arguments, and the newly generated instance has no instance-specific attributes.

5.1.5.2. Attributes of instance objects

Once you have created an instance, you can access its attributes (data and methods) using the dot (.) operator. For example:

anInstance.hello( )                       # prints: Hello
print anotherInstance.x                    # prints: 42

Attribute references such as these have fairly rich semantics in Python and are covered in detail in "Attribute Reference Basics" on page 89.

You can give an instance object an arbitrary attribute by binding a value to an attribute reference. For example:

class C7: pass
z = C7( )
z.x = 23
print z.x                                # prints: 23

Instance object z now has an attribute named x, bound to the value 23, and z.x refers to that attribute. Note that the __setattr__ special method, if present, intercepts every attempt to bind an attribute. (__setattr__ is covered in __setattr__ on page 108.) Moreover, if you attempt to bind, on a new-style instance, an attribute whose name corresponds to an overriding descriptor in the instance's class, the descriptor's __set__ method intercepts the attempt. In this case, the statement z.x=23 executes type(z).x.__set__(z, 23) (old-style instances ignore the overriding nature of descriptors found in their classes, i.e., they never call their __set__ methods).

Creating an instance implicitly sets two instance attributes. For any instance z, z.__class__ is the class object to which z belongs, and z.__dict__ is the dictionary that z uses to hold its other attributes. For example, for the instance z we just created:

print z.__class__.__name__, z.__dict__     # prints: C7, {'x':23}

You may rebind (but not unbind) either or both of these attributes, but this is rarely necessary. A new-style instance's __class__ may be rebound only to a new-style class, and a legacy instance's __class__ may be rebound only to a legacy class.

For any instance z, any object x, and any identifier S (except __class__ and __dict__), z.S=x is equivalent to z.__dict__['S']=x (unless a __setattr__ special method, or an overriding descriptor's __set__ special method, intercepts the binding attempt). For example, again referring to the z we just created:

z.y = 45
z.__dict__['z'] = 67
print z.x, z.y, z.z                         # prints: 23, 45, 67

There is no difference between instance attributes created in __init__, by assigning to attributes, or by explicitly binding an entry in z.__dict__.

5.1.5.3. The factory-function idiom

A common task is to create instances of different classes depending on some condition, or to avoid creating a new instance if an existing one is available for reuse. A common misconception is that such needs might be met by having __init__ return a particular object, but such an approach is absolutely unfeasible: Python raises an exception when __init__ returns any value other than None. The best way to implement flexible object creation is by using an ordinary function rather than calling the class object directly. A function used in this role is known as a factory function.

Calling a factory function is a flexible approach: a function may return an existing reusable instance, or create a new instance by calling whatever class is appropriate. Say you have two almost interchangeable classes (SpecialCase and NormalCase) and want to flexibly generate instances of either one of them, depending on an argument. The following appropriateCase factory function allows you to do just that (the role of the self parameter is covered in "Bound and Unbound Methods" on page 91):

class SpecialCase(object):
    def amethod(self): print "special"
class NormalCase(object):
    def amethod(self): print "normal"
def appropriateCase(isnormal=True):
    if isnormal: return NormalCase( )
    else: return SpecialCase( )
aninstance = appropriateCase(isnormal=False)
aninstance.amethod( )                     # prints "special", as desired

5.1.5.4. __new__

Each new-style class has (or inherits) a static method named __new__ (static methods are covered in "Static methods" on page 99). When you call C(*args,**kwds) to create a new instance of class C, Python first calls C.__new__(C,*args,**kwds). Python uses __new__'s return value x as the newly created instance. Then, Python calls C.__init__(x,*args,**kwds), but only when x is indeed an instance of C or any of its subclasses (otherwise, x's state remains as __new__ had left it). Thus, for example, the statement x=C(23) is equivalent to:

x = C.__new__(C, 23)
if isinstance(x, C): type(x).__init__(x, 23)

object.__new__ creates a new, uninitialized instance of the class it receives as its first argument. It ignores other arguments if that class has an __init__ method, but it raises an exception if it receives other arguments beyond the first, and the class that's the first argument does not have an __init__ method. When you override __new__ within a class body, you do not need to add __new__=staticmethod(__new__), as you normally would: Python recognizes the name __new__ and treats it specially in this context. In those rare cases in which you rebind C.__new__ later, outside the body of class C, you do need to use C.__new__=staticmethod(whatever).

__new__ has most of the flexibility of a factory function, as covered in "The factory-function idiom" on page 88. __new__ may choose to return an existing instance or make a new one, as appropriate. When __new__ does need to create a new instance, it most often delegates creation by calling object.__new__ or the __new__ method of another superclass of C. The following example shows how to override static method __new__ in order to implement a version of the Singleton design pattern:

class Singleton(object):
    _singletons = {}
    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]

(Built-in super is covered in "Cooperative superclass method calling" on page 97.) Any subclass of Singleton (that does not further override __new__) has exactly one instance. If the subclass defines an __init__ method, the subclass must ensure its __init__ is safe when called repeatedly (at each creation request) on the one and only class instance.
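To see why __init__ must be safe to call repeatedly, consider this sketch. It repeats the Singleton class so that the example is self-contained, and adds a hypothetical subclass Config whose __init__ counts its own calls and binds value only the first time:

```python
class Singleton(object):
    _singletons = {}
    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]

class Config(Singleton):
    init_calls = 0
    def __init__(self, value=None):
        Config.init_calls += 1             # runs at every creation request
        if not hasattr(self, "value"):     # guard: initialize only once
            self.value = value

a = Config("first")
b = Config("second")
print(a is b)               # prints: True
print(a.value)              # prints: first
print(Config.init_calls)    # prints: 2
```

Without the hasattr guard, the second call would silently overwrite value on the shared instance.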

Old-style classes do not have a __new__ method.

5.1.6. Attribute Reference Basics

An attribute reference is an expression of the form x.name, where x is any expression and name is an identifier called the attribute name. Many kinds of Python objects have attributes, but an attribute reference has special rich semantics when x refers to a class or instance. Remember that methods are attributes too, so everything I say about attributes in general also applies to attributes that are callable (i.e., methods).

Say that x is an instance of class C, which inherits from base class B. Both classes and the instance have several attributes (data and methods), as follows:

class B(object):
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"
x = C( )
x.d = 77
x.e = 88

A few attribute names are special. For example, C.__name__ is the string 'C', the class name. C.__bases__ is the tuple (B,), the tuple of C's base classes. x.__class__ is the class C, the class to which x belongs. When you refer to an attribute with one of these special names, the attribute reference looks directly into a dedicated slot in the class or instance object and fetches the value it finds there. You cannot unbind these attributes. Rebinding them is allowed, so you can change the name or base classes of a class, or the class of an instance, on the fly, but this advanced technique is rarely necessary.

Both class C and instance x each have one other special attribute: a dictionary named __dict__. All other attributes of a class or instance, except for the few special ones, are held as items in the __dict__ attribute of the class or instance.

5.1.6.1. Getting an attribute from a class

When you use the syntax C.name to refer to an attribute on a class object C, the lookup proceeds in two steps:

  1. When 'name' is a key in C.__dict__, C.name fetches the value v from C.__dict__['name']. Then, if v is a descriptor (i.e., type(v) supplies a method named __get__), the value of C.name is the result of calling type(v).__get__(v, None, C). Otherwise, the value of C.name is v.

  2. Otherwise, C.name delegates the lookup to C's base classes, meaning it loops on C's ancestor classes and tries the name lookup on each (in "method resolution order," as covered in "Method resolution order" on page 94).

5.1.6.2. Getting an attribute from an instance

When you use the syntax x.name to refer to an attribute of instance x of class C, the lookup proceeds in three steps:

  1. When 'name' is found in C (or in one of C's ancestor classes) as the name of an overriding descriptor v (i.e., type(v) supplies methods __get__ and __set__), the value of x.name is the result of calling type(v).__get__(v, x, C). (This step doesn't apply to old-style instances.)

  2. Otherwise, when 'name' is a key in x.__dict__, x.name fetches and returns the value at x.__dict__['name'].

  3. Otherwise, x.name delegates the lookup to x's class (according to the same two-step lookup used for C.name, as just detailed). If a descriptor v is found, the overall result of the attribute lookup is, again, type(v).__get__(v, x, C); if a nondescriptor value v is found, the overall result of the attribute lookup is v.

When these lookup steps do not find an attribute, Python raises an AttributeError exception. However, for lookups of x.name, if C defines or inherits special method __getattr__, Python calls C.__getattr__(x,'name') rather than raising the exception (it's then up to __getattr__ to either return a suitable value or raise the appropriate exception, normally AttributeError).
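The fallback role of __getattr__ can be sketched as follows. Settings, the opt_ prefix convention, and the attribute names are all hypothetical; the point is only that __getattr__ runs after the normal lookup steps have failed, never before.

```python
class Settings(object):
    color = "red"
    def __getattr__(self, name):
        # Reached only when the normal lookup steps find nothing.
        if name.startswith("opt_"):
            return None
        raise AttributeError(name)

s = Settings()
s.size = 3
print(s.size)           # prints: 3     (found in s.__dict__)
print(s.color)          # prints: red   (found in the class)
print(s.opt_debug)      # prints: None  (supplied by __getattr__)
try:
    s.missing
    outcome = "found"
except AttributeError:
    outcome = "AttributeError"
print(outcome)          # prints: AttributeError
```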

Consider the following attribute references:

print x.e, x.d, x.c, x.b, x.a            # prints: 88, 77, 89, 67, 23

x.e and x.d succeed in step 2 of the instance lookup process, since no descriptors are involved, and 'e' and 'd' are both keys in x.__dict__. Therefore, the lookups go no further, but rather return 88 and 77. The other three references must proceed to step 3 of the instance process and look in x.__class__ (i.e., C). x.c and x.b succeed in step 1 of the class lookup process, since 'c' and 'b' are both keys in C.__dict__. Therefore, the lookups go no further but rather return 89 and 67. x.a gets all the way to step 2 of the class process, looking in C.__bases__[0] (i.e., B). 'a' is a key in B.__dict__; therefore, x.a finally succeeds and returns 23.

5.1.6.3. Setting an attribute

Note that the attribute lookup steps happen in this way only when you refer to an attribute, not when you bind an attribute. When you bind (on either a class or an instance) an attribute whose name is not special (unless a __setattr__ method, or the __set__ method of an overriding descriptor, intercepts the binding of an instance attribute), you affect only the __dict__ entry for the attribute (in the class or instance, respectively). In other words, in the case of attribute binding, there is no lookup procedure involved, except for the check for overriding descriptors.
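The asymmetry between referencing and binding can be sketched with trimmed-down versions of classes B and C (an assumption of this example: only the attributes needed here are kept). Binding on the instance or on C never touches the inherited entry in B:

```python
class B(object):
    a = 23

class C(B):
    b = 67

x = C()
x.a = 1                       # binds x.__dict__['a']; B.a is not looked up
print(x.__dict__)             # prints: {'a': 1}
print(B.a)                    # prints: 23
del x.a                       # unbinds only the instance entry
print(x.a)                    # prints: 23  (found again by lookup in B)
C.a = 5                       # binds an entry in C.__dict__; B is untouched
print("a" in C.__dict__)      # prints: True
print(B.a)                    # prints: 23
print(x.a)                    # prints: 5   (C now shadows B)
```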

5.1.7. Bound and Unbound Methods

Method __get__ of a function object returns an unbound method object or a bound method object that wraps the function. The key difference between unbound and bound methods is that an unbound method is not associated with a particular instance while a bound method is.

In the code in the previous section, attributes f, g, and h are functions; therefore, an attribute reference to any one of them returns a method object that wraps the respective function. Consider the following:

print x.h, x.g, x.f, C.h, C.g, C.f

This statement outputs three bound methods represented by strings like:

<bound method C.h of <__main__.C object at 0x8156d5c>>

and then three unbound ones represented by strings like:

<unbound method C.h>

We get bound methods when the attribute reference is on instance x, and unbound methods when the attribute reference is on class C.

Because a bound method is already associated with a specific instance, you call the method as follows:

x.h( )                      # prints: method h in class C

The key thing to notice here is that you don't pass the method's first argument, self, by the usual argument-passing syntax. Rather, a bound method of instance x implicitly binds the self parameter to object x. Thus, the body of the method can access the instance's attributes as attributes of self, even though we don't pass an explicit argument to the method.

An unbound method, however, is not associated with a specific instance, so you must specify an appropriate instance as the first argument when you invoke an unbound method. For example:

C.h(x)                     # prints: method h in class C

You call unbound methods far less frequently than bound methods. The main use for unbound methods is for accessing overridden methods, as discussed in "Inheritance" on page 94; moreover, even for that task, it's generally better to use the super built-in covered in "Cooperative superclass method calling" on page 97.

5.1.7.1. Unbound method details

As we've just discussed, when an attribute reference on a class refers to a function, a reference to that attribute returns an unbound method that wraps the function. An unbound method has three attributes in addition to those of the function object it wraps: im_class is the class object supplying the method, im_func is the wrapped function, and im_self is always None. These attributes are all read-only, meaning that trying to rebind or unbind any of them raises an exception.

You can call an unbound method just as you would call its im_func function, but the first argument in any call must be an instance of im_class or a descendant. In other words, a call to an unbound method must have at least one argument, which corresponds to the wrapped function's first formal parameter (conventionally named self).

5.1.7.2. Bound method details

When an attribute reference on an instance, in the course of the lookup, finds a function object that's an attribute in the instance's class, the lookup calls the function's __get__ method to obtain the attribute's value. The call, in this case, creates and returns a bound method that wraps the function.

Note that when the attribute reference's lookup finds a function object in x.__dict__, the attribute reference operation does not create a bound method because in such cases the function is not treated as a descriptor, and the function's __get__ method does not get called; rather, the function object itself is the attribute's value. Similarly, no bound method is created for callables that are not ordinary functions, such as built-in (as opposed to Python-coded) functions, since they are not descriptors.
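A small sketch of the first point: a function bound directly on an instance is returned as-is, and the instance is not passed to it implicitly. The function standalone and the attribute name k are hypothetical.

```python
class C(object):
    def h(self):
        return "h got self"

def standalone(*args):
    return "standalone got %d argument(s)" % len(args)

x = C()
x.k = standalone                  # stored in x.__dict__, not in the class

print(x.k is standalone)          # prints: True  (no wrapping took place)
print(x.k())                      # prints: standalone got 0 argument(s)
print(x.h())                      # prints: h got self
print(x.h is C.__dict__["h"])     # prints: False (a method object wraps h)
```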

A bound method is similar to an unbound method in that it has three read-only attributes in addition to those of the function object it wraps. Like in an unbound method, im_class is the class object that supplies the method, and im_func is the wrapped function. However, in a bound method object, attribute im_self refers to x, the instance from which the method was obtained.

A bound method is used like its im_func function, but calls to a bound method do not explicitly supply an argument corresponding to the first formal parameter (conventionally named self). When you call a bound method, the bound method passes im_self as the first argument to im_func before other arguments (if any) given at the point of call.
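The calling behavior just described can be emulated in pure Python. The class BoundMethodSketch below is a hypothetical stand-in written only to illustrate the idea; it is not how Python implements bound methods, and it merely borrows the attribute names im_func, im_self, and im_class from the description above.

```python
class BoundMethodSketch(object):
    """Hypothetical stand-in for a bound method object (illustration only)."""
    def __init__(self, func, instance, cls):
        self.im_func = func
        self.im_self = instance
        self.im_class = cls
    def __call__(self, *args, **kwds):
        # Insert im_self before the arguments given at the point of call.
        return self.im_func(self.im_self, *args, **kwds)

class C(object):
    def h(self, greeting):
        return "%s from %s" % (greeting, type(self).__name__)

x = C()
bound = BoundMethodSketch(C.__dict__["h"], x, C)
print(bound("hello"))                    # prints: hello from C
print(bound("hello") == x.h("hello"))    # prints: True
```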

Let's follow in excruciating low-level detail the conceptual steps involved in a method call with the normal syntax x.name(arg). In the following context:

def f(a, b):...            # a function f with two arguments

class C(object):
    name = f
x = C( )

x is an instance object of class C, name is an identifier that names a method of x's (an attribute of C whose value is a function, in this case function f), and arg is any expression. Python first checks if 'name' is the attribute name in C of an overriding descriptor, but it isn't: functions are descriptors, because their class defines method __get__, but not overriding ones, because their class does not define method __set__. Python next checks if 'name' is a key in x.__dict__, but it isn't. So Python finds name in C (everything would work in just the same way if name was found, by inheritance, in one of C's __bases__). Python notices that the attribute's value, function object f, is a descriptor. Therefore, Python calls f.__get__(x, C), which creates a bound method object with im_func set to f, im_class set to C, and im_self set to x. Then Python calls this bound method object, with arg as the only actual argument. The bound method inserts im_self (i.e., x) as the first actual argument, and arg becomes the second one, in a call to the bound method's im_func (i.e., function f). The overall effect is just like calling:

x.__class__.__dict__['name'](x, arg)
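You can check this equivalence directly. In the sketch below, f simply returns its two arguments as a tuple (an assumption made so that the results can be compared), and the three call forms produce the same result:

```python
def f(a, b):
    return (a, b)              # return both arguments so they can be inspected

class C(object):
    name = f

x = C()

via_attribute = x.name("arg")                           # normal method call
via_get = f.__get__(x, C)("arg")                        # explicit descriptor step
via_dict = x.__class__.__dict__["name"](x, "arg")       # plain function call

print(via_attribute == via_get == via_dict)             # prints: True
print(via_attribute[0] is x)                            # prints: True
```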

When a bound method's function body executes, it has no special namespace relationship to either its self object or any class. Variables referenced are local or global, just as for any other function, as covered in "Namespaces" on page 76. Variables do not implicitly indicate attributes in self, nor do they indicate attributes in any class object. When the method needs to refer to, bind, or unbind an attribute of its self object, it does so by standard attribute-reference syntax (e.g., self.name). The lack of implicit scoping may take some getting used to (since Python differs in this respect from many other object-oriented languages), but it results in clarity, simplicity, and the removal of potential ambiguities.

Bound method objects are first-class objects, and you can use them wherever you can use a callable object. Since a bound method holds references to the function it wraps, and to the self object on which it executes, it's a powerful and flexible alternative to a closure (covered in "Nested functions and nested scopes" on page 77). An instance object whose class supplies special method __call__ (covered in __call__ on page 105) offers another viable alternative. Each of these constructs lets you bundle some behavior (code) and some state (data) into a single callable object. Closures are simplest, but limited in their applicability. Here's the closure from "Nested functions and nested scopes" on page 77:

def make_adder_as_closure(augend):
    def add(addend): return addend+augend
    return add

Bound methods and callable instances are richer and more flexible than closures. Here's how to implement the same functionality with a bound method:

def make_adder_as_bound_method(augend):
    class Adder(object):
        def __init__(self, augend): self.augend = augend
        def add(self, addend): return addend+self.augend
    return Adder(augend).add

Here's how to implement it with a callable instance (an instance whose class supplies special method __call__):

def make_adder_as_callable_instance(augend):
    class Adder(object):
        def __init__(self, augend): self.augend = augend
        def __call__(self, addend): return addend+self.augend
    return Adder(augend)

From the viewpoint of the code that calls the functions, all of these factory functions are interchangeable, since all of them return callable objects that are polymorphic (i.e., usable in the same ways). In terms of implementation, the closure is simplest; the bound method and the callable instance use more flexible, general, and powerful mechanisms, but there is really no need for that extra power in this simple example.
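To see the interchangeability from the caller's side, here is a condensed sketch of the three factories. As a simplifying assumption, the bound-method and callable-instance variants share one module-level Adder class instead of defining a class inside each factory:

```python
def make_adder_as_closure(augend):
    def add(addend):
        return addend + augend
    return add

class Adder(object):
    def __init__(self, augend):
        self.augend = augend
    def add(self, addend):
        return addend + self.augend
    __call__ = add              # instances are callable as well

def make_adder_as_bound_method(augend):
    return Adder(augend).add

def make_adder_as_callable_instance(augend):
    return Adder(augend)

factories = [make_adder_as_closure,
             make_adder_as_bound_method,
             make_adder_as_callable_instance]

# The calling code neither knows nor cares which kind of callable it gets.
results = [factory(7)(100) for factory in factories]
```

Each entry of results is 107, whichever mechanism produced the callable.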

5.1.8. Inheritance

When you use an attribute reference C.name on a class object C, and 'name' is not a key in C.__dict__, the lookup implicitly proceeds on each class object that is in C.__bases__ in a specific order (which for historical reasons is known as the method resolution order, or MRO, even though it's used for all attributes, not just methods). C's base classes may in turn have their own bases. The lookup checks direct and indirect ancestors, one by one, in MRO, stopping when 'name' is found.

5.1.8.1. Method resolution order

The lookup of an attribute name in a class essentially occurs by visiting ancestor classes in left-to-right, depth-first order. However, in the presence of multiple inheritance (which makes the inheritance graph a general Directed Acyclic Graph rather than specifically a tree), this simple approach might lead to some ancestor class being visited twice. In such cases, the resolution order is clarified by leaving in the lookup sequence only the rightmost occurrence of any given class. This last, crucial simplification is not part of the specifications for the legacy object model, making multiple inheritance hard to use correctly and effectively within that object model. The new-style object model is vastly superior in this regard.

The problem with purely left-right, depth-first search, in situations of multiple inheritance, can be easily demonstrated with an example based on old-style classes:

class Base1:
    def amethod(self): print "Base1"
class Base2(Base1): pass
class Base3(Base1):
    def amethod(self): print "Base3"
class Derived(Base2, Base3): pass
aninstance = Derived( )
aninstance.amethod( )                    # prints: Base1

In this case, the lookup for amethod starts in Derived. When it isn't found there, lookup proceeds to Base2. Since the attribute isn't found in Base2, the legacy-style lookup then proceeds to Base2's ancestor, Base1, where the attribute is found. Therefore, the legacy-style lookup stops at this point and never considers Base3, where it would also find an attribute with the same name. The new-style MRO solves this problem by removing the leftmost occurrence of Base1 from the search so that the occurrence of amethod in Base3 is found instead.

Figure 5-1 shows the legacy and new-style MROs for the case of this kind of "diamond-shaped" inheritance graph.

Figure 5-1. Legacy and new-style MRO


Each new-style class and built-in type has a special read-only class attribute called __mro__, which is the tuple of types used for method resolution, in order. You can reference __mro__ only on classes, not on instances, and, since __mro__ is a read-only attribute, you cannot rebind or unbind it. For a detailed and highly technical explanation of all aspects of Python's MRO, you may want to study a paper by Michele Simionato, "The Python 2.3 Method Resolution Order," at http://www.python.org/2.3/mro.html.
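The new-style order can be inspected through __mro__. The sketch below assumes a new-style, diamond-shaped variant of the Base1/Base2/Base3/Derived hierarchy, in which both Base2 and Base3 inherit from Base1 and the methods return strings rather than printing them:

```python
class Base1(object):
    def amethod(self):
        return 'Base1'

class Base2(Base1):
    pass

class Base3(Base1):
    def amethod(self):
        return 'Base3'

class Derived(Base2, Base3):
    pass

# Base1 appears only once, after both of its subclasses.
mro_names = [cls.__name__ for cls in Derived.__mro__]
result = Derived().amethod()

# __mro__ is available on classes only, not on their instances.
try:
    Derived().__mro__
    on_instance = 'found'
except AttributeError:
    on_instance = 'AttributeError'
```

Here mro_names is ['Derived', 'Base2', 'Base3', 'Base1', 'object'], so the lookup reaches Base3 before Base1 and result is 'Base3'.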

5.1.8.2. Overriding attributes

As we've just seen, the search for an attribute proceeds along the MRO (typically up the inheritance tree) and stops as soon as the attribute is found. Descendant classes are always examined before their ancestors so that, when a subclass defines an attribute with the same name as one in a superclass, the search finds the definition in the subclass and stops there. This is known as the subclass overriding the definition in the superclass. Consider the following:

class B(object):
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"

In this code, class C overrides attributes b and g of its superclass B. Note that, unlike in some other languages, in Python you may override data attributes just as easily as callable attributes (methods).

5.1.8.3. Delegating to superclass methods

When a subclass C overrides a method f of its superclass B, the body of C.f often wants to delegate some part of its operation to the superclass's implementation of the method. This can sometimes be done using an unbound method, as follows:

class Base(object):
    def greet(self, name): print "Welcome ", name
class Sub(Base):
    def greet(self, name):
        print "Well Met and",
        Base.greet(self, name)
x = Sub( )
x.greet('Alex')

The delegation to the superclass, in the body of Sub.greet, uses an unbound method obtained by attribute reference Base.greet on the superclass, and therefore passes all arguments normally, including self. Delegating to a superclass implementation is the most frequent use of unbound methods.

One common use of delegation occurs with special method __init__. When Python creates an instance, the __init__ methods of base classes are not automatically invoked, as they are in some other object-oriented languages. Thus, it is up to a subclass to perform the proper initialization by using delegation if necessary. For example:

class Base(object):
    def __init__(self):
        self.anattribute = 23
class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.anotherattribute = 45

If the __init__ method of class Derived didn't explicitly call that of class Base, instances of Derived would miss that portion of their initialization, and thus such instances would lack attribute anattribute.

5.1.8.4. Cooperative superclass method calling

Calling the superclass's version of a method with unbound method syntax, however, is quite problematic in cases of multiple inheritance with diamond-shaped graphs. Consider the following definitions:

class A(object):
    def met(self):
        print 'A.met'
class B(A):
    def met(self):
        print 'B.met'
        A.met(self)
class C(A):
    def met(self):
        print 'C.met'
        A.met(self)
class D(B,C):
    def met(self):
        print 'D.met'
        B.met(self)
        C.met(self)

In this code, when we call D( ).met( ), A.met ends up being called twice. How can we ensure that each ancestor's implementation of the method is called once, and only once? The solution is to use built-in type super. Calling super(aclass, obj) returns a special superobject of object obj. When we look up an attribute (e.g., a method) in this superobject, the lookup begins after class aclass in obj's MRO. We can therefore rewrite the previous code as:

class A(object):
    def met(self):
        print 'A.met'
class B(A):
    def met(self):
        print 'B.met'
        super(B,self).met( )
class C(A):
    def met(self):
        print 'C.met'
        super(C,self).met( )
class D(B,C):
    def met(self):
        print 'D.met'
        super(D,self).met( )

Now, D( ).met( ) results in exactly one call to each class's version of met. If you get into the habit of always coding superclass calls with super, your classes will fit smoothly even in complicated inheritance structures. There are no ill effects whatsoever if the inheritance structure instead turns out to be simple, as long, of course, as you're only using the new-style object model, as I recommend.
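The order of the cooperative calls can be recorded rather than printed. This sketch repeats the four classes with one assumed change: each met appends its name to a module-level list, calls, so that the sequence can be examined afterward:

```python
calls = []

class A(object):
    def met(self):
        calls.append('A.met')

class B(A):
    def met(self):
        calls.append('B.met')
        super(B, self).met()    # next class after B in self's MRO

class C(A):
    def met(self):
        calls.append('C.met')
        super(C, self).met()

class D(B, C):
    def met(self):
        calls.append('D.met')
        super(D, self).met()

D().met()
d_calls = list(calls)

del calls[:]
B().met()
b_calls = list(calls)
```

For an instance of D, super(B, self) delegates to C.met, because C follows B in D's MRO; for a plain instance of B, the very same line delegates to A.met. That is why d_calls is ['D.met', 'B.met', 'C.met', 'A.met'] while b_calls is ['B.met', 'A.met'].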

The only situation in which you may prefer to use the rougher approach of calling a superclass method through the unbound-method technique is when the various classes have different and incompatible signatures for the same method. This is an unpleasant situation in many respects, but, if you do have to deal with it, the unbound-method technique may sometimes be the least of evils. Proper use of multiple inheritance will be seriously hampered; but then, even the most fundamental properties of OOP, such as polymorphism between base and subclass instances, are seriously impaired when corresponding methods have different and incompatible signatures.

5.1.8.5. "Deleting" class attributes

Inheritance and overriding provide a simple and effective way to add or modify class attributes (particularly methods) noninvasively (i.e., without modifying the class in which the attributes are defined) by adding or overriding the attributes in subclasses. However, inheritance does not offer a way to delete (hide) base classes' attributes noninvasively. If the subclass simply fails to define (override) an attribute, Python finds the base class's definition. If you need to perform such deletion, possibilities include:

  • Override the method and raise an exception in the method's body.

  • Eschew inheritance, hold the attributes elsewhere than in the subclass's __dict__, and define __getattr__ for selective delegation.

  • Use the new-style object model and override __getattribute__ to similar effect.

The last of these techniques is demonstrated in "__getattribute__" on page 102.

5.1.9. The Built-in object Type

The built-in object type is the ancestor of all built-in types and new-style classes. The object type defines some special methods (documented in "Special Methods" on page 104) that implement the default semantics of objects:


__new__, __init__

You can create a direct instance of object by calling object( ) without any arguments. The call implicitly uses object.__new__ and object.__init__ to make and return an instance object without attributes (and without even a __dict__ in which to hold attributes). Such an instance object may be useful as a "sentinel," guaranteed to compare unequal to any other distinct object.


__delattr__, __getattribute__, __setattr__

By default, an object handles attribute references (as covered in "Attribute Reference Basics" on page 89) using these methods of object.


__hash__, __repr__, __str__

Any object can be passed to functions hash and repr and to type str.

A subclass of object may override any of these methods and/or add others.

5.1.10. Class-Level Methods

Python supplies two built-in nonoverriding descriptor types, which give a class two distinct kinds of "class-level methods."

5.1.10.1. Static methods

A static method is a method that you can call on a class, or on any instance of the class, without the special behavior and constraints of ordinary methods, bound and unbound, with regard to the first parameter. A static method may have any signature; it may have no parameters, and the first parameter, if any, plays no special role. You can think of a static method as an ordinary function that you're able to call normally, despite the fact that it happens to be bound to a class attribute. While it is never necessary to define static methods (you can always define a normal function instead), some programmers consider them to be an elegant alternative when a function's purpose is tightly bound to some specific class.

To build a static method, call built-in type staticmethod and bind its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may also choose to perform it elsewhere. The only argument to staticmethod is the function to invoke when Python calls the static method. The following example shows how to define and call a static method:

class AClass(object):
    def astatic( ): print 'a static method'
    astatic = staticmethod(astatic)
anInstance = AClass( )
AClass.astatic( )                    # prints: a static method
anInstance.astatic( )                # prints: a static method

This example uses the same name for the function passed to staticmethod and for the attribute bound to staticmethod's result. This style is not mandatory, but it's a good idea, and I recommend you always use it. Python 2.4 offers a special, simplified syntax to support this style, covered in "Decorators" on page 115.
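For comparison, this is roughly what the same-name style looks like with the decorator syntax available from Python 2.4 on. The class Temperature and its method c_to_f are hypothetical names chosen for the sketch:

```python
class Temperature(object):          # hypothetical example class
    @staticmethod
    def c_to_f(celsius):
        # No self or cls parameter: the first argument plays no special role.
        return celsius * 9.0 / 5.0 + 32

via_class = Temperature.c_to_f(100)         # call on the class
via_instance = Temperature().c_to_f(100)    # call on an instance
```

The decorator line is equivalent to writing c_to_f = staticmethod(c_to_f) right after the def statement.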

5.1.10.2. Class methods

A class method is a method you can call on a class or on any instance of the class. Python binds the method's first parameter to the class on which you call the method, or the class of the instance on which you call the method; it does not bind it to the instance, as for normal bound methods. There is no equivalent of unbound methods for class methods. The first parameter of a class method is conventionally named cls. While it is never necessary to define class methods (you could always alternatively define a normal function that takes the class object as its first parameter), some programmers consider them to be an elegant alternative to such functions.

To build a class method, call built-in type classmethod and bind its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may also choose to perform it elsewhere. The only argument to classmethod is the function to invoke when Python calls the class method. Here's how you can define and call a class method:

class ABase(object):
    def aclassmet(cls): print 'a class method for', cls.__name__
    aclassmet = classmethod(aclassmet)
class ADeriv(ABase): pass
bInstance = ABase( )
dInstance = ADeriv( )
ABase.aclassmet( )               # prints: a class method for ABase
bInstance.aclassmet( )           # prints: a class method for ABase
ADeriv.aclassmet( )              # prints: a class method for ADeriv
dInstance.aclassmet( )           # prints: a class method for ADeriv

This example uses the same name for the function passed to classmethod and for the attribute bound to classmethod's result. This style is not mandatory, but it's a good idea, and I recommend that you always use it. Python 2.4 offers a special, simplified syntax to support this style, covered in "Decorators" on page 115.
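One frequent use of class methods is as alternative constructors, because cls is whatever class the call was made on. The following sketch uses the decorator syntax; the classes Point and LabeledPoint and the method from_pair are hypothetical:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_pair(cls, pair):
        x, y = pair
        return cls(x, y)     # cls is the class the method was called on

class LabeledPoint(Point):
    label = 'labeled'

p = Point.from_pair((1, 2))
q = LabeledPoint.from_pair((3, 4))
r = q.from_pair((5, 6))      # called on an instance: cls is its class
```

A plain function taking the class as an explicit argument would work too; the class method simply lets subclasses inherit the constructor and get instances of their own type.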

5.1.11. Properties

Python supplies a built-in overriding descriptor type, which you may use to give a class's instances properties.

A property is an instance attribute with special functionality. You reference, bind, or unbind the attribute with the normal syntax (e.g., print x.prop, x.prop=23, del x.prop). However, rather than following the usual semantics for attribute reference, binding, and unbinding, these accesses call on instance x the methods that you specify as arguments to the built-in type property. Here's how you define a read-only property:

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    def getArea(self):
        return self.width * self.height
    area = property(getArea, doc='area of the rectangle')

Each instance r of class Rectangle has a synthetic read-only attribute r.area, computed on the fly in method r.getArea( ) by multiplying the sides of the rectangle. The docstring Rectangle.area.__doc__ is 'area of the rectangle'. Attribute r.area is read-only (attempts to rebind or unbind it fail) because we specify only a get method in the call to property, no set or del methods.

Properties perform tasks similar to those of special methods __getattr__, __setattr__, and __delattr__ (covered in "General-Purpose Special Methods" on page 104), but in a faster and simpler way. You build a property by calling built-in type property and binding its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may also choose to perform it elsewhere. Within the body of a class C, use the following syntax:

attrib = property(fget=None, fset=None, fdel=None, doc=None)

When x is an instance of C and you reference x.attrib, Python calls on x the method you passed as argument fget to the property constructor, without arguments. When you assign x.attrib = value, Python calls the method you passed as argument fset, with value as the only argument. When you execute del x.attrib, Python calls the method you passed as argument fdel, without arguments. Python uses the argument you passed as doc as the docstring of the attribute. All parameters to property are optional. When an argument is missing, the corresponding operation is forbidden (Python raises an exception when some code attempts that operation). For example, in the Rectangle example, we made property area read-only, because we passed an argument only for parameter fget, and not for parameters fset and fdel.
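A sketch of a property that supplies all four arguments, next to a read-only one, may make the rules concrete. The class Thermostat and its helper method names are hypothetical:

```python
class Thermostat(object):          # hypothetical example class
    def __init__(self, degrees):
        self._degrees = degrees

    def _get_degrees(self):
        return self._degrees

    def _set_degrees(self, value):
        if value < -273.15:
            raise ValueError('below absolute zero')
        self._degrees = value

    def _del_degrees(self):
        del self._degrees

    # fget, fset, fdel, doc: referencing, binding, unbinding all allowed
    degrees = property(_get_degrees, _set_degrees, _del_degrees,
                       'temperature in degrees Celsius')
    # only fget: rebinding and unbinding are forbidden
    fahrenheit = property(lambda self: self._degrees * 9.0 / 5.0 + 32)

t = Thermostat(20)
t.degrees = 100                    # calls _set_degrees(t, 100)
boiling = t.fahrenheit             # calls the fget of fahrenheit

try:
    t.fahrenheit = 0               # no fset was supplied
    outcome = 'rebound'
except AttributeError:
    outcome = 'AttributeError'
```

The setter is also a natural place for validation: here an assignment such as t.degrees = -300 would raise ValueError instead of storing the value.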

5.1.11.1. Why properties are important

The crucial importance of properties is that their existence makes it perfectly safe and indeed advisable for you to expose public data attributes as part of your class's public interface. If it ever becomes necessary, in future versions of your class or other classes that need to be polymorphic to it, to have some code executed when the attribute is referenced, rebound, or unbound, you know you will be able to change the plain attribute into a property and get the desired effect without any impact on any other code that uses your class (a.k.a. "client code"). This lets you avoid goofy idioms, such as accessor and mutator methods, required by OO languages that lack properties or equivalent machinery. For example, client code can simply use natural idioms such as:

someInstance.widgetCounter += 1

rather than being forced into contorted nests of accessors and mutators such as:

someInstance.setWidgetCounter(someInstance.getWidgetCounter( ) + 1)

If at any time you're tempted to code methods whose natural names are something like getThis or setThat, consider wrapping those methods into properties, for clarity.
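The following sketch shows the kind of evolution this makes possible. WidgetV1 and WidgetV2 stand for two hypothetical versions of the same class: the first exposes widgetCounter as a plain attribute, the second turns it into a property that also counts rebindings, and the client code is identical for both:

```python
class WidgetV1(object):
    def __init__(self):
        self.widgetCounter = 0          # plain data attribute

class WidgetV2(object):
    def __init__(self):
        self._count = 0
        self.changes = 0                # how many times the counter was rebound

    def _get_counter(self):
        return self._count

    def _set_counter(self, value):
        self.changes += 1               # extra code run on every rebinding
        self._count = value

    widgetCounter = property(_get_counter, _set_counter)

def client(someInstance):
    # Client code is written once, against the public attribute.
    someInstance.widgetCounter += 1
    return someInstance.widgetCounter

v1, v2 = WidgetV1(), WidgetV2()
r1, r2 = client(v1), client(v2)
```

Both calls return 1; only the second version also records, in v2.changes, that a rebinding took place.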

5.1.11.2. Properties and inheritance

Properties are inherited normally, just like any other attribute. However, there's a little trap for the unwary: the methods called upon to access a property are those that are defined in the class in which the property itself is defined, without intrinsic use of further overriding that may happen in subclasses. For example:

class B(object):
  def f(self): return 23
  g = property(f)
class C(B):
  def f(self): return 42
c = C( )
print c.g                 # prints 23, not 42

The access to property c.g calls B.f, not C.f as you might intuitively expect. The reason is quite simple: the property is created by passing the function object f (and is created at the time when the class statement for B executes, so the function object in question is the one also known as B.f). The fact that the name f is later redefined in subclass C is therefore quite irrelevant, since the property performs no lookup for that name, but rather uses the function object it was passed at creation time. If you need to work around this issue, you can always do it with one extra level of indirection:

class B(object):
  def f(self): return 23
  def _f_getter(self): return self.f( )
  g = property(_f_getter)
class C(B):
  def f(self): return 42
c = C( )
print c.g                 # prints 42, as expected

Here, the function object held by the property is B._f_getter, which in turn does perform a lookup for name f (since it calls self.f( )); therefore, the overriding of f has the expected effect.

5.1.12. __slots__

Normally, each instance object x of any class C has a dictionary x.__dict__ that Python uses to let you bind arbitrary attributes on x. To save a little memory (at the cost of letting x have only a predefined set of attribute names), you can define in a new-style class C a class attribute named __slots__, a sequence (normally a tuple) of strings (normally identifiers). When a new-style class C has an attribute __slots__ (and no base class of C supplies a per-instance __dict__), a direct instance x of class C has no x.__dict__, and any attempt to bind on x any attribute whose name is not in C.__slots__ raises an exception.

Using __slots__ lets you reduce memory consumption for small instance objects that can do without the powerful and convenient ability to have arbitrarily named attributes. __slots__ is worth adding only to classes that can have so many instances that saving a few tens of bytes per instance is important: typically classes that can have millions, not mere thousands, of instances alive at the same time. Unlike most other class attributes, __slots__ works as I've just described only if some statement in the class body binds it as a class attribute. Any later alteration, rebinding, or unbinding of __slots__ has no effect, nor does inheriting __slots__ from a base class. Here's how to add __slots__ to the Rectangle class defined earlier to get smaller (though less flexible) instances:

class OptimizedRectangle(Rectangle):
    __slots__ = 'width', 'height'

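We do not need to define a slot for the area property. __slots__ does not constrain properties, only ordinary instance attributes, which are the attributes that would reside in the instance's __dict__ if __slots__ wasn't defined. Note, however, that instances become smaller only when every class in the inheritance chain defines __slots__: a base class written without __slots__, as Rectangle was above, still gives each instance a __dict__.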
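The effect of __slots__ is easy to observe. The sketch below assumes a hypothetical standalone class, SlotRectangle, that inherits directly from object, plus two tiny helper classes that show what happens when a base class has no __slots__:

```python
class SlotRectangle(object):        # hypothetical standalone class
    __slots__ = 'width', 'height'

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def getArea(self):
        return self.width * self.height
    area = property(getArea)        # a property needs no slot

r = SlotRectangle(3, 4)
has_dict = hasattr(r, '__dict__')   # no per-instance dictionary

try:
    r.color = 'red'                 # 'color' is not a slot name
    outcome = 'bound'
except AttributeError:
    outcome = 'AttributeError'

class Plain(object):                # no __slots__: instances get a __dict__
    pass

class Slotted(Plain):
    __slots__ = ('width',)

sub_has_dict = hasattr(Slotted(), '__dict__')   # inherited from Plain
```

Here has_dict is False and the attempt to bind r.color fails, while sub_has_dict is True: the subclass of a slotless base keeps the per-instance dictionary.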

5.1.13. __getattribute__

All references to instance attributes for new-style instances proceed through special method __getattribute__. This method is supplied by base class object, where it implements all the details of object attribute reference semantics documented in "Attribute Reference Basics" on page 89. However, you may override __getattribute__ for special purposes, such as hiding inherited class attributes (e.g., methods) for your subclass's instances. The following example shows one way to implement a list without append in the new-style object model:

class listNoAppend(list):
    def __getattribute__(self, name):
        if name == 'append': raise AttributeError(name)
        return list.__getattribute__(self, name)

An instance x of class listNoAppend is almost indistinguishable from a built-in list object, except that performance is substantially worse, and any reference to x.append raises an exception.
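A brief usage sketch, repeating the class so that the block is self-contained, shows both what the override hides and what it does not:

```python
class listNoAppend(list):
    def __getattribute__(self, name):
        if name == 'append':
            raise AttributeError(name)
        return list.__getattribute__(self, name)

x = listNoAppend([1, 2, 3])
x.extend([4])                        # other list methods still work

has_append = hasattr(x, 'append')    # the reference x.append fails

# The override intercepts attribute references on the instance only;
# going through the class bypasses it.
list.append(x, 5)
```

After these statements has_append is False and x holds [1, 2, 3, 4, 5], so the technique hides the method from ordinary client code rather than truly removing the capability.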

5.1.14. Per-Instance Methods

Both the legacy and new-style object models allow an instance to have instance-specific bindings for all attributes, including callable attributes (methods). For a method, just like for any other attribute (except those bound to overriding descriptors in new-style classes), an instance-specific binding hides a class-level binding: attribute lookup does not consider the class when it finds a binding directly in the instance. In both object models, an instance-specific binding for a callable attribute does not perform any of the transformations detailed in "Bound and Unbound Methods" on page 91. In other words, the attribute reference returns exactly the same callable object that was earlier bound directly to the instance attribute.
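A minimal sketch of an instance-specific binding for an ordinary (nonspecial) method; the names Greeter and shout are hypothetical:

```python
class Greeter(object):
    def greet(self):
        return 'hello from the class'

def shout():                      # note: no self parameter
    return 'HELLO FROM THE INSTANCE'

g1 = Greeter()
g2 = Greeter()
g1.greet = shout                  # instance-specific binding

per_instance = g1.greet()         # calls shout with no arguments at all
same_object = g1.greet is shout   # no bound method is created
per_class = g2.greet()            # other instances still use the class
```

Because no transformation takes place, shout receives no implicit first argument; a function bound this way that needs the instance must obtain it some other way.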

Legacy and new-style object models do differ on the effects of per-instance bindings of the special methods that Python invokes implicitly as a result of various operations, as covered in "Special Methods" on page 104. In the classic object model, an instance may usefully override a special method, and Python uses the per-instance binding even when invoking the method implicitly. In the new-style object model, implicit use of special methods always relies on the class-level binding of the special method, if any. The following code shows this difference between the legacy and new-style object models:

def fakeGetItem(idx): return idx
class Classic: pass
c = Classic( )
c.__getitem__ = fakeGetItem
print c[23]                       # prints: 23
class NewStyle(object): pass
n = NewStyle( )
n.__getitem__ = fakeGetItem
print n[23]                       # results in:
# Traceback (most recent call last):
#   File "<stdin>", line 1, in ?
# TypeError: unindexable object

The semantics of the classic object model in this regard are sometimes handy for tricky and somewhat obscure purposes. However, the new-style object model's approach is more general, and it regularizes and simplifies the relationship between classes and metaclasses, covered in "Metaclasses" on page 116.
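The new-style half of this behavior can be verified without triggering a traceback. This sketch catches the exception and then shows that a class-level binding of the same special method does take effect:

```python
def fakeGetItem(idx):
    return idx

class NewStyle(object):
    pass

n = NewStyle()
n.__getitem__ = fakeGetItem

# An explicit attribute reference finds the per-instance binding...
explicit = n.__getitem__(23)

# ...but implicit use through indexing looks only at the class.
try:
    n[23]
    implicit = 'indexed'
except TypeError:
    implicit = 'TypeError'

# Binding the special method on the class makes indexing work.
NewStyle.__getitem__ = lambda self, idx: idx
after_class_binding = n[23]
```

The per-instance binding is still there afterward; indexing simply never consults it.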

5.1.15. Inheritance from Built-in Types

A new-style class can inherit from a built-in type. However, a class may directly or indirectly subclass multiple built-in types only if those types are specifically designed to allow this level of mutual compatibility. Python does not support unconstrained inheritance from multiple arbitrary built-in types. Normally, a new-style class only subclasses at most one substantial built-in type; this means at most one built-in type in addition to object, which is the superclass of all built-in types and new-style classes and imposes no constraints on multiple inheritance. For example:

class noway(dict, list): pass

raises a TypeError exception, with a detailed explanation of "Error when calling the metaclass bases: multiple bases have instance lay-out conflict." If you ever see such error messages, it means that you're trying to inherit, directly or indirectly, from multiple built-in types that are not specifically designed to cooperate at such a deep level.
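Since the exception is raised when the class statement executes, it can be caught like any other. The sketch below does so, and contrasts the failure with a class that lists object next to a single substantial built-in type (okay is a hypothetical name):

```python
# Two substantial built-in bases: the class statement itself fails.
try:
    class noway(dict, list):
        pass
    outcome = 'class created'
except TypeError:
    outcome = 'TypeError'

# One substantial built-in base plus object: no conflict.
class okay(dict, object):
    pass

d = okay(a=1)
```

The instance d behaves like an ordinary dictionary, since object adds no constraints of its own.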

