chapter 8: Classes and Objects

8.1. Changing the String Representation of Instances

It's good practice to define both _repr_() and _str_()

the _repr_() method of a class: the literal representation of a object

eval(repr(x)) = x It it not possiable to create an object from the repr(x) results, then the repr(x) result should be enclosed in '<>'

the _str_() method of a class: the toString method of a object

The method will be called when the object is passed to print() function If _str_() is not provided, then _repr_() will be used.

the format function: positional field, by {N}, N means the nth parameter

ValueError: cannot switch from manual field specification to automatic field numbering If you put a numbers to a field, then you should put numbers to all field.

a = '{0}, {1},  {1}'.format(1, 2)
print(a)

Get an object's attribute by {N.atttname} syntax

import itertools
a = '{0}, {0.chain}, {0.permutations}'.format(itertools)
print(a)

for {0!r} or {0!s}, '!r' means use _repr_(), '!s' means use _str_(). '!s' is the default value.

8.2. Customizing String Formatting

the format(aobj[, formatspec]) builtin function

The function is equal to: aobj._format_(formatspec) 而一般的aobj._format_(spec) 的实现是调用 str.format(…) 函数来实现。

str.format(…) method 还支持关键字参数来指定field name.(问题:当关键字参数与普通参数混合时会发生什么?) {:spec} 中的 spec 会传给 aobj._format_(formatspec) 作为参数。 spec 可以为任意字符串,它可以作为参数传递给aobj._format_() method.

str.format(aobj)时, 到底是哪个method会被调用呢? From below codes, it can be see that if format method is defined, then format will be called. else str will be called, when the object is formated by the str.format(…) method. For str(aobj), aobj._str__ will always be called.

class Point:
    def __init__(self, x, y):
	self.x = x
	self.y = y

    def __repr__(self):
	print("__repr__ called")
	return 'Point({0.x}, {0.y})'.format(self)

    def __str__(self):
	print("__str__ called")
	return '({0.x}, {0.y})'.format(self)

    def __format__(self, spec):
	print("__format__ called. spec: %s." % spec)
	return '({0.x}, {0.y})'.format(self)

p = Point(2, 3)
a = '{}'.format(p)
print(a)
print(p)

[NOT FINISHED]8.3. Making Objects Support the Context-Management Protocol, that is, the with statement

To provide with statement support, just define two methods:

  1. _enter_(self)
  2. _exit_(self, excty, excval, tb)
class SaveVar:
    def __init__(self, avar):
	self.avar = avar
    def __enter__(self):
	print("__enter__ called")

8.5. Encapsulating Names in a Class

one underscore _ means private variable, just convention, you can still access that variable outside of a class

class Person:
    def __init__(self, name, age):
	self._name = name
	self._age = age
    def __str__(self):
	return '(name: {}, age: {})'.format(self._name, self._age)

p = Person('Jim', 23)
print(p)
print(p._name, p._age)

two underscore __ means name mangling, when used for inheritance

the variable will be renamed to _C_name. Then it will not override the super class's variable. Because it is also has one leading underscore, so the rules for one underscore also applies.

_age is renamed to _Person_age:

class Person:
    def __init__(self, name, age):
	self._name = name
	self.__age = age
    def __str__(self):
	return '(name: {}, age: {})'.format(self._name, self.__age)

p = Person('Jim', 23)
print(p)
print(dir(p))
print(p._name, p._Person__age)

8.6. Creating Managed Attributes, with @property decorator/annotation, add a setter, getter, deleter to a field

Steps:

  1. first create a property object by @property decorator, on a getter method. The name of the getter should be the same with the attribute field.
  2. create the setter object: by @attributename.setter, on a setter method. The name of the setter should be the same with the attribute field.
  3. the getter, setter function are a way to define what will be called when the attribute with the same name is get, set. e.g. the attribute name is 'foo', then the 'foo' attribute will be a object that has methods: 'getter', 'setter', 'deleter'. You can choose any name to store the real value for this attribute, but the most common value will be add a underscore, that is 'foo'. Type of 'foo' is <class 'property'>
    class Person:
        def __init__(self, name, age):
    	# here the name attribute is depend on the def name(self) getter function. Not the reverse.
    	self.name = name
    	self.age = age
        def __str__(self):
    	return '(name: {}, age: {})'.format(self.name, self.age)
    
        @property
        def name(self):
    	print("getting name")
    	return self.nameL
    
        @name.setter
        def name(self, name):
    	print("setting name")
    	if not isinstance(name, str):
    	    raise TypeError
    
    	self.nameL = name
    
    p = Person('Jim', 23)
    p.name = "Tom"
    print(p)
    print(dir(p))
    print(type(Person.name), dir(Person.name))
    

create caculate attribute by @property, getter, setter, then the attribute works like a attribute, not a method

Seems a good application of @property.

class Circle:
    def __init__(self, radis):
	self.radis = radis

    @property
    def area(self):
	print("getting area")
	return self.radis*self.radis*3.14

p = Circle(4)
print(p.radis)
print(p.area)

8.7. Calling a Method on a Parent Class, by super() function

There are many format

super() # unbound
super(type, obj) # isinstance(obj, type)
super(type, type2) # issubclass(type2, type). issubclass(object, object) is True

8.9. Creating a New Kind of Class or Instance Attribute, by creating a descriptor class for the type

如果一个类定义了三个函数: get, set, delete, 则它是一个descriptor, 可能通过它来为一个instance的attribute添加一些get, set时的函数。

@property 只是descriptor的一种表象, descriptor是最底层,最灵活的实现,在库中大量使用。 TODO: 可以再研究下基于descriptor, @property的实现。

调用顺序:如果descriptor对应的class attribute 存在, 则总会优先调用这个descriptor的函数,来获取或设置attribute的值。 但当descriptor只定义了_get_方法时,则如果同名的变量在instance._dict_中存在,则会优先从instance._dict_中获取。

class Integer:
    def __init__(self, name):
	self.name = name

    def __get__(self, instance, cls):
	print("__get__ method called, name: %s" % self.name)
	# If instance is None, then it is the class attribute
	if instance:
	    return instance.__dict__[self.name]
	else:
	    return instance

    def __set__(self, instance, value):
	print("__set__ method called, name %s, value: %s" % (self.name, value))
	instance.__dict__[self.name] = value

class Point:
    # 关键的是量的值,输入参数的值只是用于内部实现的。并且Integer的实现中使用instance.__dict__保存数据也只是一种实现方式。
    # Point.x决定了atribute的名称为x
    x = Integer('z')
    def __init__(self, x, y):
	self.x = x
	self.y = y

    def __str__(self):
	return '({0.x}, {0.y})'.format(self)


p = Point(3, 2)
print("p.x")
print(p.x)
print(p.__dict__)
# setattr(p, 'x', 5)
p.__dict__['x'] = 5
print(p.x)
print(p.__dict__)

8.10. Using Lazily Computed Properties, an application of descriptor

8.11. Simplifying the Initialization of Data Structures, by define a common base class

# python  is very flexiable
class Structure:
    _fields = []
    def __init__(self, *args):
	if len(self._fields) != len(args):
	    raise TypeError('Expected {} arguments'.format(len(self._fields)))
	for k, v in zip(self._fields, args):
	    setattr(self, k, v)
    def __str__(self):
	return '({})'.format(', '.join('{}: {}'.format(f, getattr(self, f)) for f in self._fields))

class Point(Structure):
    _fields = ['x', 'y']
    # def __str__(self):
    #     return '(x: {0.x}, y: {0.y})'.format(self)

class Circle(Structure):
    _fields = ['radius']
    # def __str__(self):
    #     return '(radius: {0.radius})'.format(self)

p = Point(1, 2)
print(p)
p = Circle(3)
print(p)

class attributes can also be accessed by instance object, such as self, but only when the same instance attribute not exists

class Foo:
    class_attr = "ABC"
    def __init__(self, a):
	self.a = a

f = Foo('BB')
print(f.class_attr, f.a)
print(f.class_attr is Foo.class_attr)

class Bar:
    class_attr = "ABC"
    def __init__(self, a):
	self.class_attr = a
	self.a = a

b = Bar('BB')
print(b.class_attr, b.a)
print(b.class_attr is Bar.class_attr)

8.12. Defining an Interface or Abstract Base Class

create an abstract base class, or interface, by abc.ABCMeta, abc.abstractmethod

A abstract class can't be initialized.

from abc import ABCMeta, abstractmethod
class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxbytes=-1):
	pass
    @abstractmethod
    def write(self, data):
	pass

# typical usage:
def  foo(obj):
    if isinstance(obj, IStream):
	# processing an IStream here
	pass

a = IStream()

register another class to a 'sub class ' of a abstract base class, by abc.register(cls) function

Then isinstance(obj, AbstractBaseClass) will be True. This let another class which is not a subclass of a base class, but can still pass the isinstance() test, which means implementing a interface.

import io
# Register the built-in I/O classes as supporting our interface
IStream.register(io.IOBase)
# Open a normal file and type check
f = open('foo.txt')
isinstance(f, IStream) # Returns True

8.13. Implementing a Data Model or Type System, by descriptor

感觉根之前小节讲到的descriptor相同,只不过用了继承的方式写了很多细小的descriptor。

what is a descriptor? and its usage

A descriptor is a class attribute object, which has get, set, or delete method, is used to define how a instance attribute is get, set, and delete. When an instance attribute is get, the descriptor's get method will be called. The same thing applys to set and delete

In descriptor's get, set methods, we must use instance._dict_[xxx] to get a attribute. If we use getattr(instance, xxx) to get that attribute, then there will be a recursion error as below, because the getattr() function will trigger a new call of get method. RecursionError: maximum recursion depth exceeded while calling a Python object

The relationship between the descriptor object and an instance attribute:

  1. if the descriptor object is assigned to a class attribute with name 'attributea', then it will control the instance attribute with the same name.
  2. but there is one exception: if only the get method of a descriptor is defined, then the instance attribute with the same name will be not be controled by the descriptor, it will be get directly from the dict.

a test:

class TraceDescriptor:
    def __init__(self, name):
	self.name = name

    def __get__(self, instance, cls):
	if instance:
	    print('Getting attribute {}, value is {}'.format(self.name, instance.__dict__[self.name]))
	    return instance.__dict__[self.name]
	    # return getattr(instance, self.name)
	else:
	    return instance


    def __set__(self, instance, value):
	print('Setting attribute {} to {}'.format(self.name, value))
	instance.__dict__[self.name] = value

class Circle:
    radius = TraceDescriptor('radius')
    def __init__(self,  radius):
	self.radius = radius

c =  Circle(4)

print(c.radius)

8.16. Defining More Than One Constructor in a Class, use a class method

GP: Always only assign values in the default constructor(init), and do other things by other constructors

import time

class Date:
    def __init__(self, y, m,  d):
	self.year = y
	self.month = m
	self.day =  d

    @classmethod
    def today(cls):
	t = time.localtime()
	return cls(t.tm_year, t.tm_mon, t.tm_mday)

    def __str__(self):
	return '({0.year}, {0.month}, {0.day})'.format(self)

d1 = Date(2017, 1, 2)
d2 = Date.today()
print(d1, d2)

8.17. Creating an Instance Without Invoking init

the object._new_(*args, **kwargs) method: create a bare object

Every object has a new_method, which is inheritantanted from type._new.

The parameter should be a type object.

When you want to create an object from a json, this method can be used.

import aspk_common as AC
class Foo(AC.Structure):
    _fields = ['x']

f = Foo(2)
g = Foo.__new__(Foo)
print(f)
print(f.__dict__)
print(g.__dict__)
print(dir(f))
print(dir(g))

Problem: how an object is constructed?

I guess first create a bare object by calling the new method, then call the object's init method.

8.18. Extending Classes with Mixins

mixin classes, used to extend function of a class, class customization, by multiple inheritance

SOLVED, see another comment. How below codes works? For 'super()._getitem_(key)', why dict._getitem__ method will be called?

After figuring out MRO, then I know how a mixin class works: Mixin class is used to customize an existing class. It make use of MRO of multiple inheritance. Suppose 'Base' is the class to be customized, 'Mixin' is the mixin class, 'Foo' is the result class, then the typical syntax is:

class Foo(Mixin, Base):
    pass

That is, put the mixin class as the first parent class, and the Base class as the second class. Then e.g. you want change a method of Base's behavier, such as 'foo', then you can just define a method named 'foo' in Mixin, and doing some work, then call 'super().foo(…)' to call Base's foo method.

Works like a decorator pattern.

But what's difference between this method and by directly define the 'foo' method in Foo? => maybe the main benifet is that by putting the codes to a Mixin class, the codes can be easily reused.

import aspk_common as AC
class Logging:
    __slots__ = ()
    def __getitem__(self, key):
	print('Getting {}'.format(key))
	print('self: {}\nsuper: {}'.format(self, super()))
	return super().__getitem__(key)

class LoggingDict(Logging, dict):
    pass

d = LoggingDict()
d['x'] = 2
print(d['x'])

mutiple inheritance: how method/attribue are resolved if they exists in more than one super classes

A method/attribute is resolved in the order of all parent class given. e.g: class Foo(A, B) if a method 'aaa' is defined in both A and B, then A.aaa will be used.

python multiple inheritance, super and MRO(method resolution order)

Guoid's words: http://python-history.blogspot.fi/2010/06/method-resolution-order.html depth first, from left to right, then delete all same classes expect the last one. Then diamond problem is solved.

For below code snippets: From the printout, super() will return the next class in MRO(method resolve order) list, given a current class. The next class can be a real parent class for current class, or if the real parent class not exists, then the next class will be the next parent class of the current instance. For both two conditions, they are always the same class in MRO.

For below codes: the MRO is [C, A, B].

  • So super() in class C's result is A
  • super() in class A is B
  • super() in class B is object(I guess)
class A:
    def foo(self):
	print("A")
	print(super())
	super().foo()
class B:
    def foo(self):
	print("B")
	print(super())

class C(A, B):
    def foo(self):
	print("C")
	print(super())
	super().foo()

o = C()
o.foo()
print("MRO of C: ", C.__class__.__mro__)
print("MRO() of C: ", C.__class__.mro(C))

8.19. Implementing Stateful Objects or State Machines

Implementing the state pattern, by creating class for each state. In a class for one state, only define the method use to handle the current state, all other methods should raise a 'NotImplementedError'. Will see this latter

8.20. Calling a Method on an Object Given the Name As a String, by getattr

A method is just an attribute of an object, so first get the method by 'getattr' given string name

class Foo:
    def foo(self):
	print("foo")

f = Foo()
getattr(f, 'foo')()

8.20. Calling a Method on an Object Given the Name As a String, by operator.methodcaller(name, *args)

The benifit of methodcaller is that it will fix all parameters of the method. So if the method will be called given same parameters for many differenntt object, this method might be better

class Foo:
    def foo(self, x, y):
	print("foo: {}, {}".format(x, y))

f = Foo()
import operator
operator.methodcaller('foo', 3, 4)(f)

8.21. Implementing the Visitor Pattern

感觉这里所说的vistor pattern主要是对用于处理包含不同类型对象的list. 用于通用处理。 基于类型系统的visitor pattern, 是通过在不同的基础类中的accept函数来实现 dispatch table的。相当于把dispatch table也耦合在基础类定义中了。 但最本质的目的是对于不同类型的对象,客户代码使用相同的代码进行处理。

将dispatch table 做在哪里,只影响一点点写法,对最终达到的效果没影响。

例子:

class Visitor:
    def visit(self, node):
	methname = 'visit_' + type(node).__name__
	meth = getattr(self, methname, None)
	if meth is None:
	    meth = self.generic_visit
	return meth(node)

    def generic_visit(self, node):
	raise RuntimeError('No {} method'.format('visit_' + type(node).__name__))

class File:
    def __init__(self, name):
	self.name = name

class RegularFile(File):
    def read_content(self):
	return "This is the content for file {}".format(self.name)

class Directory(File):
    def children(self):
	'''Return all children names as a list'''
	return [RegularFile('a.txt'), RegularFile('b.exe')]

class Symbolic(File):
    def real(self):
	'''Return real file this symbolic point to'''
	return RegularFile('dd.txt')

class CatVisitor(Visitor):
    '''Implement cat command for a File object.'''
    def  visit_RegularFile(self, node):
	print('content for regular file {}'.format(node.name))
	print(node.read_content())
    def visit_Directory(self, node):
	print('content for directory {}'.format(node.name))
	for f in node.children():
	    self.visit(f)
    def visit_Symbolic(self, node):
	print('content for symbolic file {}'.format(node.name))
	self.visit(node.real())

files = [RegularFile('foo.txt'), Directory('bar'), RegularFile('a.txt'), Symbolic('aa.c')]
visitor = CatVisitor()
for file in files:
    visitor.visit(file)
    print()

dispatch table in python, decided by object type

将所有处理函数写在一个类中, 提供一个根据待处理对象类型分发的函数。 这个作为dispatch 基类。然后再定义针对每种类型的visit函数就行了。 这里类有两个目的:

  1. 定义dispatch table
  2. 对一组函数的名字空间吧。
  3. 以下例子中实现的 Dispatcher class 是通用的,可以共用。
class Dispatcher:
    def visit(self, node):
	methname = 'visit_' + type(node).__name__
	meth = getattr(self, methname, None)
	if meth is None:
	    meth = self.generic_visit
	return meth(node)

    def generic_visit(self, node):
	raise RuntimeError('No {} method'.format('visit_' + type(node).__name__))

class FooDispatcher(Dispatcher):
    def visit_RegularFile(self, node):
	pass
    def visit_Directory(self, node):
	pass

8.23. Managing Memory in Cyclic Data Structures, by weakref.ref(aobject)

When cyclic reference exists, the some object will never be deleted, because its reference coutns is large than 0. A weakref is just a reference that don't increase the reference count. To dereference, just call it like a function. If the referenced object still exists, the object will be returne, otherwise None will be returned.

For a tree structure, the book give an example of reference the parent node by weakref.

Note: you can't weakref to 'int', 'str', …

import weakref
class Node:
    pass
a = Node()
b = weakref.ref(a)
# c = a     # if this line exists, then a will not be deleted after 'del a', then the second call to b() will still return a

print(b())
del a
print(b())

8.24. Making Classes Support Comparison Operations, by define many comparision builtin method: eq, lt, le, gt, ge, ne

8.25. Creating Cached Instances, by create a factory method(a class method)

If the parameter are the same, then return an existing object.