Python is a widely used language nowadays. You can learn Python in 30 minutes and write scripts to tackle a wide range of tasks, from batch processing to deep learning. However, Python has a lot of mechanisms that most people don’t know about even after using it for years (that’s me). To dive deep and re-learn Python, I am going to read:
I will share some interesting features that I used to ignore. I assume you are a semi-pro in Python and want to dig deeper. (This is just a list of GISTS, NOT DETAILS.)
> Importing in a class
# PythonForProgrammers/compose.py
class Compose:
    from utility import f

Compose().f()
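Because a class body is an ordinary namespace, a name imported there becomes a class attribute, and a plain function imported this way behaves like a method. The sketch below assumes a `utility` module exposing a function `f(self)`. Since that module is not shown in these notes, a stand-in is built in memory; the `_f` helper and its return value are hypothetical.

```python
import sys
import types


def _f(self):
    # Hypothetical stand-in for utility.f: it receives the instance as `self`.
    return "f called on " + type(self).__name__


# Register an in-memory module named `utility` so the import below can find it.
utility = types.ModuleType("utility")
utility.f = _f
sys.modules["utility"] = utility


class Compose:
    from utility import f  # `f` becomes a class attribute, hence a method


result = Compose().f()
print(result)
```

The import runs once, when the class body is executed, and the imported name is visible only through the class or its instances.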
> Static field
Defining a static field naively in a Python class is dangerous, because the class attribute only stores a reference that every instance shares. If you mutate it in place with methods like `append` or `update`, the change is visible from all instances. However, if you assign through an instance with `=`, the class attribute does not change; instead, a new instance attribute is created that shadows it for that instance only:
In [1]:
class Foo(object):
    v1 = "a"
    v2 = []
    def __repr__(self):
        return ("<v1 = {} | v2 = {}>".format(self.v1, self.v2))
In [2]:
f1 = Foo()
f2 = Foo()
f3 = Foo()
In [3]:
f1.v1 = "b"
f1.v2 = [1,2,3]
f1, f2, f3
Out[3]:
(<v1 = b | v2 = [1, 2, 3]>, <v1 = a | v2 = []>, <v1 = a | v2 = []>)
In [5]:
Foo.v1 = "100"
f1, f2, f3
Out[5]:
(<v1 = b | v2 = [1, 2, 3]>, <v1 = 100 | v2 = []>, <v1 = 100 | v2 = []>)
In [6]:
f2.v2.append(10)
f1, f2, f3
Out[6]:
(<v1 = b | v2 = [1, 2, 3]>, <v1 = 100 | v2 = [10]>, <v1 = 100 | v2 = [10]>)
In [7]:
id(Foo.v1), id(f1.v1), id(f2.v1)
Out[7]:
(139652076653288, 139652244728440, 139652076653288)
Remember to refer to a static member through the class, i.e. use `Foo.v1`, not `f1.v1`.
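A minimal sketch of the same pitfall and one common way around it: give each instance its own mutable object in `__init__`. The attribute names `shared` and `own` are made up for illustration.

```python
class Foo:
    shared = []              # class attribute: one list for every instance

    def __init__(self):
        self.own = []        # instance attribute: a fresh list per instance


a, b = Foo(), Foo()
a.shared.append(1)           # mutates the single class-level list
a.own.append(1)              # mutates only a's own list
a.shared = ["rebound"]       # creates an instance attribute shadowing Foo.shared

print(b.shared, b.own)       # b still sees the class-level list
print(Foo.shared)            # the class attribute itself was never rebound
print("shared" in vars(a), "shared" in vars(b))
```

`vars(instance)` shows which names live on the instance itself, which makes the shadowing visible.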
> Clean up
I found this part confusing and recommend this post instead for a better understanding of weak references.
When assigning an instance to another variable, as in `a = Foo(); b = a`, only the reference is copied, so the instance is not destroyed when we run `del a`, because we can still access it via `b`.
A way to avoid that is to make `b` a weak reference, which refers to the instance without keeping it alive. Without weak references, hazardous situations like reference cycles can arise.
A reference cycle means a reference to object `a` is stored as a member of object `b`, while a reference to `b` is also stored as a member of object `a`. In that case, deleting both names does not drop their reference counts to zero, so `__del__()` is not triggered at that point; in CPython the objects linger until the cyclic garbage collector finds them. This can happen in hierarchical designs such as a (display) tree or a doubly linked list.
There is also a dead-on-arrival problem: a weakref to a bound method is dead immediately after it is stored. That is because bound methods are created on demand each time they are accessed on an instance, so nothing else holds the temporary object. To store a bound method properly, we should keep a weak reference to the instance and a strong reference to the underlying function from the class definition, which are then used to create a new bound method later.
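The three situations above can be sketched with the standard `weakref` module. The `Node` class is hypothetical. The comments describe CPython's reference-counting behaviour; other interpreters may release objects later, which is why `gc.collect()` is called explicitly.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def hello(self):
        return "hello from " + self.name


# 1. A weak reference does not keep its target alive.
a = Node("a")
weak_a = weakref.ref(a)
alive_before = weak_a() is a        # still reachable through `a`
del a
gc.collect()                        # not needed on CPython, harmless elsewhere
alive_after = weak_a() is not None

# 2. A reference cycle is only reclaimed by the cyclic collector.
x, y = Node("x"), Node("y")
x.other, y.other = y, x
weak_x = weakref.ref(x)
del x, y                            # reference counts stay above zero here
gc.collect()                        # the collector breaks the cycle
cycle_collected = weak_x() is None

# 3. Dead on arrival: every access to n.hello builds a temporary bound method.
n = Node("n")
naive = weakref.ref(n.hello)          # on CPython the temporary dies right away
method = weakref.WeakMethod(n.hello)  # tracks the instance and function instead
greeting = method()()                 # rebuild the bound method, then call it
del n
gc.collect()
method_dead = method() is None        # the WeakMethod dies with the instance

print(alive_before, alive_after, cycle_collected, greeting, method_dead)
```

`weakref.WeakMethod` (Python 3.4+) packages the "weak instance plus function" idea, so a hand-written wrapper is rarely needed today.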
> Unit test
Write the test before the code.
White-box test: the test code has complete access to the internals of the class being tested.
Black-box test: the test treats the class under test as an impenetrable box.
Use a framework like `unittest`, or one that comes with the package you use (e.g. flask.ext.testing for the Flask framework).
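As a rough illustration of the two styles with the standard `unittest` module, the sketch below tests a made-up `Stack` class (not from the book). One test uses only the public interface, another inspects a private attribute.

```python
import unittest


class Stack:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class StackTest(unittest.TestCase):
    def test_black_box(self):
        # Black box: only the public push/pop interface is used.
        s = Stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.pop(), 2)
        self.assertEqual(s.pop(), 1)

    def test_white_box(self):
        # White box: the test peeks at the private list.
        s = Stack()
        s.push("x")
        self.assertEqual(s._items, ["x"])

    def test_pop_from_empty(self):
        with self.assertRaises(IndexError):
            Stack().pop()


# Run the tests programmatically instead of calling unittest.main().
suite = unittest.TestLoader().loadTestsFromTestCase(StackTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

White-box tests tend to break when internals change, so they are best reserved for behaviour that cannot be observed from outside.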
> Decorators
Decorators are used to modify or inject functions. `@` is just a little syntactic sugar meaning “pass a function object through another function and assign the result to the original function”, or “applying code to other code” (i.e. macros).
def foo(): pass
foo = staticmethod(foo)
# is equal to
@staticmethod
def foo(): pass
- The `__name__` of the returned function is not the same as the original one; it is that of the function produced by the decorator.
- (DEMO) Decorator without arguments: if we write a class-based decorator without arguments, the function to be decorated is passed to the constructor, and the `__call__()` method is called whenever the decorated function is invoked.
- (DEMO) Decorator with arguments: the arguments are passed to an outer function, which uses them to build and return the inner decorator. The inner decorator (or wrapper) then decorates the function that actually needs to be decorated.
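The two decorator shapes in the list above might be sketched as follows. `CountCalls`, `repeat`, `greet` and `ping` are hypothetical names. `functools.wraps` and `functools.update_wrapper` are the standard way to keep the original `__name__` on the result.

```python
import functools


class CountCalls:
    """Class-based decorator without arguments."""

    def __init__(self, func):               # receives the decorated function
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):    # runs on every call
        self.calls += 1
        return self.func(*args, **kwargs)


def repeat(times):
    """Decorator with arguments: returns the real decorator."""
    def decorator(func):
        @functools.wraps(func)              # keep func.__name__ and docstring
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@CountCalls
def greet(name):
    return "hi " + name


@repeat(3)
def ping():
    return "pong"


greet("a")
greet("b")
print(greet.calls)      # 2
print(ping())           # ['pong', 'pong', 'pong']
print(ping.__name__)    # ping
```

Without `functools.wraps`, `ping.__name__` would be `wrapper`, which is the naming surprise mentioned in the first bullet.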
> Indexing
When indexing, we are actually calling the `__getitem__` method of that object.
- By passing a slice index like `arr[1:10:2]`, we are actually passing a `slice` object created with `slice(1, 10, 2)`.
- By passing `arr[...]`, we are passing `Ellipsis` to `__getitem__`, which is a built-in constant like `False` and `True`.
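A small probe class makes this visible: its `__getitem__` simply returns whatever key it receives. `Probe` is a hypothetical helper for illustration only.

```python
class Probe:
    def __getitem__(self, key):
        # Return the key unchanged so we can inspect what indexing passes in.
        return key


p = Probe()
print(p[3])            # 3
print(p[1:10:2])       # slice(1, 10, 2)
print(p[...])          # Ellipsis
print(p[1:3, ...])     # (slice(1, 3, None), Ellipsis)
```

A comma inside the brackets packs the pieces into a tuple, which is how libraries such as NumPy receive multi-dimensional indices.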
> Metaprogramming
Some basic concepts:
- A class can also be edited as an object.
- It’s worth re-emphasizing that in the vast majority of cases, you don’t need metaclasses.
- Metaclasses create classes, and classes create instances.
- Define a class with `type`: `type(name, bases, dict)`, where `dict` is the namespace of the class (fields and methods).
- Suppose object `a` is an instance of class `A` (i.e. `a = A()`). Then `a.__class__.__class__` equals `<class 'type'>` (the metaclass).
- Using `type`, we can dynamically define classes.
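As a quick sketch of the points above, the three-argument form of `type` builds a class at runtime. The `Dog` class and `describe` function are hypothetical.

```python
def describe(self):
    return "I am " + self.name


# type(name, bases, dict) is what a `class` statement does under the hood.
Dog = type("Dog", (object,), {"name": "dog", "describe": describe})

d = Dog()
print(d.describe())               # I am dog
print(d.__class__)                # the Dog class
print(d.__class__.__class__)      # <class 'type'>

# A class is an object too, so it can be edited after creation.
Dog.sound = "woof"
print(d.sound)                    # woof
```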
The main usage of metaclass:
Like `type`, a metaclass inherits from the `type` class, which takes a name, a tuple of bases, and a dictionary to initialize the class. The book provides a simple example of customizing a metaclass and using it on a class:
# Metaprogramming/SimpleMeta1.py
# Two-step metaclass creation in Python 2.x
class SimpleMeta1(type):
    def __init__(cls, name, bases, nmspc):
        super(SimpleMeta1, cls).__init__(name, bases, nmspc)
        cls.uses_metaclass = lambda self : "Yes!"

class Simple1(object):
    __metaclass__ = SimpleMeta1
    def foo(self): pass
    @staticmethod
    def bar(): pass

simple = Simple1()
print([m for m in dir(simple) if not m.startswith('__')])
# A new method has been injected by the metaclass:
print(simple.uses_metaclass())
""" Output:
['bar', 'foo', 'uses_metaclass']
Yes!
"""
Also, because `__metaclass__` only needs to be callable, we can define it inline as in the example. However, the `__metaclass__` attribute does not work in Python 3 (to be confirmed by the authors); there we instead pass `metaclass` as a keyword argument alongside the base classes:
class Simple1(object, metaclass=SimpleMeta1):
    ...
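For reference, a Python 3 rendering of the `SimpleMeta1` example above could look like the sketch below. It keeps the book's names and only swaps the `__metaclass__` attribute for the `metaclass` keyword and the two-argument `super` for the bare form.

```python
class SimpleMeta1(type):
    def __init__(cls, name, bases, nmspc):
        super().__init__(name, bases, nmspc)
        # Inject a method into every class created by this metaclass.
        cls.uses_metaclass = lambda self: "Yes!"


class Simple1(metaclass=SimpleMeta1):
    def foo(self):
        pass

    @staticmethod
    def bar():
        pass


simple = Simple1()
members = [m for m in dir(simple) if not m.startswith("__")]
print(members)                    # ['bar', 'foo', 'uses_metaclass']
print(simple.uses_metaclass())    # Yes!
```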
Also, because the metaclass is invoked at the moment a class is defined, we can use it to reject subclasses and emulate a final class as in Java:
class final(type):
    def __init__(cls, name, bases, namespace):
        super(final, cls).__init__(name, bases, namespace)
        print("Checking type", name)
        for klass in bases:
            if isinstance(klass, final):
                raise TypeError(str(klass.__name__) + " is final")

class A(metaclass=final): pass
class B(A): pass
When `B` inherits from `A`, `final.__init__` is executed, and since `A` is an instance of `final`, we receive a `TypeError` when trying to do that.
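To see the failure without stopping the script, the subclass definition can be wrapped in `try`/`except`. This is a sketch that repeats the `final` metaclass (minus the `print`) so that it runs on its own.

```python
class final(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for klass in bases:
            # A base created by `final` must not be subclassed.
            if isinstance(klass, final):
                raise TypeError(klass.__name__ + " is final")


class A(metaclass=final):
    pass


try:
    class B(A):      # B's metaclass is derived from A, so final.__init__ runs
        pass
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(message)       # A is final
```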
`__new__` vs `__init__`: in a metaclass, the first one is applied before the class is created, while the second one is applied after the class is created. `__slots__` is used to declare the fixed fields of a class and to avoid a per-instance `__dict__` (which is very memory consuming). With `__new__` we can define `__slots__`, but not with `__init__`.
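The difference can be sketched with two hypothetical metaclasses. `SlotsMeta` sets `__slots__` in `__new__`, before the class exists; `LateSlotsMeta` tries the same in `__init__`, when the instance layout is already fixed. The `fields` attribute is a made-up convention for this example.

```python
class SlotsMeta(type):
    def __new__(mcls, name, bases, namespace):
        # Turn the (hypothetical) `fields` tuple into __slots__ before creation.
        namespace = dict(namespace)
        namespace["__slots__"] = tuple(namespace.pop("fields", ()))
        return super().__new__(mcls, name, bases, namespace)


class LateSlotsMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.__slots__ = ("x", "y")   # too late: just an ordinary class attribute


class Point(metaclass=SlotsMeta):
    fields = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class LatePoint(metaclass=LateSlotsMeta):
    pass


p = Point(1, 2)
try:
    p.z = 3                          # no slot named z and no __dict__
    extra_allowed = True
except AttributeError:
    extra_allowed = False

print(hasattr(p, "__dict__"), extra_allowed)    # False False
print(hasattr(LatePoint(), "__dict__"))         # True
```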
Singleton
> Comprehensions
Comprehensions are constructs that allow sequences to be built from other sequences.
In Python 3 we can also write set and dictionary comprehensions:
a = {i for i in range(10)}
{'{}_plus_one'.format(i):i+1 for i in a}
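For completeness, a short sketch of list comprehensions with a filter and with nested loops (variable names are illustrative):

```python
# Keep only even numbers, then square them.
squares_of_evens = [n * n for n in range(10) if n % 2 == 0]

# Nested loops read left to right, like nested for statements.
pairs = [(x, y) for x in range(3) for y in range(x)]

print(squares_of_evens)   # [0, 4, 16, 36, 64]
print(pairs)              # [(1, 0), (2, 0), (2, 1)]
```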
> for/else
If we want to print whether a positive integer is a prime number using a for loop, we would normally need a flag together with a break. But with the for/else syntax, we can simplify it as:
x = 11
for i in range(2, x):
    if x % i == 0:
        print(x, "can be divided by", i)
        break
else:
    print(x, "is a prime number")
The else clause is executed only if the for loop finishes without hitting a break.
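For comparison, here is a sketch of the flag-based version next to the for/else version, wrapped in two hypothetical functions. Both add an `x > 1` check, because an empty loop also reaches the `else` branch.

```python
def is_prime_with_flag(x):
    found_divisor = False
    for i in range(2, x):
        if x % i == 0:
            found_divisor = True
            break
    return x > 1 and not found_divisor


def is_prime_with_else(x):
    for i in range(2, x):
        if x % i == 0:
            break            # a divisor was found, so skip the else branch
    else:
        return x > 1         # the loop ended without break
    return False


primes = [n for n in range(1, 20) if is_prime_with_else(n)]
print(primes)                # [2, 3, 5, 7, 11, 13, 17, 19]
```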
> Global Interpreter Lock
A mutex (lock) that allows only one thread at a time to control the interpreter. The GIL is a lock at the interpreter level that prevents different threads from entering the same critical section at the same time. It also avoids deadlocks inside the interpreter, since there is only one global lock.
Python chose this approach to ensure the safety and performance of single-threaded programs, which serves part of Python’s purpose, namely being an easy-to-use scripting language. Although many complaints have been raised since Python became popular, it isn’t easy to remove the GIL without decreasing the performance of single-threaded programs.
In the previous version of the GIL, a thread that was forced to release the lock at a fixed interval was allowed to regain control if no other thread was asking for it. This can block an I/O-bound thread when it runs alongside a CPU-bound thread. A visualization can be found here. The fix in Python 3.2 makes GIL competition fairer.
Workarounds: multi-process programming instead of multi-threading; using an interpreter other than CPython; waiting for updates in Python 3.x.
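A rough way to observe the effect is to time the same CPU-bound work run sequentially and in two threads. The sketch below assumes a standard CPython build with the GIL, where the threaded run is usually no faster. The numbers vary by machine, so none are shown. `count_down` and `N` are made up for illustration.

```python
import sys
import threading
import time


def count_down(n):
    # Pure-Python CPU-bound loop: it holds the GIL while it runs.
    while n > 0:
        n -= 1


N = 2_000_000

# Run the work twice in the main thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same work in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# Since Python 3.2 a running thread is asked to release the GIL after this
# many seconds when another thread is waiting for it.
switch_interval = sys.getswitchinterval()

print("sequential: {:.3f}s, threaded: {:.3f}s".format(sequential, threaded))
print("switch interval:", switch_interval)
```

Moving the same work into separate processes (for example with the `multiprocessing` module) gives each worker its own interpreter and its own GIL.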
Restore-point; http://python-3-patterns-idioms-test.readthedocs.io/en/latest/PatternConcept.html#design-principles