Python metaclasses
31 Jul 2015
Just as classes control instance creation and let us define instance behaviour in the form of instance methods and magic methods, metaclasses in Python can do all that and a little more for class objects. The simplest way to deal with metaclasses is to think of them as classes of classes.
One of the most commonly used metaclasses is type, since it is the default metaclass for all classes, and all other metaclasses have to derive from it.
Contents
- Type of a type
- Attaching metaclasses
- Intercepting type constructors
- Metaclass as a registry
- Singleton pattern
- Generating descriptors
Type of a type
In Python, everything is an object and every object has a type. In a nutshell, a type is an entity that knows how to create instances, with optional bells and whistles on top. For instance, the type of 1 is int and the type of int is type:
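A quick check in a Python 3 interpreter, as a minimal sketch:

```python
# Every object has a type, and types are objects too.
print(type(1))      # <class 'int'>
print(type(int))    # <class 'type'>
```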
The types module in the standard library provides a few functions that manipulate types and defines the standard types used by the interpreter. This comes in handy, say, if you need to check whether a given object is a function or a module:
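A small sketch of such checks; the function f is just a placeholder defined for the example:

```python
import types

def f():
    pass

# User-defined functions are instances of types.FunctionType.
is_function = isinstance(f, types.FunctionType)
# Imported modules are instances of types.ModuleType.
is_module = isinstance(types, types.ModuleType)
# Built-in functions such as len have a different type.
builtin_is_function = isinstance(len, types.FunctionType)

print(is_function, is_module, builtin_is_function)   # True True False
```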
Now, since type is also an object, what about its type? It turns out that the type of type is type itself, so that’s as far up the type ladder as it goes:
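This can be verified directly:

```python
# type is its own type, so the chain stops here.
print(type(type))            # <class 'type'>
print(type(type) is type)    # True
```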
The type of all new-style user-defined classes is also type:
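For example, with a throwaway class A (a hypothetical name used only for illustration):

```python
class A(object):
    pass

print(type(A))      # <class 'type'>
print(type(A()))    # the class A itself
```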
You sort of get that for free in Python 3, but in Python 2 you have to remember to derive all your classes from object, otherwise they end up being of type classobj.
Finally, when called with three arguments, type acts like a constructor, so you can create new types inline. The arguments are: the name of the type, the tuple of base classes and the class dict (which contains everything you would normally put in the class body):
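A minimal sketch of the three-argument form; the class name A, the attribute x and the greet function are made up for the example:

```python
def greet(self):
    return 'hello from ' + type(self).__name__

# name, tuple of base classes, class dict
A = type('A', (object,), {'x': 1, 'greet': greet})

a = A()
print(A.__name__, a.x, a.greet())   # A 1 hello from A
```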
We’ll get back to this a bit later when we look at the metaclass constructor.
Attaching metaclasses
Since all metaclasses inherit from type, the simplest implementation of a metaclass looks like this:
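A minimal sketch: the metaclass M adds nothing to type, and the class A is created by calling M directly, just like type with three arguments, so that no version-specific syntax is needed yet:

```python
class M(type):
    pass

# Calling the metaclass works like calling type(name, bases, dict).
A = M('A', (object,), {})

print(type(A) is M)          # True
print(isinstance(A, type))   # True, because M derives from type
```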
Any class whose metaclass is M will now be of type M (but also an instance of type, since M derives from type).
How do we attach the metaclass to a class? Alas, this is yet another thing that’s done differently in Python 2 and Python 3.
In Python 2, the metaclass is attached by setting the special __metaclass__ field in the class body.
In Python 3, however, the metaclass has to be specified as the metaclass keyword argument in the base class list:
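A sketch of the Python 3 form. The class B uses the Python 2 spelling to show that Python 3 treats it as an ordinary class attribute and silently ignores it:

```python
class M(type):
    pass

class A(metaclass=M):       # Python 3 syntax
    pass

class B(object):
    __metaclass__ = M       # Python 2 spelling, has no effect in Python 3

print(type(A) is M)         # True
print(type(B) is type)      # True when run under Python 3
```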
One way to get around this is to use the six compatibility library, which provides a unified way of attaching metaclasses either through a decorator or a special base class:
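In six these helpers are called add_metaclass (the decorator) and with_metaclass (the base class). To keep the example free of third-party imports, here is a simplified sketch of how such a decorator can work; attach_metaclass is a hypothetical name, and the real six implementation handles more cases (such as __slots__):

```python
def attach_metaclass(metaclass):
    """Hypothetical helper: rebuild the decorated class using `metaclass`."""
    def wrapper(cls):
        attrs = dict(cls.__dict__)
        # These entries are created afresh for every class; do not copy them.
        attrs.pop('__dict__', None)
        attrs.pop('__weakref__', None)
        return metaclass(cls.__name__, cls.__bases__, attrs)
    return wrapper

class M(type):
    pass

@attach_metaclass(M)
class A(object):
    x = 1

print(type(A) is M, A.x)    # True 1
```

The same source works in Python 2 and Python 3 because no version-specific class syntax is involved.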
Intercepting type constructors
As we’ve already seen above, type’s constructor takes three arguments: type(name, bases, dict).
Defining classes with the class keyword can be viewed as syntactic sugar for calling type’s constructor directly (for the most part, anyway).
By using type directly, a class definition can be rewritten as a single call:
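A sketch of the equivalence, using two made-up classes A and B that end up with the same attributes:

```python
# Regular class statement.
class A(object):
    x = 1

    def double(self):
        return self.x * 2

# Roughly the same thing spelled as a direct call to type.
def double(self):
    return self.x * 2

B = type('B', (object,), {'x': 1, 'double': double})

print(A().double(), B().double())   # 2 2
```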
All the real magic happens in the constructor of type, and user-defined metaclasses can alter the arguments that it receives. In other words, you can intercept class creation right after the class body has been executed by the interpreter and right before the result gets passed on to type. This can be done by overriding the __new__ method.
Let’s look at a specific example where we want all subclasses of a certain class to be assigned automatically incremented id numbers, but only if the track field is set to a true value in the class body; we also want to store all tracked classes in a shared class-level classes field.
First we define the metaclass Meta as a subclass of type that overrides its __new__ method and contains a class-level variable classes that encapsulates the current state.
The signature of the __new__ method matches the signature of type’s constructor, apart from the leading metaclass argument: it receives the class name, a tuple of base types and a class dict, and it must return a fully instantiated type. In this method, we check whether the class dict has a track field set to a true value and assign a class-level id accordingly if it does; otherwise, id is set to None.
Finally, we call the constructor of the base metaclass to instantiate the type, optionally store it in the list of tracked classes, and then return it. It’s worth mentioning that we could have used type.__new__ instead of super(Meta, meta).__new__, but it’s generally considered a good habit to avoid hard-coding the base type, since doing so may lead to unexpected inheritance behaviour.
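Putting the description above together, such a metaclass could be sketched as follows (Python 3 syntax). The names Base, A, B and C and the choice to start ids at 0 are assumptions made for the example:

```python
class Meta(type):
    classes = []    # shared list of all tracked classes

    def __new__(meta, name, bases, attrs):
        # Only the class's own body is inspected, not inherited attributes.
        if attrs.get('track'):
            attrs['id'] = len(Meta.classes)
        else:
            attrs['id'] = None
        cls = super(Meta, meta).__new__(meta, name, bases, attrs)
        if attrs['id'] is not None:
            Meta.classes.append(cls)
        return cls

class Base(metaclass=Meta):
    pass

class A(Base):
    track = True

class B(A):
    pass            # no track in its own body, so it is not tracked

class C(Base):
    track = True

print(A.id, B.id, C.id)   # 0 None 1
```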
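Note that class-level variables of the metaclass, such as classes in the example above, can also be accessed through the classes it creates, but not through their instances: attribute lookup on an instance consults its class and the class’s bases, never the metaclass. In the same way, instance methods of the metaclass behave like class methods when called on the class, yet they are not reachable from instances.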
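A short sketch of how attribute lookup treats metaclass members; the method name describe is hypothetical. In Python 3 the metaclass attributes are visible on the class but not on its instances:

```python
class Meta(type):
    classes = []                # class-level variable of the metaclass

    def describe(cls):          # instance method of the metaclass
        return 'class ' + cls.__name__

class A(metaclass=Meta):
    pass

# Reachable through the class, whose type is Meta...
on_class = (A.classes, A.describe())

# ...but not through an instance: lookup goes through A and object only.
a = A()
on_instance = (hasattr(a, 'classes'), hasattr(a, 'describe'))

print(on_class)      # ([], 'class A')
print(on_instance)   # (False, False)
```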
Metaclass as a registry
One of the typical use cases for metaclasses is to keep track of the created classes in order to be able to access them at runtime by name or identifier.
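A sketch of such a registry in Python 3 syntax. The names Registry, RegistryMeta, Base, alias and __base__ follow the discussion in this section; the storage attribute _registry and the sample classes Foo and Bar are assumptions:

```python
class RegistryMeta(type):
    # Metaclass of the metaclass: makes Registry itself subscriptable.
    def __getitem__(meta, key):
        return meta._registry[key]

class Registry(type, metaclass=RegistryMeta):
    _registry = {}      # maps class names and aliases to classes

    def __new__(meta, name, bases, attrs):
        # The marker is removed so that it never reaches the class dict.
        is_base = attrs.pop('__base__', False)
        cls = super(Registry, meta).__new__(meta, name, bases, attrs)
        if not is_base:
            meta._registry[name] = cls
            alias = attrs.get('alias')
            if alias is not None:
                meta._registry[alias] = cls
        return cls

class Base(metaclass=Registry):
    __base__ = True     # marks the root class, which is not registered

class Foo(Base):
    pass

class Bar(Foo):
    alias = 'bar'

print(Registry['Foo'] is Foo, Registry['bar'] is Bar)   # True True
```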
In this example, any subclass of Base (whether direct or indirect) will be tracked by the Registry metaclass, and can be retrieved at a later time via the class name or the alias field if it’s provided.
We didn’t want the Base class itself to appear in the class registry, only its strict descendants, so we check whether the class body has a __base__ field defined, and don’t register the class with the metaclass if it does.
In order to make the Registry metaclass subscriptable, we’ve had to attach a RegistryMeta metaclass to it, in which we’ve implemented the __getitem__ magic method. As scary as “metaclass of a metaclass” may sound, it follows the same logic as implementing a magic method for instances in the class body: if you need the instance (which also happens to be a metaclass in our case) to do something fancy, you implement it in its type, because that is where Python looks up magic methods.
To summarize the type/metaclass hierarchy in the example above:
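The chain can be walked with repeated calls to type. In this sketch empty stub classes stand in for the real ones, and Child is a hypothetical subclass of Base:

```python
class RegistryMeta(type):
    pass

class Registry(type, metaclass=RegistryMeta):
    pass

class Base(metaclass=Registry):
    pass

class Child(Base):
    pass

# instance -> class -> metaclass -> metaclass of the metaclass -> type
chain = [type(obj).__name__
         for obj in (Child(), Child, Registry, RegistryMeta, type)]
print(chain)   # ['Child', 'Registry', 'RegistryMeta', 'type', 'type']
```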
Singleton pattern
Using global “god objects” is generally a bad idea, but if you absolutely have to – you may want to make sure it is actually a singleton. That is, only one unique instance of the class can exist at a time, and the class cannot be inherited from (or, in Java terms, it is final).
As we’ve already seen, if a metaclass is attached to the base class, the __new__ method of the metaclass will be called every time the base class is directly or indirectly subclassed. This lets us easily handle the no-inheritance part. Well, almost, since we could subclass the metaclass itself, attach it to a new base type and derive that class from the original singleton. If for some reason you would want to prevent that from happening, you could implement a metaclass for the metaclass in order to make the metaclass “final”. However, then one could subclass the metaclass’s metaclass… you get the point.
When it comes to controlling instance creation from the metaclass, __new__ is no longer sufficient; we need to override __call__ instead. An easy way to remember this: the metaclass’s __new__ is triggered when a new class is created; its __call__ is triggered when a new instance is created via the call-like syntax on the class object. This works the same way as when the __new__ and __call__ magic methods are implemented on normal classes, with the former controlling instance creation and the latter making instances callable.
To see how these methods work together, take a look at the following snippet:
The output looks like this:
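A sketch of a snippet of this kind in Python 3 syntax, with the lines it prints listed in the comments at the bottom. The helper trace and the list calls are hypothetical and exist only to record the order of the calls:

```python
calls = []

def trace(message):
    calls.append(message)
    print(message)

class Meta(type):
    def __new__(meta, name, bases, attrs):
        trace('Meta.__new__')       # a new class is being created
        return super().__new__(meta, name, bases, attrs)

    def __call__(cls, *args, **kwargs):
        trace('Meta.__call__')      # the class object is being called
        return super().__call__(*args, **kwargs)

class A(metaclass=Meta):
    def __new__(cls):
        trace('A.__new__')          # a new instance is being created
        return super().__new__(cls)

    def __init__(self):
        trace('A.__init__')

    def __call__(self):
        trace('A.__call__')         # the instance is being called

a = A()
a()

# Printed output:
# Meta.__new__
# Meta.__call__
# A.__new__
# A.__init__
# A.__call__
```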
Back to our singleton class: we’ll need to override __call__ in the metaclass to intercept the instance constructor and return the existing instance, which can be stored in the class itself. If this instance doesn’t exist yet, we can create it by calling super, which in turn will call the class constructor if it’s defined.
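One way this could look, as a sketch in Python 3 syntax. The attribute name _instance and the sample class Config are assumptions made for the example:

```python
class Singleton(type):
    def __new__(meta, name, bases, attrs):
        # Refuse to subclass any class that is already a singleton.
        for base in bases:
            if isinstance(base, Singleton):
                raise TypeError('cannot inherit from ' + base.__name__)
        return super(Singleton, meta).__new__(meta, name, bases, attrs)

    def __call__(cls, *args, **kwargs):
        # Create the instance on first use, then keep returning it.
        if '_instance' not in cls.__dict__:
            cls._instance = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instance

class Config(metaclass=Singleton):
    def __init__(self, value):
        self.value = value

first = Config(1)
second = Config(2)      # arguments are ignored, the cached instance wins
print(first is second, second.value)    # True 1

try:
    class SubConfig(Config):
        pass
    subclass_error = None
except TypeError as exc:
    subclass_error = str(exc)
print(subclass_error)   # cannot inherit from Config
```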
The implementation is quite simple and transparent, although there’s surely some room for improvement: for instance, the __call__ method only cares about the arguments it receives the very first time it’s called; every other time it just returns the cached instance unconditionally.
Generating descriptors
Another common use of metaclasses is to automate the creation of descriptors. In Python, a descriptor is any object that implements at least one of the following methods: __get__, __set__ or __delete__.
Descriptors control attribute access, and the default behaviour is to get, set or delete an attribute from the object’s dictionary. Here’s a naïve implementation:
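A minimal sketch of a descriptor that simply mirrors the default behaviour by reading and writing the instance dictionary under a given name:

```python
class Descriptor(object):
    def __init__(self, name):
        self.name = name    # key used in the instance __dict__

    def __get__(self, instance, owner):
        if instance is None:
            return self     # accessed on the class, not on an instance
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]

class A(object):
    x = Descriptor('x')

a = A()
a.x = 1
print(a.x, a.__dict__)   # 1 {'x': 1}
del a.x
print(a.__dict__)        # {}
```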
This may seem like overkill, as we could have just made do with plain instance variables here. However, descriptors come in handy when we want to inject additional logic into the get, set or delete methods. For example, it is entirely possible to reimplement the property, classmethod and staticmethod builtins using descriptors.
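As an illustration, here is a heavily simplified, read-only stand-in for property. ReadOnlyProperty and Circle are hypothetical names, and the real builtin supports setters, deleters and docstrings as well:

```python
class ReadOnlyProperty(object):
    def __init__(self, getter):
        self.getter = getter

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.getter(instance)    # compute the value on access

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")

class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @ReadOnlyProperty
    def diameter(self):
        return self.radius * 2

c = Circle(2)
print(c.diameter)   # 4
```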
Why did we have to pass the string 'x' to Descriptor explicitly? The reason is quite simple: the descriptor needs to know the name of the attribute it is bound to so it can find it in the object’s __dict__, but at the moment of instantiation it cannot reference the class dict because it doesn’t exist yet! In a way, it’s the same as when we assign a = A() and there is no way of telling A that it is going to be assigned to anything, let alone deducing the variable name.
Here’s where the metaclasses come in – right before the type is created, we have the full class dict at our disposal, so we can replace certain fields with named descriptors if we want to.
In the next example, we will implement a base class whose subclasses can use special syntax to generate typed attributes with default values. Given a class that is defined like so:
we want x to be an integer attribute with a default value of None, and y a string attribute with a default value of 'foo'. When assigned to, both fields should try to coerce the given values to int and str, respectively.
Here is one possible implementation.
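The sketch below is in Python 3 syntax. The declaration syntax is an assumption: a bare type such as int declares a typed attribute with a default of None, and a (type, default) tuple supplies an explicit default.

```python
class Descriptor(object):
    def __init__(self, name, cls, default=None):
        self.name = name
        self.cls = cls
        self.default = default

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        # Coerce the assigned value to the declared type.
        instance.__dict__[self.name] = self.cls(value)

class Meta(type):
    def __new__(meta, name, bases, attrs):
        for key, value in list(attrs.items()):
            if isinstance(value, type):
                # Assumed syntax: x = int
                attrs[key] = Descriptor(key, value)
            elif (isinstance(value, tuple) and len(value) == 2
                    and isinstance(value[0], type)):
                # Assumed syntax: y = str, 'foo'
                attrs[key] = Descriptor(key, value[0], value[1])
        return super(Meta, meta).__new__(meta, name, bases, attrs)

class Base(metaclass=Meta):
    pass

class A(Base):
    x = int
    y = str, 'foo'

a = A()
print(a.x, a.y)     # None foo
a.x = '42'
a.y = 5
print(repr(a.x), repr(a.y))   # 42 '5'
```

A known limitation of this sketch is that any class object placed in the class body, including a nested class, is turned into a descriptor.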
Note that we still had to pass field names to Descriptor, but this time it was done automatically by the metaclass, with names simply being equal to the keys in the class dict. This is one of the points of using metaclasses in the first place: do all of the hard work in the metaclass so the user code can be simplified.
If you ever wondered how fancy frameworks like sqlalchemy or django are implemented, where schema definitions look very concise but there’s a lot going on under the hood, this toy example could be a good starting point.