Learn about Descriptors in Python


Reading time: 20 minutes | Coding time: 5 minutes

Have you seen, or maybe written, code like this?

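from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 or later
Base = declarative_base()

class User(Base):
    __tablename__ = 'users'  # required by the declarative mapping
    id = Column(Integer, primary_key=True)
    name = Column(String)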

This snippet comes, in part, from the tutorial of a popular ORM package called SQLAlchemy. If you have ever wondered why the attributes id and name aren't passed into the __init__ method and bound to the instance as in a regular class, this post will clear up your doubts.

What are descriptors?

Python's descriptor protocol is simply a way to specify what happens when an attribute is referenced on an object. It lets a programmer manage the three kinds of attribute access:

  • set
  • get
  • delete

In other programming languages, the same job is usually done by getters and setters: public methods that read and write a private variable. Python has no concept of private variables, and the descriptor protocol can be considered a Pythonic way to achieve something similar.

In general, a descriptor is an object attribute with binding behavior, one whose attribute access is overridden by methods in the descriptor protocol. Those methods are __get__, __set__, and __delete__. If any of these methods is defined for an object, it is said to be a descriptor.

Descriptor methods

  • __get__(self, instance, owner)
  • __set__(self, instance, value)
  • __delete__(self, instance)

Where:

__get__ accesses the attribute. It returns the value of the attribute, or raises AttributeError if the requested attribute is not present.

__set__ is called in an attribute assignment operation. Returns nothing.

__delete__ controls a delete operation. Returns nothing.
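
To make the three methods concrete, here is a minimal sketch of a descriptor that records each call it handles. The names Traced, Widget, and calls are made up for this illustration, and storing the value in the instance's __dict__ is just one possible choice.

```python
class Traced:
    """Hypothetical descriptor that records every access it handles."""

    def __init__(self, name):
        self.name = name    # key used in the instance's __dict__
        self.calls = []     # log of the protocol methods that ran

    def __get__(self, instance, owner):
        if instance is None:        # accessed on the class itself
            return self
        self.calls.append('get')
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        self.calls.append('set')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        self.calls.append('delete')
        del instance.__dict__[self.name]


class Widget:
    size = Traced('size')   # the descriptor lives on the class


w = Widget()
w.size = 3          # triggers Traced.__set__
value = w.size      # triggers Traced.__get__
del w.size          # triggers Traced.__delete__
print(value, Widget.size.calls)
# 3 ['set', 'get', 'delete']
```

Reading w.size again after the del statement raises AttributeError, which matches the behavior described for __get__.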

It is important to note that descriptors are assigned to a class, not an instance. Assigning to or deleting the attribute on the class itself overwrites or deletes the descriptor, rather than triggering its code.
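
A short sketch of both points, using a hypothetical Ten descriptor: stored on an instance it is just an ordinary object, and rebinding the class attribute replaces it without calling __set__.

```python
class Ten:
    """Hypothetical descriptor that always evaluates to 10."""

    def __get__(self, instance, owner):
        return 10

    def __set__(self, instance, value):
        raise AttributeError('read-only')


class A:
    x = Ten()               # on the class: the protocol applies


a = A()
on_class = a.x              # 10, computed by Ten.__get__

a.y = Ten()                 # on the instance: stored as a plain object
on_instance = a.y           # the Ten object itself; __get__ is not called

A.x = 42                    # rebinding on the class replaces the descriptor
after_rebind = a.x          # 42; Ten.__set__ was never triggered
```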

Why would you want to use descriptors?

Let’s see an example:

class Order:
    def __init__(self, name, price, quantity):
        self.name = name
        self.price = price
        self.quantity = quantity
    def total(self):
        return self.price * self.quantity
apple_order = Order('apple', 1, 10)
apple_order.total()
# 10

Apart from the lack of proper documentation, there is a bug:

apple_order.quantity = -10
apple_order.total()
# -10, too good of a deal!

Instead of adding getter and setter methods, which would break the existing API, we can use a property to enforce that the quantity is never negative:

class Order:
    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self._quantity = quantity  # (1)

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        self._quantity = value  # (2)

apple_order = Order('apple', 1, 10)
apple_order.quantity = -10
# ValueError: Cannot be negative.

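We transformed quantity from a simple attribute into a non-negative property. Note on line (1) that the value is stored under the name _quantity: if the setter on line (2) assigned to self.quantity instead, it would call itself again and end in a RecursionError. Also note that because __init__ writes to _quantity directly, the check does not run when the object is constructed.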
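
To see the failure that the _quantity name prevents, consider this deliberately broken sketch (the class name BrokenOrder is made up for the example), in which the setter assigns to the property's own name:

```python
class BrokenOrder:
    def __init__(self, quantity):
        self.quantity = quantity        # goes through the setter below

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        self.quantity = value           # bug: invokes this setter again


try:
    BrokenOrder(10)
    outcome = 'no error'
except RecursionError:
    outcome = 'RecursionError'
print(outcome)
# RecursionError
```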

We forgot that the price attribute cannot be negative either. It might be tempting to just create another property for price, but remember the DRY principle: when you find yourself doing the same thing twice, it's a good sign that the reusable code should be extracted. Also, in our example, more attributes might need to be added to this class in the future. Repeating the code isn't fun for the writer or the reader. Let's see how descriptors can help us.
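
For comparison, this is roughly what the property-only version looks like once price is validated too. It is only a sketch; the class is renamed PropertyOrder here to keep it separate from the running example. The two setters differ in nothing but the attribute name.

```python
class PropertyOrder:
    def __init__(self, name, price, quantity):
        self.name = name
        self.price = price          # both assignments go through the setters
        self.quantity = quantity

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        self._price = value

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):      # same body as the price setter
        if value < 0:
            raise ValueError('Cannot be negative.')
        self._quantity = value

    def total(self):
        return self.price * self.quantity
```

Every further validated attribute would add another ten lines of the same shape.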

How to write descriptors

With the descriptors in place, our new class definition would become:

# NonNegative is defined in the next listing.
class Order:
    price = NonNegative('price')  # (3)
    quantity = NonNegative('quantity')

    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity

apple_order = Order('apple', 1, 10)
apple_order.total()
# 10
apple_order.price = -10
# ValueError: Cannot be negative.
apple_order.quantity = -10
# ValueError: Cannot be negative.

Notice the class attributes defined before the __init__ method? They look a lot like the SQLAlchemy example shown at the very beginning of this post. This is where we are heading. We need to define the NonNegative class and implement the descriptor protocol. Here's how:

class NonNegative:
    def __init__(self, name):
        self.name = name  # (4)
    def __get__(self, instance, owner):
        if instance is None:  # accessed on the class, e.g. Order.price
            return self
        return instance.__dict__[self.name]  # (5)
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        instance.__dict__[self.name] = value  # (6)

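Line (4): the name attribute is needed because when the NonNegative object is created on line (3), the assignment to the attribute named price hasn't happened yet. Thus, we need to explicitly pass the name price to the initializer so that it can be used as the key in the instance's __dict__.

Lines (5) and (6): the value is read from and written to the instance's __dict__ directly. Going through getattr(instance, self.name) or setattr(instance, self.name, value) would invoke the descriptor again and recurse without end.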
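
On Python 3.6 and later, the name does not have to be repeated by hand: when the owning class is created, Python calls __set_name__(self, owner, name) on each descriptor found in the class body. The following is a sketch of that variant, not the version used above; the class names NonNegativeAuto and AutoOrder are made up to keep it separate.

```python
class NonNegativeAuto:
    def __set_name__(self, owner, name):
        # Called once per attribute while the owner class is being created.
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        instance.__dict__[self.name] = value


class AutoOrder:
    price = NonNegativeAuto()       # no name argument needed
    quantity = NonNegativeAuto()

    def __init__(self, name, price, quantity):
        self.name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity


order = AutoOrder('apple', 1, 10)
print(order.total())                # 10
print(AutoOrder.price.name)         # price
```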

Conclusion

Python descriptors allow for powerful and flexible attribute management with new-style classes, which in Python 3 means every class. Combined with decorators, they make for elegant programming, allowing the creation of setters and getters as well as read-only attributes. They also allow you to validate attributes by value or by type whenever they are set. You can apply descriptors in many areas, but use them with discretion to avoid unnecessary complexity from overriding the normal behavior of an object.
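
As a closing sketch of the type-validation and read-only ideas mentioned above, here are two small descriptors. Typed, ReadOnly, and Product are names invented for this example, and the design (explicit name argument, value kept in the instance's __dict__) simply mirrors NonNegative.

```python
class Typed:
    """Hypothetical descriptor that accepts only values of one type."""

    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f'{self.name} must be {self.expected_type.__name__}')
        instance.__dict__[self.name] = value


class ReadOnly:
    """Hypothetical descriptor exposing a fixed value that cannot be rebound."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError('read-only attribute')


class Product:
    name = Typed('name', str)
    currency = ReadOnly('USD')

    def __init__(self, name):
        self.name = name            # validated by Typed.__set__
```

With these definitions, Product(3) raises TypeError, and assigning to the currency attribute of an instance raises AttributeError.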