
Unlock the Power of Python Descriptors: Master Attribute Control and Validation

This article is a guide to Python descriptors. It covers the protocol, the two descriptor types, basic usage, practical applications (type checking, lazy loading, observer patterns, ORM-like fields), and best practices for safe and efficient attribute management.

Python Crawling & Data Mining

1. What Is a Descriptor?

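A descriptor is an object whose class defines at least one of the special methods __get__, __set__, or __delete__. When such an object is stored as a class attribute, Python routes reads, writes, and deletions of that attribute on instances through these methods.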

1.1 Descriptor Protocol

A descriptor implements one or more of the following methods; it does not need all three:

class Descriptor:
    def __get__(self, instance, owner):
        """Called when the attribute is accessed"""
        pass

    def __set__(self, instance, value):
        """Called when the attribute is set"""
        pass

    def __delete__(self, instance):
        """Called when the attribute is deleted"""
        pass
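
To make the call pattern concrete, here is a minimal sketch that records every protocol call it receives. The names Recorder, Widget, and calls are invented for this illustration and are not part of any library.

```python
class Recorder:
    """Hypothetical descriptor that logs each protocol call it receives."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        # instance is None when the attribute is read from the class itself
        self.calls.append(("get", instance, owner))
        return self

    def __set__(self, instance, value):
        self.calls.append(("set", instance, value))

    def __delete__(self, instance):
        self.calls.append(("delete", instance))


class Widget:
    attr = Recorder()


w = Widget()
rec = Widget.attr      # __get__(None, Widget)
w.attr                 # __get__(w, Widget)
w.attr = 1             # __set__(w, 1)
del w.attr             # __delete__(w)

for call in rec.calls:
    print(call[0])     # get, get, set, delete (one per line)
```

Note that the first recorded call has instance set to None, which is why most descriptors in this article start __get__ with an `if instance is None` check.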

1.2 Types of Descriptors

Data descriptor : implements __set__ and/or __delete__.

Non‑data descriptor : implements only __get__.
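
One way to check which category an object falls into is the standard library's inspect.isdatadescriptor. The classes GetOnly and GetAndSet below are illustrative names only.

```python
import inspect


class GetOnly:
    """Non-data descriptor: defines __get__ only."""

    def __get__(self, instance, owner=None):
        return "non-data"


class GetAndSet:
    """Data descriptor: defines __get__ and __set__."""

    def __get__(self, instance, owner=None):
        return "data"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


print(inspect.isdatadescriptor(GetOnly()))     # False
print(inspect.isdatadescriptor(GetAndSet()))   # True
print(inspect.isdatadescriptor(property()))    # True
```

The built-in property is a data descriptor even without a setter, which is why an instance attribute of the same name cannot shadow it.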

2. Basic Usage of Descriptors

2.1 Simple Read‑Only Descriptor

class ReadOnlyDescriptor:
    """Read‑only descriptor example"""
    def __init__(self, initial_value):
        self._value = initial_value

    def __get__(self, instance, owner):
        return self._value

    def __set__(self, instance, value):
        raise AttributeError("Read‑only attribute cannot be modified")

class MyClass:
    read_only_attr = ReadOnlyDescriptor("initial value")

obj = MyClass()
print(obj.read_only_attr)  # -> initial value
# obj.read_only_attr = "new"  # raises AttributeError

2.2 Full Read‑Write Descriptor

class SimpleDescriptor:
    """Simple read‑write descriptor"""
    def __init__(self, name):
        self.name = name
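        # Per-instance storage keyed by id(instance). Entries are never removed
        # automatically and ids can be reused; see 6.1 for a weakref-based store.
        self._values = {}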

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(id(instance))

    def __set__(self, instance, value):
        self._values[id(instance)] = value

    def __delete__(self, instance):
        if id(instance) in self._values:
            del self._values[id(instance)]

class Person:
    name = SimpleDescriptor("name")
    age = SimpleDescriptor("age")

p1 = Person()
p1.name = "Alice"
p1.age = 25
p2 = Person()
p2.name = "Bob"
p2.age = 30
print(p1.name, p1.age)  # Alice 25
print(p2.name, p2.age)  # Bob 30
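
An alternative to passing the attribute name by hand and keying a dictionary on id(instance) is the __set_name__ hook (Python 3.6+), which lets the descriptor learn its own name and keep the value on the instance. The following is a sketch under that assumption; NamedAttribute and private_name are names chosen for this example.

```python
class NamedAttribute:
    """Hypothetical descriptor that stores its value on the instance itself."""

    def __set_name__(self, owner, name):
        # Called once, when the owning class is created
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name, None)

    def __set__(self, instance, value):
        setattr(instance, self.private_name, value)

    def __delete__(self, instance):
        delattr(instance, self.private_name)


class Person:
    name = NamedAttribute()
    age = NamedAttribute()


p = Person()
p.name = "Alice"
p.age = 25
print(p.name, p.age)   # Alice 25
print(vars(p))         # {'_name': 'Alice', '_age': 25}
```

Because the value lives in the instance's own __dict__, it is released together with the instance and no id-keyed bookkeeping is needed.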

3. Practical Application Scenarios

3.1 Type‑Checking Descriptor

class TypedDescriptor:
    """Descriptor that enforces a specific type"""
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type
        self._values = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(id(instance))

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(f"{self.name} must be of type {self.expected_type.__name__}")
        self._values[id(instance)] = value

    def __delete__(self, instance):
        if id(instance) in self._values:
            del self._values[id(instance)]

class User:
    name = TypedDescriptor("name", str)
    age = TypedDescriptor("age", int)
    score = TypedDescriptor("score", (int, float))

user = User()
user.name = "Alice"      # OK
user.age = 25            # OK
user.score = 95.5        # OK
# user.age = "twenty"   # raises TypeError

3.2 Lazy‑Loading Descriptor

import time

class LazyLoadDescriptor:
    """Descriptor that computes the value on first access and caches it"""
    def __init__(self, func):
        self.func = func
        self._cache = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        instance_id = id(instance)
        if instance_id not in self._cache:
            print(f"Computing {self.func.__name__} ...")
            time.sleep(1)  # simulate expensive computation
            self._cache[instance_id] = self.func(instance)
        return self._cache[instance_id]

class DataProcessor:
    def __init__(self, data):
        self.data = data

    @LazyLoadDescriptor
    def processed_data(self):
        """Simulate an expensive data processing step"""
        return [x * 2 for x in self.data]

processor = DataProcessor([1, 2, 3, 4, 5])
print("First access:")
result1 = processor.processed_data  # triggers computation
print("Second access:")
result2 = processor.processed_data  # returns cached value
print("Results are the same:", result1 is result2)
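
A common variation caches the result in the instance's __dict__ instead of inside the descriptor. Because the descriptor defines only __get__, the stored instance attribute takes over on later lookups. This is a sketch of that idea; lazy_attribute and runs are illustrative names.

```python
class lazy_attribute:
    """Hypothetical non-data descriptor that caches into the instance __dict__."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # The instance attribute now shadows this non-data descriptor,
        # so later lookups never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


class DataProcessor:
    def __init__(self, data):
        self.data = data
        self.runs = 0   # counts how often the expensive step executes

    @lazy_attribute
    def processed_data(self):
        self.runs += 1
        return [x * 2 for x in self.data]


processor = DataProcessor([1, 2, 3])
first = processor.processed_data
second = processor.processed_data
print(first is second, processor.runs)   # True 1
```

The standard library offers functools.cached_property (Python 3.8+) for the same purpose, so a hand-written version is mainly useful for learning or for custom behavior.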

3.3 Observer‑Pattern Descriptor

class ObservableDescriptor:
    """Descriptor that notifies observers when the value changes"""
    def __init__(self, name):
        self.name = name
        self._values = {}
        self._observers = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(id(instance))

    def __set__(self, instance, value):
        old_value = self._values.get(id(instance))
        self._values[id(instance)] = value
        for callback in self._observers.get(id(instance), []):
            callback(old_value, value)

    def add_observer(self, instance, callback):
        self._observers.setdefault(id(instance), []).append(callback)

    def remove_observer(self, instance, callback):
        if id(instance) in self._observers:
            self._observers[id(instance)].remove(callback)

class Settings:
    theme = ObservableDescriptor("theme")
    language = ObservableDescriptor("language")

    @staticmethod
    def theme_changed(old, new):
        print(f"Theme changed from {old} to {new}")

    @staticmethod
    def language_changed(old, new):
        print(f"Language changed from {old} to {new}")

settings = Settings()
# Reach the descriptor through the class: settings.theme would call __get__
# and return the stored value (None at this point), not the descriptor.
Settings.theme.add_observer(settings, Settings.theme_changed)
Settings.language.add_observer(settings, Settings.language_changed)
settings.theme = "dark"      # triggers theme_changed
settings.language = "zh"    # triggers language_changed

4. Descriptors in Framework‑Like Scenarios

4.1 ORM‑Style Field Descriptor

class ORMField:
    """Descriptor that mimics a Django‑style ORM field"""
    def __init__(self, field_type, default=None, null=False):
        self.field_type = field_type
        self.default = default
        self.null = null
        self._values = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        instance_id = id(instance)
        if instance_id not in self._values:
            if self.default is not None:
                return self.default
            if not self.null:
                raise ValueError("This field cannot be null")
            return None
        return self._values[instance_id]

    def __set__(self, instance, value):
        if value is None:
            if not self.null:
                raise ValueError("This field cannot be null")
            # None is allowed for nullable fields, so skip the type check
            self._values[id(instance)] = None
            return
        if not isinstance(value, self.field_type):
            raise TypeError(f"Value must be of type {self.field_type.__name__}")
        self._values[id(instance)] = value

class Model:
    def __init__(self, **kwargs):
        for field_name in self._get_fields():
            value = kwargs.get(field_name)
            if value is not None:
                setattr(self, field_name, value)

    def _get_fields(self):
        return [name for name in dir(self.__class__) if isinstance(getattr(self.__class__, name), ORMField)]

    def __repr__(self):
        fields = self._get_fields()
        values = {f: getattr(self, f) for f in fields}
        return f"{self.__class__.__name__}({values})"

class User(Model):
    name = ORMField(str, null=False)
    age = ORMField(int, default=0)
    email = ORMField(str, null=True)

u1 = User(name="Alice", age=25)
print(u1)  # User({'age': 25, 'email': None, 'name': 'Alice'}); dir() lists field names alphabetically
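
If declaration order matters, one option is to collect the fields when the subclass is created rather than scanning dir() on every call. The sketch below assumes a simplified Field class and a _fields list, both invented for this example; it does not handle fields inherited from a parent model.

```python
class Field:
    """Hypothetical field that learns its own name via __set_name__."""

    def __init__(self, field_type, default=None):
        self.field_type = field_type
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        if not isinstance(value, self.field_type):
            raise TypeError(f"{self.name} must be of type {self.field_type.__name__}")
        instance.__dict__[self.name] = value


class Model:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # vars(cls) preserves definition order, unlike dir(), which sorts names
        cls._fields = [n for n, v in vars(cls).items() if isinstance(v, Field)]

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if name not in self._fields:
                raise TypeError(f"Unknown field: {name}")
            setattr(self, name, value)

    def __repr__(self):
        values = {name: getattr(self, name) for name in self._fields}
        return f"{type(self).__name__}({values})"


class User(Model):
    name = Field(str)
    age = Field(int, default=0)


print(User(name="Alice"))   # User({'name': 'Alice', 'age': 0})
```

Storing the value under the field's own name in the instance __dict__ is safe here because Field defines __set__, which makes it a data descriptor that always wins over the instance entry.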

4.2 Property‑Access Control Descriptor

class ProtectedDescriptor:
    """Read‑only or write‑protected attribute descriptor"""
    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self._values = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(id(instance))

    def __set__(self, instance, value):
        if self.read_only:
            raise AttributeError(f"{self.name} is read‑only")
        self._values[id(instance)] = value

    def __delete__(self, instance):
        raise AttributeError(f"Cannot delete {self.name}")

class Config:
    api_key = ProtectedDescriptor("api_key", read_only=True)
    timeout = ProtectedDescriptor("timeout")

cfg = Config()
cfg.timeout = 30
# Config.api_key._values[id(cfg)] = "secret"  # set internally
print(cfg.timeout)  # 30
# cfg.api_key = "new"  # raises AttributeError

5. Descriptor Lookup Order

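Understanding Python’s attribute‑lookup order is essential when working with descriptors. When an attribute is read from an instance, Python checks the following sources in order and stops at the first match: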

1. Data descriptor (highest priority)

2. Instance attribute

3. Non‑data descriptor

4. Class attribute

5. __getattr__ (if defined)

class Demo:
    def __init__(self):
        self.instance_attr = "instance value"

    @property
    def computed_attr(self):
        return "computed"

    class_attr = "class attribute"

class DataDescriptor:
    def __get__(self, instance, owner):
        return "data descriptor"
    def __set__(self, instance, value):
        pass

class NonDataDescriptor:
    def __get__(self, instance, owner):
        return "non‑data descriptor"

class TestClass:
    data_desc = DataDescriptor()
    non_data_desc = NonDataDescriptor()

obj = TestClass()
obj.instance_attr = "instance attribute value"
print(obj.data_desc)        # data descriptor
print(obj.instance_attr)    # instance attribute value
print(obj.non_data_desc)    # non‑data descriptor
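
The priority rules become visible only when an instance attribute and a descriptor share the same name. The following sketch forces that collision by writing directly into the instance __dict__; Probe, DataDesc, and NonDataDesc are illustrative names.

```python
class DataDesc:
    def __get__(self, instance, owner=None):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDesc:
    def __get__(self, instance, owner=None):
        return "from non-data descriptor"


class Probe:
    data = DataDesc()
    non_data = NonDataDesc()

    def __getattr__(self, name):
        # Reached only after every other lookup step has failed
        return f"fallback for {name}"


obj = Probe()
# Write straight into the instance __dict__, bypassing __set__
obj.__dict__["data"] = "from instance"
obj.__dict__["non_data"] = "from instance"

print(obj.data)       # from data descriptor
print(obj.non_data)   # from instance
print(obj.missing)    # fallback for missing
```

The data descriptor wins over the instance entry, the instance entry wins over the non-data descriptor, and __getattr__ runs only for a name found nowhere else.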

6. Best Practices and Pitfalls

6.1 Use weakref to Avoid Memory Leaks

import weakref

class SafeDescriptor:
    """Descriptor that stores values in a WeakKeyDictionary"""
    def __init__(self):
        self._data = weakref.WeakKeyDictionary()

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._data.get(instance)

    def __set__(self, instance, value):
        self._data[instance] = value

class MyClass:
    attr = SafeDescriptor()

obj = MyClass()
obj.attr = "value"
print(obj.attr)  # value
# When obj is garbage‑collected, the entry disappears automatically.
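
The cleanup can be observed directly. This sketch repeats the same storage idea under the illustrative names WeakStore and Session, then checks the dictionary size before and after the instance is dropped.

```python
import gc
import weakref


class WeakStore:
    """Same idea as SafeDescriptor; the name is illustrative."""

    def __init__(self):
        self._data = weakref.WeakKeyDictionary()

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self._data.get(instance)

    def __set__(self, instance, value):
        self._data[instance] = value


class Session:
    token = WeakStore()


s = Session()
s.token = "abc"
print(len(Session.token._data))   # 1

del s
gc.collect()   # CPython usually frees the object at `del`; collect() covers other cases
print(len(Session.token._data))   # 0
```

One caveat: WeakKeyDictionary needs keys that are hashable and weak-referenceable, so this approach does not work for classes that define __slots__ without "__weakref__" or that define __eq__ without __hash__.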

6.2 Debugging Descriptors

class DebugDescriptor:
    def __init__(self, name):
        self.name = name
        self._values = {}

    def __get__(self, instance, owner):
        print(f"Getting {self.name}")
        if instance is None:
            return self
        return self._values.get(id(instance))

    def __set__(self, instance, value):
        print(f"Setting {self.name} = {value}")
        self._values[id(instance)] = value

class DebugClass:
    attr = DebugDescriptor("attr")

obj = DebugClass()
obj.attr = "test"   # prints Setting attr = test
value = obj.attr      # prints Getting attr

6.3 Descriptors and Inheritance

class BaseDescriptor:
    def __init__(self, name):
        self.name = name
        self._values = {}
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(id(instance), f"default {self.name}")
    def __set__(self, instance, value):
        self._values[id(instance)] = value

class BaseClass:
    attr = BaseDescriptor("attr")

class ChildClass(BaseClass):
    pass

b = BaseClass()
c = ChildClass()
b.attr = "base value"
c.attr = "child value"
print(b.attr)  # base value
print(c.attr)  # child value
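
Two details of inheritance are easy to miss: a subclass shares the very same descriptor object as its base, and redefining the name in a subclass replaces the descriptor for that subclass. A small sketch, with Upper, Base, Child, and Plain as made-up names:

```python
class Upper:
    """Hypothetical descriptor used only to show inheritance behaviour."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get("label", "").upper()

    def __set__(self, instance, value):
        instance.__dict__["label"] = value


class Base:
    label = Upper()


class Child(Base):
    pass


class Plain(Base):
    label = "overridden"   # an ordinary class attribute replaces the descriptor


print(Child.label is Base.label)   # True: one descriptor object is shared
c = Child()
c.label = "hi"
print(c.label)                     # HI
print(Plain().label)               # overridden
```

Since the descriptor object is shared, any state kept on the descriptor itself (such as a _values dictionary) is shared across the whole class hierarchy as well.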

7. Combining Descriptors with Decorators

from functools import wraps

class MethodDescriptor:
    """Descriptor that turns a plain function into a bound method"""
    def __init__(self, func):
        self.func = func
        wraps(func)(self)

    def __get__(self, instance, owner):
        if instance is None:
            return self.func
        return self.func.__get__(instance, owner)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

def method_decorator(func):
    """Convert a function into a MethodDescriptor"""
    return MethodDescriptor(func)

class MyClass:
    @method_decorator
    def my_method(self):
        return "method called"

obj = MyClass()
print(obj.my_method())  # method called
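
The delegation to self.func.__get__ works because plain Python functions are themselves non-data descriptors. The short sketch below performs by hand the binding step that attribute lookup normally does; Greeter is an illustrative class.

```python
class Greeter:
    def hello(self):
        return "hello"


g = Greeter()
func = Greeter.__dict__["hello"]      # the raw function stored on the class
bound = func.__get__(g, Greeter)      # what attribute lookup does for g.hello

print(type(func).__name__)            # function
print(bound())                        # hello
print(bound.__self__ is g)            # True
```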

8. Summary

Descriptors are a powerful feature of Python that provide fine‑grained control over attribute access. They enable type validation, lazy loading, observer patterns, and can be used to build framework‑level functionality such as ORM fields. When used wisely, they make code more robust and expressive.

Key Advantages

✅ Precise control of attribute get, set, and delete operations

✅ Reusable type validation and constraints, defined once and applied to many attributes

✅ Support for lazy computation and caching

✅ Enable observer‑pattern notifications

✅ Useful for building framework‑level features (e.g., ORM)

Typical Use Cases

🔹 Enforcing or converting attribute values

🔹 Delaying expensive calculations or caching results

🔹 Observing attribute changes

🔹 Implementing DSLs or framework internals

Things to Watch Out For

⚠️ Avoid memory leaks by using weakref when storing per‑instance data

⚠️ Understand Python’s attribute‑lookup order

⚠️ Do not over‑engineer; keep code readable

Written by

Python Crawling & Data Mining
