Escaping the template pattern hellscape in Python

Over the years, I’ve used the template pattern¹ across multiple OO languages with varying degrees of success. It was one of the first patterns I learned in the primordial hours of my software engineering career, and for some reason, it just feels like the natural way to tackle many real-world code-sharing problems. Yet, even before I jumped on board with the composition over inheritance² camp, I couldn’t help but notice how using this particular inheritance technique spawns all sorts of design and maintenance headaches as the codebase starts to grow.

An epiphany

This isn’t an attempt to explain why you should prefer composition over inheritance (although you should), as it’s a vast topic and much has been said regarding this. Also, only after a few years of reading concomitant literatures and making enough mistakes in real-life codebases, it dawned on me that opting for inheritance as the default option leads to a fragile design. So I won’t even attempt to tackle that in a single post and will refer to a few fantastic prior arts that proselytized me to the composition cult.

The goal of this article is not to focus on the wider spectrum of how to transform subclass-based APIs to use composition but rather to zoom in specifically on the template pattern and propose an alternative way to solve a problem where this pattern most naturally manifests. In the first portion, the post will explain what the template pattern is and how it gradually leads to an intractable mess as the code grows. In the latter segments, I’ll demonstrate how I’ve designed a real-world service by adopting the obvious and natural path of inheritance-driven architecture. Then, I’ll explain how the service can be refactored to escape the quagmire that I’ve now started to refer to as the template pattern hellscape.

Only a few moons ago, while watching Hynek Schlawack’s Python 2023 talk aptly titled “Subclassing, Composition, Python, and You”³ and reading his fantastic blog post “Subclassing in Python Redux”⁴, the concept of adopting composition to gradually phase out subclass-oriented design from my code finally clicked for me. However, it’s not always obvious to me how to locate inheritance metastasis and exactly where to intervene to make the design better. This post is my attempt to distill some of my learning from those resources and focus on improving only a small part of the gamut.

The infectious template pattern

You’re consciously or subconsciously implementing the template pattern when your API design follows these steps:

You have an Abstract Base Class (ABC) with abstract methods.
The ABC also includes one or more concrete methods.
The concrete methods in the ABC depend on the concrete implementation of the abstract methods.
API users are expected to inherit from the ABC and provide concrete implementations for the abstract methods.
Users then utilize the concrete methods defined in the ABC class.

This pattern enables the sharing of concrete method implementations with subclasses. However, the concrete methods of the baseclass are only valid when the user inherits from the base and implements the abstract methods. Attempting to instantiate the baseclass without implementing the abstract methods will result in a TypeError. Only the subclass can be initialized once all the abstract methods have been implemented.

Observe this example:

from abc import ABC, abstractmethod


class Base(ABC):
    def concrete_method(self) -> None:
        # This depends on abstract_method. The user is expected to create
        # a subclass from Base and implement abstract_method.
        return self.abstract_method()

    @abstractmethod
    def abstract_method(self) -> None:
        raise NotImplementedError


class Sub(Base):
    def abstract_method(self) -> None:
        """Providing a concrete implementation for the 'abstract_method' from
        the Base class."""
        print(
            "I'm a concrete implementation of the 'abstract_method' of Base."
        )

This is how you use it:

sub = Sub()

# Notice how we're only using the 'concrete_method' defined in the Base class
sub.concrete_method()

Here, the abstract Base class is defined by inheriting from the abc.ABC class. Inside Base, there’s a concrete_method that relies on an abstract_method. The concrete_method is defined to call abstract_method, expecting that subclasses will provide their own implementation of abstract_method. If a subclass of Base fails to implement abstract_method, calling concrete_method on an instance of that subclass will raise a NotImplementedError.

The snippet also provides an example subclass called Sub, which inherits from Base. Sub overrides the abstract_method and provides its own implementation. In this case, it just prints a statement. By subclassing Base and implementing abstract_method, Sub becomes a concrete class that can be instantiated. The purpose of this design pattern is to define a common interface through the Base class, with the expectation that subclasses will implement specific behavior by overriding the abstract methods, while still providing a way to call those methods through the concrete methods defined in the baseclass. This seemingly innocuous and often convenient bi-directional relationship between the base and sub class tends to become infectious and introduces complexity into all the subclasses that inherit from the base.

The dark side of the moon

Template pattern seems like the obvious way of sharing code and it almost always is one of the first things that people learn while familiarizing themselves with how OO works in Python. Plus, it’s used extensively in the standard library. For example, in the collections.abc module, there are a few ABCs that you can subclass to build your own containers. I wrote about this⁵ a few years back. Here’s how you can subclass collections.abc.Sequence to implement a tuple-like immutable datastructure:

from typing import Any
from collections.abc import Sequence


class CustomSequence(Sequence):
    def __init__(self, *args: Any) -> None:
        self._data = list(args)

    def __getitem__(self, index: int) -> Any:
        return self._data[index]

    def __len__(self) -> int:
        return len(self._data)

You’d use the class as such:

seq = CustomSequence(1, 2, 3, 4)
assert seq[0] == 1
assert len(seq) == 4

We’re inheriting from the Sequence ABC and implementing the required abstract methods. Here’s the first issue: how do we even know which methods to implement and which methods we get for free? You can consult the documentation⁶ and learn that __getitem__ and __len__ are the abstract methods that subclasses are expected to implement. In return, the base Sequence class gives you __contains__, __iter__, __reversed__, index, and count as mixin methods. You can also print out the abstract methods by accessing the Sequence.__abstractmethod__ attribute. Sure, you’re getting a lot of concrete methods for free, but suddenly you’re dependent on some out-of-band information to learn about the behavior of your specialized CustomSequence class.

The following three sections will briefly explore the issues that deceptively creep up on your codebase when you opt for the template pattern.

Elusive public API

You’ve already seen the manifestation of this issue in the CustomSequence example. The subclass-oriented code-sharing pattern like this makes it difficult to discover the public API of your specialized class because many of its functionalities come from the concrete mixin methods provided by the base Sequence class. Now, this isn’t too terrible for a tool in the standard library since they’re usually quite well-documented, and you can always resort to inspecting the subclass instance to learn about the abstract and concrete methods.

Not all subclass-driven design is bad, and the standard library makes judicious use of the template pattern. However, in an application codebase that you might be writing, this elusive nature of the public API can start becoming recalcitrant. Your code may not be as well-documented as the standard library, or instantiating the subclass may be expensive, making introspection difficult. You’re basically trading off readability for writing ergonomics. There’s nothing wrong in doing that as long as you’re aware of the tradeoffs. All I’m trying to say is that it’s a non-ideal default.

Namespace pollution

If you introspect the previously defined subclass with dir(CustomSequence), you’ll get the following result. I’ve removed the common attributes that every class inherits from object for brevity and annotated the abstract and mixin method names for clarity.

[
    "__abstractmethods__",  # Allows you to list out the abstract methods
    "__class_getitem__",  # Used for generic typing
    "__contains__",  # Provided by the base
    "__getitem__",  # You implement
    "__iter__",  # Provided by the base
    "__len__",  # You implement
    "__reversed__",  # Provided by the base
    "count",  # Provided by the base
    "index",  # Provided by the base
]

From the above list, it’s evident that all the methods from the baseclass and the subclass live in the same namespace. The moment you’re inheriting from some baseclass, you have no control over what that class is bringing over to your subclass’s namespace and effectively polluting it. It’s like a more sneaky version of from foo import *.

This flat namespacing makes it hard to understand which method is coming from where. In the above case, without the annotations, you’d have a hard time discerning between the methods that you implemented and the alien methods from the baseclass. This isn’t a cardinal sin in the Python realm if that’s what you want, but it’s certainly a suboptimal default.

SRP violation & goose chase program flow

The biggest complaint I have against the template pattern is how it encourages the baseclass to do too many things at once. I can endure poor discoverability of public APIs and namespace pollution to some extent, but when a class tries to do too many things simultaneously, it eventually exhibits the tendency to give birth to God Objects⁷; breaching the SRP (Single Responsibility Principle).

Intentionally violating the SRP rarely fosters good results, and in this case, the baseclass defines both concrete and abstract methods. Not only that, the base expects the subclasses to implement those abstract methods so that it can use them in its concrete method implementation. Just reading back this sentence is giving me a headache. If you design your APIs in this manner, you’ll have to carefully read through both the sub and the base class implementations to understand how this intricate bi-directional thread is woven into your program flow. This seems easy enough in a simple example where you can see both the base and the sub class in a single snippet, but it quickly gets out of hand when large base and sub classes are scattered across multiple modules. You’ll need to perform the mental gymnastics of tracking this back-and-forth logic, aka the abominable goose chase program flow.

The disease and the cure

Let’s examine a specific design problem and observe how it can be modeled using the template pattern. Then, we’ll explore an alternative solution that replaces the inheritance-driven design with composition.

Designing with template pattern

The following code snippet mimics a real-world webhook⁸ dispatcher that takes a message and posts it to a callback URL via HTTP POST request. First, we’ll commit the cardinal sin of modeling the domain with the template pattern and then we’ll try to find a way out of the quandary. Here it goes:

from dataclasses import dataclass, field, asdict
from uuid import uuid4
from abc import ABC, abstractmethod


@dataclass(frozen=True)
class Message:
    ref: str = field(default_factory=lambda: str(uuid4()))
    body: str = ""


class BaseWebhook(ABC):
    def send(self) -> None:
        url = self.get_url()
        data = self.get_message()
        print(f"sending {data} to {url}")

    @abstractmethod
    def get_message(self) -> dict[str, str]:
        raise NotImplementedError

    @abstractmethod
    def get_url(self) -> str:
        raise NotImplementedError


class Webhook(BaseWebhook):
    def __init__(self, message: Message) -> None:
        self.message = message

    def get_message(self) -> dict[str, str]:
        # Assume that we're doing other side effects and adding more data in
        # runtime
        return asdict(self.message)

    def get_url(self) -> str:
        return "https://webhook.site/foo"

Here’s how you’ll orchestrate the classes:

message = Message(body="Hello World")
webhook = Webhook(message)

# This just prints:
# sending
# {
#   'ref': '4635cfe0-825e-4f40-9c7b-04275b1c809e',
#   'body': 'Hello World'
# } to https://webhook.site/foo
webhook.send()

We start by defining an immutable Message container to store our webhook message. Next, we write an abstract BaseWebhook class that inherits from abc.ABC. This class serves as a template for the webhook functionality and declares two abstract methods: get_message() and get_url(). Type annotations are used to indicate the return types of these methods. Any subclasses derived from BaseWebhook must implement these abstract methods. The send() method, implemented in the baseclass, uses the concrete implementations of the abstract methods to perform webhook dispatching. In this case, we simulate the HTTP POST functionality by printing the message and destination URL.

The Webhook class is a subclass of BaseWebhook and provides concrete implementations of the abstract methods. It accepts a single Message object as a parameter in its constructor. The get_url() method returns a fixed URL, while the get_message() method converts the Message object into a serializable dictionary representation using dataclasses.asdict().

In this structure, the user of the Webhook class only needs to initialize the class and call the send() method on the instance. The send() method, however, lives in the BaseWebhook class, not the specialized Webhook subclass. It utilizes the concrete implementations of abstract methods to deliver the send() functionality. In the following section, we’ll explore a method to avoid this weird back-and-forth program flow.

Finding salvation in strategy pattern

There are multiple ways and conflicting opinions on how to get out of the hole we’ve dug for ourselves. Some even like to spend more time prattling around the philosophy of how OO is terrible and how, if it weren’t for Java’s huge influence on Python, we wouldn’t be in this mess, rather than attempting to solve the actual problem. So instead of trying to cover every possible solution under the sun, I’ll go through the one that has worked for me fairly well.

We’ll refactor the code in the previous section to take advantage of composition and structural subtyping⁹ support in Python. Long story short, structural subtyping refers to the ability to ensure type safety based on the structure or shape of an object rather than its explicit inheritance hierarchy. This allows us to define and enforce contracts based on the presence of specific attributes or methods, rather than relying on a specific class or inheritance relationship.

This is achieved through the use of the typing.Protocol class introduced in Python 3.8. By defining a protocol using the typing.Protocol class, we can specify the expected attributes and methods that an object should have to satisfy the protocol. Any object that matches the structure defined by the protocol can be treated as if it conforms to that protocol, enabling more flexible and dynamic type-checking in Python. This conformity is usually checked by a type-checking tool like mypy¹⁰. If you want to learn more, check out Glyph’s post titled “I Want a New Duck”¹¹. Here’s how I refactored it:

from dataclasses import dataclass, field, asdict
from uuid import uuid4
from typing import Protocol


@dataclass(frozen=True)
class Message:
    ref: str = field(default_factory=lambda: str(uuid4()))
    body: str = ""


class Retriever(Protocol):
    def get_message(self, message: Message) -> dict[str, str]: ...

    def get_url(self) -> str: ...


class Dispatcher(Protocol):
    def dispatch(self, url: str, data: dict[str, str]) -> None: ...


class HookRetriever:
    def get_message(self, message: Message) -> dict[str, str]:
        # Assume that we're doing other side effects and adding more data in
        # runtime
        return asdict(message)

    def get_url(self) -> str:
        return "https://webhook.site/foo"


class HookDispatcher:
    def dispatch(self, url: str, data: dict[str, str]) -> None:
        print(f"Sending {data} to {url}")


@dataclass
class Webhook:
    message: Message
    retriever: Retriever
    dispatcher: Dispatcher

    def send(self) -> None:
        url = self.retriever.get_url()
        data = self.retriever.get_message(self.message)
        return self.dispatcher.dispatch(url, data)

The classes can be wired together as follows:

message = Message(body="Hello World")
retriever = HookRetriever()
dispatcher = HookDispatcher()

webhook = Webhook(message, retriever, dispatcher)

# This prints the same thing as before:
# sending
# {
#   'ref': '4635cfe0-825e-4f40-9c7b-04275b1c809e',
#   'body': 'Hello World'
# } to https://webhook.site/foo
webhook.send()

We’ve agreed that the BaseWebhook class tries to do too many things at once. The first step to disentangling a class is to identify its responsibilities and create multiple component classes where each new class will only have one responsibility. Here, the base class retrieves the necessary data and dispatches the webhook using that data at the same time. The Retriever and Dispatcher protocol classes will formalize the shape and structure of those component classes. These protocols work like the ABCs, but you don’t need to inherit from them to ensure interface conformity; the type checker will do it for you.

The Retriever class has two methods: get_message and get_url, which fetch message and URL data respectively. Similarly, the Dispatcher protocol has only a dispatch method that sends the webhook. In either case, the protocol methods don’t implement anything; they work just like the abstract methods of the ABCs, and the protocol classes themselves can’t be instantiated. Then the HookRetriever and HookDispatcher components implicitly implement the protocol classes. Notice that neither of the components inherits from the protocol classes. The type checker will ensure that they conform to the defined protocols.

The question is, how does the type checker know which class is supposed to conform to which protocol? The answer lies in the final Webhook class. We define a final dataclass that takes instances of the Message, Retriever, and Dispatcher classes in the constructor. Notice that while adding type hints to the retriever and dispatcher parameters of the dataclass constructor, we’re using the protocol classes instead of the concrete ones. This is how the type checker knows that whatever instance is passed to the retriever and dispatcher parameters must conform to the Retriever and Dispatcher protocols, respectively. Note that we’ve completely eliminated subclassing from our public API. Injecting dependencies in this manner is also known as the strategy pattern¹².

The Webhook class now has a hierarchical namespace instead of a flat one, unlike our inheritance-based friend. You’ll have to be explicit about where a method is coming from when calling it. So if you need to access the fetched URL, you’ll need to explicitly call self.retriever.get_url(). The self namespace has only one user-defined public method, .send(), which can be called to dispatch the webhook from a Webhook instance. This also means you no longer have to deal with goose chase program flow since all the dependencies flow towards the final Webhook class.

On the flip side, you’ll need to do more work while initializing the Webhook class. The Message, HookRetriever, and HookDispatcher classes need to be instantiated first and then passed explicitly to the constructor of the Webhook class to instantiate it. You’re basically trading writing ergonomics for readability. Instantiating the template subclass was a lot easier for sure.

Tradeoffs

Opting in for composition isn’t free, as it usually leads to more verbose code orchestration. If you’re passing all the dependencies explicitly, as shown above, wiring the code together will be more complex. However, in return, you get a more readable and testable design substrate. So, I’m more than happy to make the tradeoff. Additionally, avoiding namespace pollution means that one attribute access has now turned into two or more attribute accesses, which can cause performance issues in tight conditions.

Moreover, you can’t just take your inheritance-heavy API and suddenly turn it into a composable one. It usually requires planning and designing from the ground up, where you might decide that the ROI isn’t good enough to justify the effort of refactoring. Plus, in a language like Python, you can’t always escape inheritance, nor should you try to do so.

Yet behold, it need not be the customary stratagem that thou graspest at each moment thine heart yearns to commune code amidst classes.

~~~

An epiphany#

The infectious template pattern#

The dark side of the moon#

Elusive public API#

Namespace pollution#

SRP violation & goose chase program flow#

The disease and the cure#

Designing with template pattern#

Finding salvation in strategy pattern#

Tradeoffs#

Recent posts