Python

Hierarchical rate limiting with Redis sorted sets

Recently at work, we ran into this problem: We needed to send Slack notifications for specific events but had to enforce rate limits to avoid overwhelming the channel. Here’s how the limits worked: Global limit: Max 100 requests every 30 minutes. Category limit: Each event type (e.g., errors, warnings) capped at 10 requests per 30 minutes. Now, imagine this: There are 20 event types. Each type hits its 10-notification limit in 30 minutes. That’s 200 requests total, but the global limit only allows 100. So, 100 requests must be dropped—even if some event types still have room under their individual caps. This created a hierarchy of limits: ...

Running only a single instance of a process

I’ve been having a ton of fun fiddling with Tailscale1 over the past few days. While setting it up on a server, I came across this shell script2 that configures the ufw firewall on Linux to ensure direct communication across different nodes in my tailnet. It has the following block of code that I found interesting (added comments for clarity): #!/usr/bin/env bash # Define the path for the PID file, using the script's name to ensure uniqueness PIDFILE="/tmp/$(basename "${BASH_SOURCE[0]%.*}.pid")" # Open file descriptor 200 for the PID file exec 200>"${PIDFILE}" # Try to acquire a non-blocking lock; exit if the script is already running flock -n 200 \ || { echo "${BASH_SOURCE[0]} script is already running. Aborting..."; exit 1; } # Store the current process ID (PID) in the lock file for reference PID=$$ echo "${PID}" 1>&200 # Do work (in the original script, real work happens here) sleep 999 Here, flock is a Linux command that ensures only one instance of the script runs at a time by locking a specified file (e.g., PIDFILE) through a file descriptor (e.g., 200). If another process already holds the lock, the script either waits or exits immediately. Above, it bails with an error message and exit code 1. ...

Injecting Pytest fixtures without cluttering test signatures

Sometimes, when writing tests in Pytest, I find myself using fixtures that the test function/method doesn’t directly reference. Instead, Pytest runs the fixture, and the test function implicitly leverages its side effects. For example: import os from collections.abc import Iterator from unittest.mock import Mock, patch import pytest # Define an implicit environment mock fixture that patches os.environ @pytest.fixture def mock_env() -> Iterator[None]: with patch.dict("os.environ", {"IMPLICIT_KEY": "IMPLICIT_VALUE"}): yield # Define an explicit service mock fixture @pytest.fixture def mock_svc() -> Mock: service = Mock() service.process.return_value = "Explicit Mocked Response" return service # IDEs tend to dim out unused parameters like mock_env def test_stuff(mock_svc: Mock, mock_env: Mock) -> None: # Use the explicit mock response = mock_svc.process() assert response == "Explicit Mocked Response" mock_svc.process.assert_called_once() # Assert the environment variable patched by mock_env assert os.environ["IMPLICIT_KEY"] == "IMPLICIT_VALUE" In the test_stuff function above, we directly use the mock_svc fixture but not mock_env. Instead, we expect Pytest to run mock_env, which modifies the environment variables. This works, but IDEs often mark mock_env as an unused parameter and dims it out. ...

Explicit method overriding with @typing.override

Although I’ve been using Python 3.12 in production for nearly a year, one neat feature in the typing module that escaped me was the @override decorator. Proposed in PEP-6981, it’s been hanging out in typing_extensions for a while. This is one of those small features you either don’t care about or get totally psyched over. I’m definitely in the latter camp. In languages like C#, Java, and Kotlin, explicit overriding is required. For instance, in Java, you use @Override to make it clear you’re overriding a method in a sub class. If you mess up the method name or if the method doesn’t exist in the superclass, the compiler throws an error. Now, with Python’s @override decorator, we get similar benefits—though only if you’re using a static type checker. ...

Quicker startup with module-level getattr

This morning, someone on Twitter pointed me to PEP 5621, which introduces __getattr__ and __dir__ at the module level. While __dir__ helps control which attributes are printed when calling dir(module), __getattr__ is the more interesting addition. The __getattr__ method in a module works similarly to how it does in a Python class. For example: class Cat: def __getattr__(self, name: str) -> str: if name == "voice": return "meow!!" raise AttributeError(f"Attribute {name} does not exist") # Try to access 'voice' on Cat cat = Cat() cat.voice # Prints "meow!!" # Raises AttributeError: Attribute something_else does not exist cat.something_else In this class, __getattr__ defines what happens when specific attributes are accessed, allowing you to manage how missing attributes behave. Since Python 3.7, you can also define __getattr__ at the module level to handle attribute access on the module itself. ...

Taming parametrize with pytest.param

I love @pytest.mark.parametrize1—so much so that I sometimes shoehorn my tests to fit into it. But the default style of writing tests with parametrize can quickly turn into an unreadable mess as the test complexity grows. For example: import pytest from math import atan2 def polarify(x: float, y: float) -> tuple[float, float]: r = (x**2 + y**2) ** 0.5 theta = atan2(y, x) return r, theta @pytest.mark.parametrize( "x, y, expected", [ (0, 0, (0, 0)), (1, 0, (1, 0)), (0, 1, (1, 1.5707963267948966)), (1, 1, (2**0.5, 0.7853981633974483)), (-1, -1, (2**0.5, -2.356194490192345)), ], ) def test_polarify(x: float, y: float, expected: tuple[float, float]) -> None: # pytest.approx helps us ignore floating point discrepancies assert polarify(x, y) == pytest.approx(expected) The polarify function converts Cartesian coordinates to polar coordinates. We’re using @pytest.mark.parametrize in its standard form to test different conditions. ...

Log context propagation in Python ASGI apps

Let’s say you have a web app that emits log messages from different layers. Your log shipper collects and sends these messages to a destination like Datadog where you can query them. One common requirement is to tag the log messages with some common attributes, which you can use later to query them. In distributed tracing, this tagging is usually known as context propagation1, where you’re attaching some contextual information to your log messages that you can use later for query purposes. However, if you have to collect the context at each layer of your application and pass it manually to the downstream ones, that’d make the whole process quite painful. ...

Please don't hijack my Python root logger

With the recent explosion of LLM tools, I often like to kill time fiddling with different LLM client libraries and SDKs in one-off scripts. Lately, I’ve noticed that some newer tools frequently mess up the logger settings, meddling with my application logs. While it’s less common in more seasoned libraries, I guess it’s worth rehashing why hijacking the root logger isn’t a good idea when writing libraries or other forms of reusable code. ...

TypeIs does what I thought TypeGuard would do in Python

The handful of times I’ve reached for typing.TypeGuard in Python, I’ve always been confused by its behavior and ended up ditching it with a # type: ignore comment. For the uninitiated, TypeGuard allows you to apply custom type narrowing1. For example, let’s say you have a function named pretty_print that accepts a few different types and prints them differently onto the console: from typing import assert_never def pretty_print(val: int | float | str) -> None: if isinstance(val, int): # assert_type(val, int) print(f"Integer: {val}") elif isinstance(val, float): # assert_type(val, float) print(f"Float: {val}") elif isinstance(val, str): # assert_type(val, str) print(f"String: {val}") else: assert_never(val) If you run it through mypy, in each branch, the type checker automatically narrows the type and knows exactly what the type of val is. You can test the narrowed type in each branch with the typing.assert_type function. ...

Patching pydantic settings in pytest

I’ve been a happy user of pydantic1 settings to manage all my app configurations since the 1.0 era. When pydantic 2.0 was released, the settings portion became a separate package called pydantic_settings2. It does two things that I love: it automatically reads the environment variables from the .env file and allows you to declaratively convert the string values to their desired types like integers, booleans, etc. Plus, it lets you override the variables defined in .env by exporting them in your shell. ...

Eschewing black box API calls

I love dynamically typed languages as much as the next person. They let us make ergonomic API calls like this: import httpx # Sync call for simplicity r = httpx.get("https://dummyjson.com/products/1").json() print(r["id"], r["title"], r["description"]) or this: fetch("https://dummyjson.com/products/1") .then((res) => res.json()) .then((json) => console.log(json.id, json.type, json.description)); In both cases, running the snippets will return: 1 'iPhone 9' 'An apple mobile which is nothing like apple' Unless you’ve worked with a statically typed language that enforces more constraints, it’s hard to appreciate how incredibly convenient it is to be able to call and use an API endpoint without having to deal with types or knowing anything about its payload structure. You can treat the API response as a black box and deal with everything in runtime. ...

Statically enforcing frozen data classes in Python

You can use @dataclass(frozen=True) to make instances of a data class immutable during runtime. However, there’s a small caveat—instantiating a frozen data class is slightly slower than a non-frozen one. This is because, when you enable frozen=True, Python has to generate __setattr__ and __delattr__ methods during class definition time and invoke them for each instantiation. Below is a quick benchmark comparing the instantiation times of a mutable dataclass and a frozen one (in Python 3.12): ...

Debugging dockerized Python apps in VSCode

Despite using VSCode as my primary editor, I never really bothered to set up the native debugger to step through application code running inside Docker containers. Configuring the debugger to work with individual files, libraries, or natively running servers is trivial1. So, I use it in those cases and just resort back to my terminal for debugging containerized apps running locally. However, after seeing a colleague’s workflow in a pair-programming session, I wanted to configure the debugger to cover this scenario too. ...

Banish state-mutating methods from data classes

Data classes are containers for your data—not behavior. The delineation is right there in the name. Yet, I see state-mutating methods getting crammed into data classes and polluting their semantics all the time. While this text will primarily talk about data classes in Python, the message remains valid for any language that supports data classes and allows you to add state-mutating methods to them, e.g., Kotlin, Swift, etc. By state-mutating method, I mean methods that change attribute values during runtime. For instance: ...

Taming conditionals with bitmasks

The 100k context window of Claude 21 has been a huge boon for me since now I can paste a moderately complex problem to the chat window and ask questions about it. In that spirit, it recently refactored some pretty gnarly conditional logic for me in such an elegant manner that it absolutely blew me away. Now, I know how bitmasks2 work and am aware of the existence of enum.Flag3 in Python. However, it never crossed my mind that flags can be leveraged to trim conditional branches in such a clever manner that Claude illustrated. But once I looked at the proposed solution, the whole thing immediately clicked for me. ...

Memory leakage in Python descriptors

Unless I’m hand rolling my own ORM-like feature or validation logic, I rarely need to write custom descriptors in Python. The built-in descriptor magics like @classmethod, @property, @staticmethod, and vanilla instance methods usually get the job done. However, every time I need to dig my teeth into descriptors, I reach for this fantastic how to1 guide by Raymond Hettinger. You should definitely set aside the time to read it if you haven’t already. It has helped me immensely to deepen my understanding of how many of the fundamental language constructs are wired together underneath. ...

Unix-style pipelining with Python's subprocess module

Python offers a ton of ways like os.system or os.spawn* to create new processes and run arbitrary commands in your system. However, the documentation usually encourages you to use the subprocess1 module for creating and managing child processes. The subprocess module exposes a high-level run() function that provides a simple interface for running a subprocess and waiting for it to complete. It accepts the command to run as a list of strings, starts the subprocess, waits for it to finish, and then returns a CompletedProcess object with information about the result. For example: ...

Enabling repeatable lazy iterations in Python

The current title of this post is probably incorrect and may even be misleading. I had a hard time coming up with a suitable name for it. But the idea goes like this: sometimes you might find yourself in a situation where you need to iterate through a generator more than once. Sure, you can use an iterable like a tuple or list to allow multiple iterations, but if the number of elements is large, that’ll cause an OOM error. On the other hand, once you’ve already consumed a generator, you’ll need to restart it if you want to go through it again. This behavior is common in pretty much every programming language that supports the generator construct. ...

Escaping the template pattern hellscape in Python

Over the years, I’ve used the template pattern1 across multiple OO languages with varying degrees of success. It was one of the first patterns I learned in the primordial hours of my software engineering career, and for some reason, it just feels like the natural way to tackle many real-world code-sharing problems. Yet, even before I jumped on board with the composition over inheritance2 camp, I couldn’t help but notice how using this particular inheritance technique spawns all sorts of design and maintenance headaches as the codebase starts to grow. ...

Python dependency management redux

One major drawback of Python’s huge ecosystem is the significant variances in workflows among people trying to accomplish different things. This holds true for dependency management as well. Depending on what you’re doing with Python—whether it’s building reusable libraries, writing web apps, or diving into data science and machine learning—your workflow can look completely different from someone else’s. That being said, my usual approach to any development process is to pick a method and give it a shot to see if it works for my specific needs. Once a process works, I usually automate it and rarely revisit it unless something breaks. ...