Explicit method overriding with @typing.override
Although I’ve been using Python 3.12 in production for nearly a year, one neat feature in the typing module that escaped me was the @override decorator. Proposed in PEP 698, it’s been hanging out in typing_extensions for a while. This is one of those small features you either don’t care about or get totally psyched over. I’m definitely in the latter camp. In languages like C#, Java, and Kotlin, explicit overriding is required. For instance, in Java, you use @Override to make it clear you’re overriding a method in a subclass. If you mess up the method name or if the method doesn’t exist in the superclass, the compiler throws an error. Now, with Python’s @override decorator, we get similar benefits, though only if you’re using a static type checker. ...
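A minimal sketch of the idea, assuming made-up Animal, Dog, and Cat classes that are not from the post. The import fallback is also an assumption, added only so the snippet runs on interpreters older than 3.12. At runtime the decorator does nothing visible; the misspelled method is only reported by a static type checker such as mypy or pyright.

```python
try:
    from typing import override  # available in Python 3.12+
except ImportError:
    try:
        from typing_extensions import override  # backport for older versions
    except ImportError:
        def override(method):  # no-op stand-in so the sketch still runs
            return method


class Animal:
    def speak(self) -> str:
        return "..."


class Dog(Animal):
    @override
    def speak(self) -> str:  # correctly overrides Animal.speak
        return "woof"


class Cat(Animal):
    @override
    def spaek(self) -> str:  # typo: a type checker flags this, the runtime does not
        return "meow"


print(Dog().speak())  # the override is used
print(Cat().speak())  # falls back to Animal.speak because of the typo
```

Without the decorator, the typo in Cat would silently define a brand-new method, which is exactly the class of bug explicit overriding is meant to catch.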
Quicker startup with module-level __getattr__
This morning, someone on Twitter pointed me to PEP 562, which introduces __getattr__ and __dir__ at the module level. While __dir__ helps control which attributes are listed when calling dir(module), __getattr__ is the more interesting addition. The __getattr__ method in a module works similarly to how it does in a Python class. For example:

    class Cat:
        def __getattr__(self, name: str) -> str:
            if name == "voice":
                return "meow!!"
            raise AttributeError(f"Attribute {name} does not exist")


    # Try to access 'voice' on Cat
    cat = Cat()
    cat.voice  # Returns "meow!!"

    # Raises AttributeError: Attribute something_else does not exist
    cat.something_else

In this class, __getattr__ defines what happens when specific attributes are accessed, allowing you to manage how missing attributes behave. Since Python 3.7, you can also define __getattr__ at the module level to handle attribute access on the module itself. ...
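Here is a rough sketch of how a module-level __getattr__ can defer an import until first use. To keep it self-contained, the module is built in memory with types.ModuleType instead of living in its own file; the names lazy_mod and json_tools are hypothetical and only serve the illustration. In a real package you would simply define a function called __getattr__ at the top level of the module.

```python
import importlib
import types

# Stand-in for a real module file; "lazy_mod" is a made-up name.
lazy_mod = types.ModuleType("lazy_mod")


def _module_getattr(name: str):
    # Called only when normal attribute lookup on the module fails.
    if name == "json_tools":
        value = importlib.import_module("json")  # import deferred until first access
        setattr(lazy_mod, name, value)  # cache so later lookups skip this hook
        return value
    raise AttributeError(f"module 'lazy_mod' has no attribute {name!r}")


# Equivalent to writing `def __getattr__(name): ...` at the top of the module file.
lazy_mod.__getattr__ = _module_getattr

print(lazy_mod.json_tools.dumps({"a": 1}))
```

Because the hook only fires when regular lookup fails, caching the result with setattr means the import cost is paid once, on first access, rather than at startup.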
Docker mount revisited
I always get tripped up by Docker’s different mount types and their syntax, whether I’m stringing together some CLI commands or writing a docker-compose file. Docker’s docs cover these, but for me, the confusion often comes from how “bind” is used in various contexts and how “volume” and “bind” sometimes get mixed up in the documentation. Here’s my attempt to disentangle some of my most-used mount commands.

Volume mounts

Volume mounts let you store data outside the container in a location managed by Docker. The data persists even after the container stops. On non-Linux systems, volume mounts are faster than bind mounts because data doesn’t need to cross the virtualization boundary. ...
Topological sort
I was fiddling with graphlib in the Python stdlib and found it quite nifty. It processes a Directed Acyclic Graph (DAG), where tasks (nodes) are connected by directed edges (dependencies), and returns the correct execution order. The “acyclic” part ensures no circular dependencies. Topological sorting is useful for arranging tasks so that each one follows its dependencies. It’s widely used in scheduling, build systems, dependency resolution, and database migrations. For example, consider these tasks: ...
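As a small illustration, here is a sketch using graphlib.TopologicalSorter with hypothetical task names (fetch, build, test, deploy), not the ones from the post. Each key maps a task to the set of tasks it depends on.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of tasks that must finish before it (made-up names).
graph = {
    "deploy": {"test"},
    "test": {"build"},
    "build": {"fetch"},
    "fetch": set(),
}

# static_order() yields the tasks so that every dependency comes first.
order = list(TopologicalSorter(graph).static_order())
print(order)

# A circular dependency is rejected with CycleError.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
except CycleError:
    cycle_detected = True
else:
    cycle_detected = False
print(cycle_detected)
```

This graph is a single chain, so only one valid order exists. When several orders are valid, static_order() returns just one of them.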
Writing a circuit breaker in Go
Besides retries, circuit breakers are probably one of the most commonly employed resilience patterns in distributed systems. While writing a retry routine is pretty simple, implementing a circuit breaker needs a little bit of work. I realized that I usually just go for off-the-shelf libraries for circuit breaking and haven’t written one from scratch before. So, this is an attempt to create a sloppy one in Go. I picked Go instead of Python because I didn’t want to deal with sync-async idiosyncrasies or abstract things away under a soup of decorators. ...
Discovering direnv
I’m not really a fan of shims—code that automatically performs actions as a side effect or intercepts commands when you use the shell or when a prompt runs. That’s why, other than the occasional dabbling, I’ve mostly stayed away from tools like asdf or pyenv and instead stick to apt or brew for managing my binary installs, depending on the OS. Recently, though, I’ve started seeing many people I admire extolling direnv: ...
Notes on building event-driven systems
I spent the evening watching this incredibly grokkable talk on event-driven services by James Eastham at NDC London 2024. Below is a cleaned-up version of my notes. I highly recommend watching the full talk if you’re interested before reading this distillation.

The curse of tightly coupled microservices

Microservices often start with HTTP-based request-response communication, which seems straightforward but quickly becomes a pain as systems grow. Coupling, where one service depends on another, creates a few issues. Take the order processing service in a fictional Plant-Based Pizza company. It has to talk to the pickup service, delivery service, kitchen, and loyalty point service. They’re all tied together, so if one fails, the whole system could go down. ...
Bash namerefs for dynamic variable referencing
While going through a script at work today, I came across Bash’s nameref feature. It uses declare -n ref="$1" to set up a variable that allows you to reference another variable by name, kind of like pass-by-reference in C. I’m pretty sure I’ve seen it before, but I probably just skimmed over it. As I dug into the man page, I realized there’s a gap in my understanding of how variable references actually work in Bash, probably because I never gave it proper attention and just got by cobbling together scripts. ...
Behind the blog
When I started writing here about five years ago, I made a promise to myself that I wouldn’t give in to the trend of starting a blog, adding one overly enthusiastic entry about the stack behind it, and then vanishing into the ether. I was somewhat successful at that and wanted to write something I can link to when people are curious about the machinery that drives this site. The good thing is that the tech stack is simple and has remained stable over the years since I’ve only made changes when absolutely necessary. ...
Shell redirection syntax soup
I always struggle with the syntax for redirecting multiple streams to another command or a file. LLMs do help, but beyond the most obvious cases, it takes a few prompts to get the syntax right. When I know exactly what I’m after, scanning a quick post is much faster than wrestling with a non-deterministic kraken. So, here’s a list of the redirection and piping syntax I use the most, with real examples. ...
Shades of testing HTTP requests in Python
Here’s a Python snippet that makes an HTTP POST request:

    # script.py
    import httpx
    from typing import Any


    async def make_request(url: str) -> dict[str, Any]:
        headers = {"Content-Type": "application/json"}

        async with httpx.AsyncClient(headers=headers) as client:
            response = await client.post(
                url,
                json={"key_1": "value_1", "key_2": "value_2"},
            )
            return response.json()

The function make_request makes an async HTTP request with the httpx library. Running this with asyncio.run(make_request("https://httpbin.org/post")) gives us the following output:

    {
      "args": {},
      "data": "{\"key_1\": \"value_1\", \"key_2\": \"value_2\"}",
      "files": {},
      "form": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "40",
        "Content-Type": "application/json",
        "Host": "httpbin.org",
        "User-Agent": "python-httpx/0.27.2",
        "X-Amzn-Trace-Id": "Root=1-66d5f7b0-2ed0ddc57241f0960f28bc91"
      },
      "json": {
        "key_1": "value_1",
        "key_2": "value_2"
      },
      "origin": "95.90.238.240",
      "url": "https://httpbin.org/post"
    }

We’re only interested in the json field and want to assert in our test that making the HTTP call returns the expected values. ...
Taming parametrize with pytest.param
I love @pytest.mark.parametrize, so much so that I sometimes shoehorn my tests to fit into it. But the default style of writing tests with parametrize can quickly turn into an unreadable mess as the test complexity grows. For example:

    import pytest
    from math import atan2


    def polarify(x: float, y: float) -> tuple[float, float]:
        r = (x**2 + y**2) ** 0.5
        theta = atan2(y, x)
        return r, theta


    @pytest.mark.parametrize(
        "x, y, expected",
        [
            (0, 0, (0, 0)),
            (1, 0, (1, 0)),
            (0, 1, (1, 1.5707963267948966)),
            (1, 1, (2**0.5, 0.7853981633974483)),
            (-1, -1, (2**0.5, -2.356194490192345)),
        ],
    )
    def test_polarify(x: float, y: float, expected: tuple[float, float]) -> None:
        # pytest.approx helps us ignore floating point discrepancies
        assert polarify(x, y) == pytest.approx(expected)

The polarify function converts Cartesian coordinates to polar coordinates. We’re using @pytest.mark.parametrize in its standard form to test different conditions. ...
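One way the same cases could be written with pytest.param is sketched below. The ids are names I made up for illustration, and the post itself may structure things differently. Each case becomes a self-describing entry, and the id shows up in the test report instead of an auto-generated label.

```python
from math import atan2, pi

import pytest


def polarify(x: float, y: float) -> tuple[float, float]:
    # Convert Cartesian (x, y) to polar (r, theta).
    r = (x**2 + y**2) ** 0.5
    theta = atan2(y, x)
    return r, theta


@pytest.mark.parametrize(
    "x, y, expected",
    [
        # The ids below are hypothetical labels chosen for readability.
        pytest.param(0, 0, (0, 0), id="origin"),
        pytest.param(1, 0, (1, 0), id="positive-x-axis"),
        pytest.param(0, 1, (1, pi / 2), id="positive-y-axis"),
        pytest.param(1, 1, (2**0.5, pi / 4), id="first-quadrant"),
        pytest.param(-1, -1, (2**0.5, -3 * pi / 4), id="third-quadrant"),
    ],
)
def test_polarify(x: float, y: float, expected: tuple[float, float]) -> None:
    # pytest.approx absorbs floating point discrepancies
    assert polarify(x, y) == pytest.approx(expected)
```

Running pytest -v on a file containing this lists cases such as test_polarify[origin], which makes a failing case easier to find than a numeric index.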
HTTP requests via /dev/tcp
I learned this neat Bash trick today where you can make a raw HTTP request using Bash’s /dev/tcp pseudo-device without using tools like curl or wget. This came in handy while writing a health check script that needed to make a TCP request to a service. The following script opens a TCP connection and makes a simple GET request to example.com:

    #!/bin/bash

    # Open a TCP connection to example.com on port 80 and assign file descriptor 3
    # The exec command keeps /dev/fd/3 open throughout the lifetime of the script
    # 3<> enables bidirectional read-write
    exec 3<>/dev/tcp/example.com/80

    # Send the HTTP GET request to the server
    # >& redirects stdout to /dev/fd/3
    echo -e "GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n" >&3

    # Read and print the server's response
    # <& redirects the output of /dev/fd/3 to cat
    cat <&3

    # Close the file descriptor, terminating the TCP connection
    exec 3>&-

Running this will print the response from the site to your console. ...
Log context propagation in Python ASGI apps
Let’s say you have a web app that emits log messages from different layers. Your log shipper collects and sends these messages to a destination like Datadog where you can query them. One common requirement is to tag the log messages with some common attributes, which you can use later to query them. In distributed tracing, this tagging is usually known as context propagation, where you attach contextual information to your log messages so you can filter on it later. However, if you have to collect the context at each layer of your application and pass it manually to the downstream ones, that’d make the whole process quite painful. ...
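One common way to avoid passing context by hand is to keep it in a contextvars.ContextVar and attach it to every record with a logging filter. The sketch below assumes that approach, which may not be the one the post takes. The names log_context, ContextFilter, handle_request, and service_layer are all hypothetical.

```python
import contextvars
import io
import logging

# Hypothetical per-request context; None means "no request in flight".
log_context = contextvars.ContextVar("log_context", default=None)


class ContextFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Copy the current context onto the record so the formatter can use it.
        ctx = log_context.get() or {}
        record.user_id = ctx.get("user_id", "-")
        return True


stream = io.StringIO()  # captured here so the output is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s user_id=%(user_id)s %(message)s"))
handler.addFilter(ContextFilter())

logger = logging.getLogger("app.demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False


def service_layer() -> None:
    # A downstream layer logs without knowing anything about the user.
    logger.info("charging card")


def handle_request(user_id: str) -> None:
    token = log_context.set({"user_id": user_id})  # set once at the edge
    try:
        service_layer()
    finally:
        log_context.reset(token)  # avoid leaking context into the next request


handle_request("42")
print(stream.getvalue(), end="")
```

Since context variables are tracked per asyncio task, a value set at the edge of a request should stay isolated from concurrent requests in an ASGI app.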
Please don't hijack my Python root logger
With the recent explosion of LLM tools, I often like to kill time fiddling with different LLM client libraries and SDKs in one-off scripts. Lately, I’ve noticed that some newer tools frequently mess up the logger settings, meddling with my application logs. While it’s less common in more seasoned libraries, I guess it’s worth rehashing why hijacking the root logger isn’t a good idea when writing libraries or other forms of reusable code. ...
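As a sketch of the well-behaved alternative, a library can log through its own named logger and attach only a NullHandler, leaving the root logger for the application to configure. The library name "mylib" and the do_work function are made up for illustration.

```python
import logging

# Snapshot the root logger so we can check that the library leaves it alone.
root = logging.getLogger()
handlers_before = list(root.handlers)
level_before = root.level

# --- hypothetical library code ("mylib" is a made-up name) ---
lib_logger = logging.getLogger("mylib")
lib_logger.addHandler(logging.NullHandler())  # no output unless the app opts in


def do_work() -> str:
    lib_logger.debug("starting work")  # goes nowhere until the app configures logging
    return "done"


# --- application code ---
result = do_work()
root_untouched = root.handlers == handlers_before and root.level == level_before
print(result, root_untouched)
```

The library avoids calling logging.basicConfig() and avoids adding handlers or changing levels on the root logger. Those decisions belong to whoever owns the application.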
The *nix install command
TIL about the install command on *nix systems. A quick GitHub search for the term brought up a ton of matches. I’m surprised I just found out about it now. Often, in shell scripts I need to:

- Create a directory hierarchy
- Copy a config or binary file to the new directory
- Set permissions on the file

It usually looks like this:

    # Create the directory hierarchy. The -p flag creates the parent directories
    # if they don't exist
    mkdir -p ~/.config/app

    # Copy the current config to the newly created directory. Here, conf already
    # exists in the current folder
    cp conf ~/.config/app/conf

    # Set the file permission
    chmod 755 ~/.config/app/conf

Turns out, the install command in GNU coreutils can do all that in one line: ...
Here-doc headache
I was working on the deployment pipeline for a service that launches an app in a dedicated VM using GitHub Actions. In the last step of the workflow, the CI SSHs into the VM and runs several commands using a here document in bash. The simplified version looks like this:

    # SSH into the remote machine and run a bunch of commands to deploy the service
    ssh $SSH_USER@$SSH_HOST <<EOF
    # Go to the work directory
    cd $WORK_DIR

    # Make a git pull
    git pull

    # Export environment variables required for the service to run
    export AUTH_TOKEN=$APP_AUTH_TOKEN

    # Start the service
    docker compose up -d --build
    EOF

The fully working version can be found on GitHub. ...
The sane pull request
One of the reasons why I’m a big advocate of rebasing and cleaning up feature branches, even when the changes get squash-merged to the mainline, is that it makes the PR reviewer’s life a little easier. I’ve written about my rebase workflow before and learned a few new things from the Hacker News discussion around it. While there’s been no shortage of text on why and how to craft atomic commits, I often find those discussions focus too much on VCS hygiene, and the main benefit gets lost in the minutiae. When working in a team setup, I’ve discovered that individual commits matter much less than the final change list. ...
I kind of like rebasing
People tend to get pretty passionate about Git workflows on different online forums. Some like to rebase, while others prefer to keep the disorganized records. Some dislike the extra merge commit, while others love to preserve all the historical artifacts. There’s merit to both sides of the discussion. That being said, I kind of like rebasing because I’m a messy committer who:

- Usually doesn’t care for keeping atomic commits.
- Creates a lot of short commits with messages like “fix” or “wip”.
- Likes to clean up the untidy commits before sending the branch for peer review.
- Prefers a linear history over a forked one so that git log --oneline --graph tells a nice story.

Git rebase allows me to squash my disordered commits into a neat little one, which bundles all the changes with passing tests and documentation. Sure, a similar result can be emulated using git merge --squash feat_branch or GitHub’s squash-merge feature, but to me, rebasing feels cleaner. Plus, over time, I’ve subconsciously picked up the tricks to work my way around rebase-related gotchas. ...
Protobuffed contracts
People typically associate Google’s Protocol Buffers with gRPC services, and rightfully so. But things often get confusing when discussing protobufs because the term can mean different things:

- A binary protocol for efficiently serializing structured data.
- A language used to specify how this data should be structured.

In gRPC services, you usually use both: the protobuf language in proto files defines the service interface, and then the clients use the same proto files to communicate with the services. ...