Structured concurrency & Go

· 14 min

At my workplace, a lot of folks are coming to Go from Python and Kotlin. Both languages have structured concurrency built into their async runtimes, and people are often surprised that Go doesn’t. The go statement just launches a goroutine and walks away. There’s no scope that waits for it, no automatic cancellation if the parent dies, no built-in way to collect its errors.

This post looks at where the idea of structured concurrency comes from, what it looks like in Python and Kotlin, and how you get the same behavior in Go using errgroup, WaitGroup, and context.
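For contrast, here’s roughly what the built-in version looks like in Python with asyncio.TaskGroup (3.11+) - a minimal sketch of mine, not code from the post: the async with block is the scope, it waits for every child, and one failure cancels the siblings.

import asyncio


async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return name


async def main() -> None:
    # The scope won't exit until every child task finishes; if one
    # raises, the remaining tasks are cancelled automatically.
    async with asyncio.TaskGroup() as tg:
        t1 = tg.create_task(fetch("a", 0.1))
        t2 = tg.create_task(fetch("b", 0.2))
    print(t1.result(), t2.result())


asyncio.run(main())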

Hierarchical rate limiting with Redis sorted sets

· 8 min

Recently at work, we ran into this problem:

We needed to send Slack notifications for specific events but had to enforce rate limits to avoid overwhelming the channel. Here’s how the limits worked:

  • Global limit: Max 100 requests every 30 minutes.
  • Category limit: Each event type (e.g., errors, warnings) capped at 10 requests per 30 minutes.

Now, imagine this:

  1. There are 20 event types.
  2. Each type hits its 10-notification limit in 30 minutes.
  3. That’s 200 requests total, but the global limit only allows 100. So, 100 requests must be dropped - even if some event types still have room under their individual caps.

This created a hierarchy of limits: every notification had to clear both its own category cap and the shared global cap.
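Here’s a minimal sketch of how such a sliding-window check might look with redis-py and sorted sets. This is my illustration rather than the implementation from the post: the key names are made up, and a production version would need to make the category and global checks atomic (e.g., with a Lua script) so a notification rejected by the global cap doesn’t still consume category quota.

import time
import uuid

import redis


def allow(r: redis.Redis, key: str, limit: int, window_s: int) -> bool:
    now = time.time()
    pipe = r.pipeline()
    # Evict entries that slid out of the window, then count the rest
    pipe.zremrangebyscore(key, 0, now - window_s)
    pipe.zcard(key)
    _, count = pipe.execute()
    if count >= limit:
        return False
    r.zadd(key, {str(uuid.uuid4()): now})
    return True


def allow_notification(r: redis.Redis, category: str) -> bool:
    window = 30 * 60
    # A notification must clear both its category cap and the global cap
    return allow(r, f"rl:{category}", 10, window) and allow(r, "rl:global", 100, window)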

Running only a single instance of a process

· 4 min

I’ve been having a ton of fun fiddling with Tailscale over the past few days. While setting it up on a server, I came across this ufw firewall script that configures the firewall on Linux to ensure direct communication across different nodes in my tailnet. It has the following block of code that I found interesting (added comments for clarity):

#!/usr/bin/env bash

# Define PID file path using script's name for uniqueness
PIDFILE="/tmp/$(basename "${BASH_SOURCE[0]%.*}.pid")"

# Open file descriptor 200 for the PID file
exec 200>"${PIDFILE}"

# Try to acquire a non-blocking lock; exit if the script is already running
flock -n 200 \
    || {
        echo "${BASH_SOURCE[0]} already running. Aborting..."; exit 1;
    }

# Store the current process ID (PID) in the lock file for reference
PID=$$
echo "${PID}" 1>&200

# Do work (in the original script, real work happens here)
sleep 999

Here, flock is a Linux command that ensures only one instance of the script runs at a time by locking a specified file (e.g., PIDFILE) through a file descriptor (e.g., 200). If another process already holds the lock, flock either waits for it to be released or, with the -n flag, fails immediately. Above, it bails with an error message and exit code 1.
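The same trick ports to Python through the fcntl module, which wraps the same flock(2) syscall. A quick sketch, assuming a hypothetical /tmp/myscript.pid lock file:

import fcntl
import os
import sys
import time

# Keep the file object alive for the whole run; the lock is released
# when the descriptor is closed, e.g. at process exit.
lockfile = open("/tmp/myscript.pid", "a+")
try:
    fcntl.flock(lockfile.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    print("already running, aborting", file=sys.stderr)
    sys.exit(1)

# Store the current PID in the lock file for reference
lockfile.seek(0)
lockfile.truncate()
lockfile.write(f"{os.getpid()}\n")
lockfile.flush()

# Do work
time.sleep(999)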

Injecting Pytest fixtures without cluttering test signatures

· 2 min

Sometimes, when writing tests in Pytest, I find myself using fixtures that the test function/method doesn’t directly reference. Instead, Pytest runs the fixture, and the test function implicitly leverages its side effects. For example:

import os
from collections.abc import Iterator
from unittest.mock import Mock, patch
import pytest


# Define an implicit environment mock fixture that patches os.environ
@pytest.fixture
def mock_env() -> Iterator[None]:
    with patch.dict("os.environ", {"IMPLICIT_KEY": "IMPLICIT_VALUE"}):
        yield


# Define an explicit service mock fixture
@pytest.fixture
def mock_svc() -> Mock:
    service = Mock()
    service.process.return_value = "Explicit Mocked Response"
    return service


# IDEs tend to dim out unused parameters like mock_env
def test_stuff(mock_svc: Mock, mock_env: None) -> None:
    # Use the explicit mock
    response = mock_svc.process()
    assert response == "Explicit Mocked Response"
    mock_svc.process.assert_called_once()

    # Assert the environment variable patched by mock_env
    assert os.environ["IMPLICIT_KEY"] == "IMPLICIT_VALUE"

In the test_stuff function above, we directly use the mock_svc fixture but not mock_env. Instead, we expect Pytest to run mock_env, which modifies the environment variables. This works, but IDEs often mark mock_env as an unused parameter and dim it out.
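One way out - presumably where the post is heading, given the title - is pytest’s usefixtures marker, which requests the fixture by name without adding a parameter:

# mock_env still runs, but no longer clutters the signature
@pytest.mark.usefixtures("mock_env")
def test_stuff(mock_svc: Mock) -> None:
    assert mock_svc.process() == "Explicit Mocked Response"
    assert os.environ["IMPLICIT_KEY"] == "IMPLICIT_VALUE"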

Explicit method overriding with @typing.override

· 2 min

Although I’ve been using Python 3.12 in production for nearly a year, one neat feature in the typing module that escaped me was the @override decorator. Proposed in PEP 698, it’s been hanging out in typing_extensions for a while. This is one of those small features you either don’t care about or get totally psyched over. I’m definitely in the latter camp.

In languages like C# and Kotlin, explicit overriding is required, and Java strongly encourages it: you use @Override to make it clear you’re overriding a method in a subclass, and if you mess up the method name or the method doesn’t exist in the superclass, the compiler throws an error. Now, with Python’s @override decorator, we get similar benefits - though only if you’re using a static type checker.
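A minimal sketch of the decorator in action (the class names here are mine):

from typing import override


class Animal:
    def speak(self) -> str:
        return "..."


class Cat(Animal):
    @override
    def speak(self) -> str:  # OK: Animal.speak exists
        return "meow"

    @override
    def speek(self) -> str:  # typo: a type checker flags this; the runtime won't
        return "meow"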

Quicker startup with module-level __getattr__

· 4 min

This morning, someone on Twitter pointed me to PEP 562, which introduces __getattr__ and __dir__ at the module level. While __dir__ helps control which attributes are printed when calling dir(module), __getattr__ is the more interesting addition.

The __getattr__ method in a module works similarly to how it does in a Python class. For example:

class Cat:
    def __getattr__(self, name: str) -> str:
        if name == "voice":
            return "meow!!"
        raise AttributeError(f"Attribute {name} does not exist")


# Try to access 'voice' on Cat
cat = Cat()
cat.voice  # Returns "meow!!"

# Raises AttributeError: Attribute something_else does not exist
cat.something_else

In this class, __getattr__ defines what happens when an attribute isn’t found through normal lookup, letting you control how missing attributes behave. Since Python 3.7, you can also define __getattr__ at the module level to handle attribute access on the module itself.
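Here’s a minimal sketch of the module-level flavor that defers an expensive import until first access. The module name lazy.py and the json stand-in are my choices for illustration:

# lazy.py
from types import ModuleType


def __getattr__(name: str) -> ModuleType:
    if name == "heavy":
        # Stand-in for a genuinely expensive import; this body only
        # runs when someone actually touches lazy.heavy
        import json

        return json
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

Importing lazy stays cheap; the import cost is paid only when lazy.heavy is first accessed.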

Shades of testing HTTP requests in Python

· 5 min

Here’s a Python snippet that makes an HTTP POST request:

# script.py

from typing import Any

import httpx


async def make_request(url: str) -> dict[str, Any]:
    headers = {"Content-Type": "application/json"}

    async with httpx.AsyncClient(headers=headers) as client:
        response = await client.post(
            url,
            json={"key_1": "value_1", "key_2": "value_2"},
        )
        return response.json()

The function make_request makes an async HTTP request with the HTTPX library. Running this with asyncio.run(make_request("https://httpbin.org/post")) gives us the following output:

{
  "args": {},
  "data": "{\"key_1\": \"value_1\", \"key_2\": \"value_2\"}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "40",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "python-httpx/0.27.2",
    "X-Amzn-Trace-Id": "Root=1-66d5f7b0-2ed0ddc57241f0960f28bc91"
  },
  "json": {
    "key_1": "value_1",
    "key_2": "value_2"
  },
  "origin": "95.90.238.240",
  "url": "https://httpbin.org/post"
}

We’re only interested in the json field and want to assert in our test that making the HTTP call returns the expected values.
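One shade of testing this is to have the function accept an optional transport so that httpx.MockTransport can intercept the request without touching the network. The transport parameter below is my modification to the original snippet, shown as a sketch:

import asyncio
from typing import Any

import httpx


async def make_request(
    url: str, transport: httpx.AsyncBaseTransport | None = None
) -> dict[str, Any]:
    headers = {"Content-Type": "application/json"}
    async with httpx.AsyncClient(headers=headers, transport=transport) as client:
        response = await client.post(
            url,
            json={"key_1": "value_1", "key_2": "value_2"},
        )
        return response.json()


def test_make_request() -> None:
    def handler(request: httpx.Request) -> httpx.Response:
        # Echo the payload back under "json", the way httpbin does
        return httpx.Response(
            200, json={"json": {"key_1": "value_1", "key_2": "value_2"}}
        )

    result = asyncio.run(
        make_request("https://httpbin.org/post", transport=httpx.MockTransport(handler))
    )
    assert result["json"] == {"key_1": "value_1", "key_2": "value_2"}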

Taming parametrize with pytest.param

· 4 min

I love pytest.mark.parametrize - so much so that I sometimes shoehorn my tests to fit into it. But the default style of writing tests with parametrize can quickly turn into an unreadable mess as the test complexity grows. For example:

from math import atan2

import pytest


def polarify(x: float, y: float) -> tuple[float, float]:
    r = (x**2 + y**2) ** 0.5
    theta = atan2(y, x)
    return r, theta


@pytest.mark.parametrize(
    "x, y, expected",
    [
        (0, 0, (0, 0)),
        (1, 0, (1, 0)),
        (0, 1, (1, 1.5707963267948966)),
        (1, 1, (2**0.5, 0.7853981633974483)),
        (-1, -1, (2**0.5, -2.356194490192345)),
    ],
)
def test_polarify(x: float, y: float, expected: tuple[float, float]) -> None:
    # pytest.approx helps us ignore floating point discrepancies
    assert polarify(x, y) == pytest.approx(expected)

The polarify function converts Cartesian coordinates to polar coordinates. We’re using @pytest.mark.parametrize in its standard form to test different conditions.
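Wrapping each tuple in pytest.param is the first step toward taming this, since every case can then carry its own id (the ids below are my labels):

@pytest.mark.parametrize(
    "x, y, expected",
    [
        pytest.param(0, 0, (0, 0), id="origin"),
        pytest.param(1, 0, (1, 0), id="positive-x-axis"),
        pytest.param(1, 1, (2**0.5, 0.7853981633974483), id="first-quadrant"),
        pytest.param(-1, -1, (2**0.5, -2.356194490192345), id="third-quadrant"),
    ],
)
def test_polarify(x: float, y: float, expected: tuple[float, float]) -> None:
    assert polarify(x, y) == pytest.approx(expected)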

Log context propagation in Python ASGI apps

· 7 min

Let’s say you have a web app that emits log messages from different layers. Your log shipper collects and sends these messages to a destination like Datadog where you can query them. One common requirement is to tag the log messages with some common attributes, which you can use later to query them.

In distributed tracing, this tagging is usually known as context propagation: contextual information gets attached to each log message so that related messages can be correlated and queried later. However, if you had to collect the context at each layer of your application and pass it manually to the downstream ones, the whole process would be quite painful.
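In asyncio-land, the usual escape hatch is contextvars: an ASGI middleware seeds the context once per request, and a logging filter reads it back as each record is emitted, so no layer passes anything by hand. A rough sketch (all the names here are mine, not necessarily the post’s):

import logging
from contextvars import ContextVar

request_id: ContextVar[str] = ContextVar("request_id", default="-")


class ContextFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Stamp every record with whatever the current context holds
        record.request_id = request_id.get()
        return True


class RequestIdMiddleware:
    # Bare-bones ASGI middleware that seeds the context per request
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers", []))
            request_id.set(headers.get(b"x-request-id", b"-").decode())
        await self.app(scope, receive, send)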

Please don't hijack my Python root logger

· 5 min

With the recent explosion of LLM tools, I often like to kill time fiddling with different LLM client libraries and SDKs in one-off scripts. Lately, I’ve noticed that some newer tools frequently mess up the logger settings, meddling with my application logs. While it’s less common in more seasoned libraries, I guess it’s worth rehashing why hijacking the root logger isn’t a good idea when writing libraries or other forms of reusable code.
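The gist of the etiquette, condensed into a sketch (mylib is a placeholder name): log to your own named logger, attach a NullHandler so nothing leaks when the application hasn’t configured logging, and never call logging.basicConfig or touch the root logger’s handlers from library code.

import logging

# Library code: use a named logger, leave the root logger alone
logger = logging.getLogger("mylib")
# NullHandler keeps records quiet until the application configures logging
logger.addHandler(logging.NullHandler())


def do_work() -> None:
    logger.debug("doing work")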