Safer 'operator.itemgetter' in Python

Python’s operator.itemgetter is quite versatile. It works on pretty much any sequence or mapping (anything that supports subscripting) and lets you fetch elements from it. The following snippet shows how you can use it to sort a list of tuples by the first element of the tuple: In [2]: from operator import itemgetter ...: ...: l = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)] ...: l_sorted = sorted(l, key=itemgetter(0)) In [3]: l_sorted Out[3]: [(0, 55), (1, 3), (4, 8), (6, 7), (10, 9)] Here, the itemgetter callable is doing the work of selecting the first element of every tuple inside the list and then the sorted function is using those values to sort the elements....
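As a minimal sketch of the same idea beyond a single index: itemgetter also accepts several keys at once and works on mappings. The records and field names below are made up for illustration.

```python
from operator import itemgetter

pairs = [(10, 9), (1, 3), (4, 8), (0, 55), (6, 7)]

# Sort by the second element of each tuple instead of the first.
by_second = sorted(pairs, key=itemgetter(1))

# With several indexes, itemgetter returns a tuple of the selected items.
second_then_first = itemgetter(1, 0)
swapped = [second_then_first(p) for p in pairs]

# It also works on mappings; these records are hypothetical.
records = [{"name": "b", "rank": 2}, {"name": "a", "rank": 1}]
by_rank = sorted(records, key=itemgetter("rank"))

# A missing index still raises IndexError.
error = None
try:
    itemgetter(5)((1, 2))
except IndexError as exc:
    error = exc
```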

June 16, 2022

Guard clause and exhaustiveness checking

Nested conditionals suck. They’re hard to write and even harder to read. I’ve rarely regretted the time I’ve spent optimizing for the flattest conditional structure in my code. The following piece mimics the actions of a traffic signal: // src.ts enum Signal { YELLOW = "Yellow", RED = "Red", GREEN = "Green", } function processSignal(signal: Signal) :void { if (signal === Signal.YELLOW) { console.log("Slow down!"); } else { if (signal === Signal....

May 22, 2022

Return JSON error payload instead of HTML text in DRF

At my workplace, we have a large Django monolith that powers the main website and works as the primary REST API server at the same time. We use Django REST Framework (DRF) to build and serve the API endpoints. This means that whenever there’s an error, we have to return differently formatted error responses to website users and API users, depending on the incoming request headers. The default DRF configuration returns a JSON response when the system experiences an HTTP 400 (bad request) error....

April 13, 2022

Decoupling producers and consumers of iterables with generators in Python

Generators can help you decouple the production and consumption of iterables, making your code more readable and maintainable. I learned this trick a few years back from David Beazley’s slides on generators. Consider this example: # src.py from __future__ import annotations import time from typing import NoReturn def infinite_counter(start: int, step: int) -> NoReturn: i = start while True: time.sleep(1) # Not to flood stdout print(i) i += step infinite_counter(1, 2) # Prints # 1 # 3 # 5 # ....
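A hedged sketch of the decoupling described above: the producer becomes a generator that only yields values, and the consumer decides how many to take and what to do with them. The name counter and the use of itertools.islice are choices made here for illustration, not taken from the original post.

```python
from __future__ import annotations

from collections.abc import Iterator
from itertools import islice


def counter(start: int, step: int) -> Iterator[int]:
    """Produce an endless arithmetic sequence without printing or sleeping."""
    i = start
    while True:
        yield i
        i += step


# The consumer controls how many values to take and what to do with them.
first_three = list(islice(counter(1, 2), 3))
print(first_three)
```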

April 3, 2022

Pre-allocated lists in Python

In CPython, elements of a list are stored as pointers to the elements rather than the values of the elements themselves. This is evident from the struct that represents a list in C: // Fetched from CPython main branch. Removed comments for brevity. typedef struct { PyObject_VAR_HEAD PyObject **ob_item; /* Pointer reference to the element. */ Py_ssize_t allocated; } PyListObject; An empty list builds a PyObject and occupies some memory: from sys import getsizeof l = [] print(getsizeof(l)) This returns:...
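To illustrate the idea in the title, here is a small sketch, assuming that pre-allocating means creating the list at its final length with a placeholder and then filling it by index. Exact byte counts from getsizeof vary by Python version and platform, so none are shown.

```python
from sys import getsizeof

n = 1_000

# Pre-allocate: the list is created once with n slots, all pointing at None.
preallocated = [None] * n
for i in range(n):
    preallocated[i] = i * i

# Grow dynamically: append may reserve extra slots each time the list resizes.
appended = []
for i in range(n):
    appended.append(i * i)

print(getsizeof(preallocated), getsizeof(appended))
```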

March 27, 2022

Disallow large file download from URLs in Python

I was working on a DRF POST API endpoint where the consumer is expected to add a URL containing a PDF file and the system would then download the file and save it to an S3 bucket. While this sounds quite straightforward, there’s one big issue. Before I started working on it, the core logic looked like this: # src.py from __future__ import annotations from urllib.request import urlopen import tempfile from shutil import copyfileobj def save_to_s3(src_url: str, dest_url: str) -> None: with tempfile....

March 23, 2022

Declaratively transform data class fields in Python

While writing microservices in Python, I like to declaratively define the shape of the data coming in and out of JSON APIs or NoSQL databases in a separate module. Both TypedDict and dataclass are fantastic tools to communicate the shape of the data with the next person working on the codebase. Whenever I need to do some processing on the data before starting to work with it, I prefer to transform the data via dataclasses....

March 20, 2022

Caching connection objects in Python

To avoid instantiating multiple DB connections in Python apps, a common approach is to initialize the connection objects in a module once and then import them everywhere. So, you’d do this: # src.py import boto3 # Pip install boto3 import redis # Pip install redis dynamo_client = boto3.client("dynamodb") redis_client = redis.Redis() However, this adds import-time side effects to your module and can turn out to be expensive. In search of a better solution, my first instinct was to go for functools....

March 16, 2022

How not to run a script in Python

When I first started working with Python, nothing stumped me more than how bizarre Python’s import system seemed to be. Oftentimes, I wanted to run a module inside of a package with the python src/sub/module.py command, and it’d throw an ImportError that didn’t make any sense. Consider this package structure: src ├── __init__.py ├── a.py └── sub ├── __init__.py └── b.py Let’s say you’re importing module a in module b:...

March 16, 2022

Mocking chained methods of datetime objects in Python

This is the 4th time in a row that I’ve wasted time figuring out how to mock out a function during testing that calls the chained methods of a datetime.datetime object in the function body. So I thought I’d document it here. Consider this function: # src.py from __future__ import annotations import datetime def get_utcnow_isoformat() -> str: """Get UTCnow as an isoformat compliant string.""" return datetime.datetime.utcnow().isoformat() How would you test it? Mocking out datetime....

March 16, 2022

Declarative payloads with TypedDict in Python

While working with microservices in Python, a common pattern I see is the use of dynamically filled dictionaries as payloads of REST APIs or message queues. To understand what I mean by this, consider the following example: # src.py from __future__ import annotations import json from typing import Any import redis # Do a pip install. def get_payload() -> dict[str, Any]: """Get the 'zoo' payload containing animal names and attributes.""" payload = {"name": "awesome_zoo", "animals": []} names = ("wolf", "snake", "ostrich") attributes = ( {"family": "Canidae", "genus": "Canis", "is_mammal": True}, {"family": "Viperidae", "genus": "Boas", "is_mammal": False}, ) for name, attr in zip(names, attributes): payload["animals"]....
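A minimal sketch of how a payload like this could be declared up front with TypedDict. The class names Attribute, Animal, and Zoo, and the nesting under an "attribute" key, are assumptions made for illustration; the excerpt above is cut off before it shows how each animal entry is built.

```python
from __future__ import annotations

from typing import TypedDict


class Attribute(TypedDict):
    family: str
    genus: str
    is_mammal: bool


class Animal(TypedDict):
    name: str
    attribute: Attribute  # Hypothetical key; the original structure is not shown.


class Zoo(TypedDict):
    name: str
    animals: list[Animal]


def get_payload() -> Zoo:
    """Build a payload whose shape a type checker can verify."""
    wolf: Animal = {
        "name": "wolf",
        "attribute": {"family": "Canidae", "genus": "Canis", "is_mammal": True},
    }
    return {"name": "awesome_zoo", "animals": [wolf]}


payload = get_payload()
print(payload["animals"][0]["attribute"]["genus"])
```

At runtime a TypedDict is a plain dict, so the benefit is entirely on the static-checking side.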

March 11, 2022

Parametrized fixtures in pytest

While most of my pytest fixtures don’t react to the dynamically-passed values of function parameters, there have been situations where I’ve definitely felt the need for that. Consider this example: # test_src.py import pytest @pytest.fixture def create_file(tmp_path): """Fixture to create a file in the tmp_path/tmp directory.""" directory = tmp_path / "tmp" directory.mkdir() file = directory / "foo.md" # The filename is hardcoded here! yield directory, file def test_file_creation(create_file): """Check the fixture....

March 10, 2022

Modify iterables while iterating in Python

If you try to mutate a sequence while traversing through it, Python usually doesn’t complain. For example: # src.py l = [3, 4, 56, 7, 10, 9, 6, 5] for i in l: if not i % 2 == 0: continue l.remove(i) print(l) The above snippet iterates through a list of numbers and modifies the list l in-place to remove any even number. However, running the script prints out this:...
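One common way to sidestep the problem, sketched here under the assumption that the goal is simply to drop the even numbers, is to build a new list rather than removing items from the one being traversed.

```python
l = [3, 4, 56, 7, 10, 9, 6, 5]

# Build a new list that keeps only the odd numbers.
odds = [i for i in l if i % 2 != 0]
print(odds)

# To update the original list object in place, assign to a full slice.
l[:] = odds
print(l)
```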

March 4, 2022

GitHub Actions template for Python-based projects

Five traits that almost all the GitHub Actions workflows in my Python projects share are: (1) if a new workflow is triggered while a previous one is still running, the previous one gets canceled; (2) the CI is triggered every day at UTC 1; (3) tests and lint checkers run on Ubuntu and macOS against multiple Python versions; (4) pip dependencies are cached; (5) dependencies, including the Actions dependencies, are automatically updated via Dependabot. I use pip-tools for managing dependencies in applications and setuptools setup....

March 2, 2022

Self type in Python

PEP-673 introduces the Self type and it’s coming to Python 3.11. However, you can already use it via the typing_extensions module. The Self type makes annotating methods that return the instances of the corresponding classes trivial. Before this, you’d have to do some mental gymnastics to statically type situations as follows: # src.py from __future__ import annotations from typing import Any class Animal: def __init__(self, name: str, says: str) -> None: self....

February 28, 2022

Patching test dependencies via pytest fixture & unittest mock

In Python, even though I adore writing tests in a functional manner via pytest, I still have a soft spot for the tools provided in the unittest.mock module. I like the fact that it’s baked into the standard library and is quite flexible. Moreover, I’m yet to see another mock library in any other language or in the Python ecosystem that allows you to mock your targets in such a terse, flexible, and maintainable fashion....

February 27, 2022

Narrowing types with TypeGuard in Python

Static type checkers like Mypy follow your code flow and statically try to figure out the types of the variables without you having to explicitly annotate inline expressions. For example: # src.py from __future__ import annotations def check(x: int | float) -> str: if not isinstance(x, int): reveal_type(x) # Type is now 'float'. else: reveal_type(x) # Type is now 'int'. return str(x) The reveal_type function is provided by Mypy and you don’t need to import it....

February 23, 2022

Why 'NoReturn' type exists in Python

Technically, the type of None in Python is NoneType. However, you’ll rarely see types.NoneType being used in the wild as the community has pretty much adopted None to denote the type of the None singleton. This usage is also documented in PEP-484. Whenever a callable doesn’t return anything, you usually annotate it as follows: # src.py from __future__ import annotations def abyss() -> None: return But sometimes a callable raises an exception and never gets the chance to return anything....
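A small sketch of the situation the last sentence describes, assuming a helper that always raises. The function names fail and parse_port are hypothetical.

```python
from __future__ import annotations

from typing import NoReturn


def fail(message: str) -> NoReturn:
    """Always raise, so control never returns to the caller."""
    raise RuntimeError(message)


def parse_port(raw: str) -> int:
    if raw.isdigit():
        return int(raw)
    # Because fail() is annotated NoReturn, a type checker does not
    # expect a return statement after this call.
    fail(f"invalid port: {raw!r}")
```

Annotating fail with None instead would make a checker flag parse_port for a missing return on that path.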

February 21, 2022

Add extra attributes to enum members in Python

While grokking the source code of the http.HTTPStatus class, I came across this technique to add extra attributes to the values of enum members. Now, to understand what I mean by adding attributes, let’s consider the following example: # src.py from __future__ import annotations from enum import Enum class Color(str, Enum): RED = "Red" GREEN = "Green" BLUE = "Blue" Here, I’ve inherited from str to ensure that the values of the enum members are strings....
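http.HTTPStatus attaches its extra attributes by overriding __new__ on the enum class. Below is a sketch of that approach applied to the Color enum; the hex_code attribute and its values are assumptions added for illustration.

```python
from __future__ import annotations

from enum import Enum


class Color(str, Enum):
    def __new__(cls, value: str, hex_code: str) -> Color:
        # Create the underlying str and record it as the member's value.
        obj = str.__new__(cls, value)
        obj._value_ = value
        # Extra attribute carried by each member (hypothetical).
        obj.hex_code = hex_code
        return obj

    RED = ("Red", "#ff0000")
    GREEN = ("Green", "#00ff00")
    BLUE = ("Blue", "#0000ff")


print(Color.RED.value, Color.RED.hex_code)
```

Lookup by value still works with the first tuple element only, so Color("Red") returns Color.RED.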

February 17, 2022

Peeking into the internals of Python's 'functools.wraps' decorator

The functools.wraps decorator allows you to keep your function’s identity intact after it’s been wrapped by a decorator. Whenever a function is wrapped by a decorator, its identity properties, such as its name, docstring, and annotations, get replaced by those of the wrapper function. Consider this example: from __future__ import annotations # In < Python 3.9, import this from the typing module. from collections.abc import Callable from typing import Any def log(func: Callable) -> Callable: def wrapper(*args: Any, **kwargs: Any) -> Any: """Internal wrapper....
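The difference can be seen side by side in a short sketch. The decorator names log_plain and log_wrapped and the functions add and sub are made up for this comparison.

```python
from __future__ import annotations

from collections.abc import Callable
from functools import wraps
from typing import Any


def log_plain(func: Callable) -> Callable:
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        """Internal wrapper."""
        return func(*args, **kwargs)

    return wrapper


def log_wrapped(func: Callable) -> Callable:
    @wraps(func)  # Copies name, docstring, and other metadata from func.
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        """Internal wrapper."""
        return func(*args, **kwargs)

    return wrapper


@log_plain
def add(x: int, y: int) -> int:
    """Add two numbers."""
    return x + y


@log_wrapped
def sub(x: int, y: int) -> int:
    """Subtract y from x."""
    return x - y


print(add.__name__, add.__doc__)  # Metadata of the inner wrapper.
print(sub.__name__, sub.__doc__)  # Metadata of the original function.
```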

February 14, 2022