Redowan's Reflections

Modify iterables while iterating in Python

If you try to mutate a sequence while traversing through it, Python usually doesn’t complain. For example: # src.py l = [3, 4, 56, 7, 10, 9, 6, 5] for i in l: if not i % 2 == 0: continue l.remove(i) print(l) The above snippet iterates through a list of numbers and modifies the list l in-place to remove any even number. However, running the script prints out this: [3, 56, 7, 9, 5] Wait a minute! The output doesn’t look correct. The final list still contains 56 which is an even number. Why did it get skipped? Printing the members of the list while the for-loop advances reveal what’s happening inside: ...

Github action template for Python based projects

Five traits that almost all the GitHub Action workflows in my Python projects share are: If a new workflow is triggered while the previous one is running, the first one will get canceled. The CI is triggered every day at UTC 1. Tests and the lint-checkers are run on Ubuntu and MacOS against multiple Python versions. Pip dependencies are cached. Dependencies, including the Actions dependencies are automatically updated via dependabot1. I use pip-tools2 for managing dependencies in applications and setuptools3 setup.py combo for managing dependencies in libraries. Here’s an annotated version of the template action syntax: ...

Self type in Python

PEP-6731 introduces the Self type and it’s coming to Python 3.11. However, you can already use that now via the typing_extenstions2 module. The Self type makes annotating methods that return the instances of the corresponding classes trivial. Before this, you’d have to do some mental gymnastics to statically type situations as follows: # src.py from __future__ import annotations from typing import Any class Animal: def __init__(self, name: str, says: str) -> None: self.name = name self.says = says @classmethod def from_description(cls, description: str = "|") -> Animal: descr = description.split("|") return cls(descr[0], descr[1]) class Dog(Animal): def __init__(self, *args: Any, **kwargs: Any) -> None: super().__init__(*args, **kwargs) @property def legs(self) -> int: return 4 if __name__ == "__main__": dog = Dog.from_description("Matt | woof") print(dog.legs) # Mypy complains here! The class Animal has a from_description class method that acts as an additional constructor. It takes a description string, and then builds and returns an instance of the same class. The return type of the method is annotated as Animal here. However, doing this makes the child class Dog conflate its identity with the Animal class. If you execute the snippet, it won’t raise any runtime error. Also, Mypy will complain about the type: ...

Patching test dependencies via pytest fixture & unittest mock

In Python, even though I adore writing tests in a functional manner via pytest, I still have a soft corner for the tools provided in the unittest.mock module. I like the fact it’s baked into the standard library and is quite flexible. Moreover, I’m yet to see another mock library in any other language or in the Python ecosystem that allows you to mock your targets in such a terse, flexible, and maintainable fashion. ...

Narrowing types with TypeGuard in Python

Static type checkers like Mypy follow your code flow and statically try to figure out the types of the variables without you having to explicitly annotate inline expressions. For example: # src.py from __future__ import annotations def check(x: int | float) -> str: if not isinstance(x, int): reveal_type(x) # Type is now 'float'. else: reveal_type(x) # Type is now 'int'. return str(x) The reveal_type function is provided by Mypy and you don’t need to import this. But remember to remove the function before executing the snippet. Otherwise, Python will raise a runtime error as the function is only understood by Mypy. If you run Mypy against this snippet, it’ll print the following lines: ...

Why 'NoReturn' type exists in Python

Technically, the type of None in Python is NoneType. However, you’ll rarely see types.NoneType being used in the wild as the community has pretty much adopted None to denote the type of the None singleton. This usage is also documented1 in PEP-484. Whenever a callable doesn’t return anything, you usually annotate it as follows: # src.py from __future__ import annotations def abyss() -> None: return But sometimes a callable raises an exception and never gets the chance to return anything. Consider this example: ...

Add extra attributes to enum members in Python

While grokking the source code of http.HTTPStatus module, I came across this technique to add extra attributes to the values of enum members. Now, to understand what do I mean by adding attributes, let’s consider the following example: # src.py from __future__ import annotations from enum import Enum class Color(str, Enum): RED = "Red" GREEN = "Green" BLUE = "Blue" Here, I’ve inherited from str to ensure that the values of the enum members are strings. This class can be used as follows: ...

Peeking into the internals of Python's 'functools.wraps' decorator

The functools.wraps decorator allows you to keep your function’s identity intact after it’s been wrapped by a decorator. Whenever a function is wrapped by a decorator, identity properties like—function name, docstring, annotations of it get replaced by those of the wrapper function. Consider this example: from __future__ import annotations # In < Python 3.9, import this from the typing module. from collections.abc import Callable from typing import Any def log(func: Callable) -> Callable: def wrapper(*args: Any, **kwargs: Any) -> Any: """Internal wrapper.""" val = func(*args, **kwargs) return val return wrapper @log def add(x: int, y: int) -> int: """Add two numbers. Parameters ---------- x : int First argument. y : int Second argument. Returns ------- int Returns the summation of two integers. """ return x + y if __name__ == "__main__": print(add.__doc__) print(add.__name__) Here, I’ve defined a simple logging decorator that wraps the add function. The function add has its own type annotations and docstring. So, you’d expect the docstring and name of the add function to be printed when the above snippet gets executed. However, running the script prints the following instead: ...

Limit concurrency with semaphore in Python asyncio

I was working with a rate-limited API endpoint where I continuously needed to send short-polling GET requests without hitting HTTP 429 error. Perusing the API doc, I found out that the API endpoint only allows a maximum of 100 requests per second. So, my goal was to find out a way to send the maximum amount of requests without encountering the too-many-requests error. I picked up Python’s asyncio and the amazing HTTPx library by Tom Christie to make the requests. This is the naive version that I wrote in the beginning; it quickly hits the HTTP 429 error: ...

Amphibian decorators in Python

Whether you like it or not, the split world of sync and async functions in the Python ecosystem is something we’ll have to live with; at least for now. So, having to write things that work with both sync and async code is an inevitable part of the journey. Projects like Starlette1, HTTPx2 can give you some clever pointers on how to craft APIs that are compatible with both sync and async code. ...

Go Rusty with exception handling in Python

While grokking Black formatter’s codebase, I came across this1 interesting way of handling exceptions in Python. Exception handling in Python usually follows the EAFP paradigm where it’s easier to ask for forgiveness than permission. However, Rust has this recoverable error2 handling workflow that leverages generic Enums. I wanted to explore how Black emulates that in Python. This is how it works: # src.py from __future__ import annotations from typing import Generic, TypeVar, Union T = TypeVar("T") E = TypeVar("E", bound=Exception) class Ok(Generic[T]): def __init__(self, value: T) -> None: self._value = value def ok(self) -> T: return self._value class Err(Generic[E]): def __init__(self, e: E) -> None: self._e = e def err(self) -> E: return self._e Result = Union[Ok[T], Err[E]] In the above snippet, two generic types Ok and Err represent the return type and the error types of a callable respectively. These two generics were then combined into one Result generic type. You’d use the Result generic to handle exceptions as follows: ...

Variance of generic types in Python

I’ve always had a hard time explaining variance of generic types while working with type annotations in Python. This is an attempt to distill the things I’ve picked up on type variance while going through PEP-483. A pinch of type theory A generic type is a class or interface that is parameterized over types. Variance refers to how subtyping between the generic types relates to subtyping between their parameters' types. ...

Create a sub dictionary with O(K) complexity in Python

How’d you create a sub dictionary from a dictionary where the keys of the sub-dict are provided as a list? I was reading a tweet1 by Ned Bachelder on this today and that made me realize that I usually solve it with O(DK) complexity, where K is the length of the sub-dict keys and D is the length of the primary dict. Here’s how I usually do that without giving it any thoughts or whatsoever: ...

Gotchas of early-bound function argument defaults in Python

I was reading a tweet about it yesterday and that didn’t stop me from pushing a code change in production with the same rookie mistake today. Consider this function: # src.py from __future__ import annotations import logging import time from datetime import datetime def log( message: str, /, *, level: str, timestamp: str = datetime.utcnow().isoformat(), ) -> None: logger = getattr(logging, level) # Avoid f-string in logging as it's not lazy. logger("Timestamp: %s \nMessage: %s\n" % (timestamp, message)) if __name__ == "__main__": for _ in range(3): time.sleep(1) log("Reality can often be disappointing.", level="warning") Here, the function log has a parameter timestamp that computes its default value using the built-in datetime.utcnow().isoformat() method. I was under the impression that the timestamp parameter would be computed each time when the log function was called. However, that’s not what happens when you try to run it. If you run the above snippet, you’ll get this instead: ...

Use 'assertIs' to check literal booleans in Python unittest

I used to use Unittest’s self.assertTrue / self.assertFalse to check both literal booleans and truthy/falsy values in Unittest. Committed the same sin while writing tests in Django. I feel like assertTrue and assertFalse are misnomers. They don’t specifically check literal booleans, only truthy and falsy states respectively. Consider this example: # src.py import unittest class TestFoo(unittest.TestCase): def setUp(self): self.true_literal = True self.false_literal = False self.truthy = [True] self.falsy = [] def is_true(self): self.assertTrue(self.true_literal, True) def is_false(self): self.assertFalse(self.false_literal, True) def is_truthy(self): self.assertTrue(self.truthy, True) def is_falsy(self): self.assertFalse(self.falsy, True) if __name__ == "__main__": unittest.main() In the above snippet, I’ve used assertTrue and assertFalse to check both literal booleans and truthy/falsy values. However, to test the literal boolean values, assertIs works better and is more explicit. Here’s how to do the above test properly: ...

Static typing Python decorators

Accurately static typing decorators in Python is an icky business. The wrapper function obfuscates type information required to statically determine the types of the parameters and the return values of the wrapped function. Let’s write a decorator that registers the decorated functions in a global dictionary during function definition time. Here’s how I used to annotate it: # src.py # Import 'Callable' from 'typing' module in < Py3.9. from collections.abc import Callable from functools import wraps from typing import Any, TypeVar R = TypeVar("R") funcs = {} def register(func: Callable[..., R]) -> Callable[..., R]: """Register any function at definition time in the 'funcs' dict.""" # Registers the function during function defition time. funcs[func.__name__] = func @wraps(func) def inner(*args: Any, **kwargs: Any) -> Any: return func(*args, **kwargs) return inner @register def hello(name: str) -> str: return f"Hello {name}!" The functools.wraps decorator makes sure that the identity and the docstring of the wrapped function don’t get gobbled up by the decorator. This is syntactically correct and if you run Mypy against the code snippet, it’ll happily tell you that everything’s alright. However, this doesn’t exactly do anything. If you call the hello function with the wrong type of parameter, Mypy won’t be able to detect the mistake statically. Notice this: ...

Inspect docstrings with Pydoc

How come I didn’t know about the python -m pydoc command before today! It lets you inspect the docstrings of any modules, classes, functions, or methods in Python. I’m running the commands from a Python 3.10 virtual environment but it’ll work on any Python version. Let’s print out the docstrings of the functools.lru_cache function. Run: python -m pydoc functools.lru_cache This will print the following on the console: Help on function lru_cache in functools: functools.lru_cache = lru_cache(maxsize=128, typed=False) Least-recently-used cache decorator. If *maxsize* is set to None, the LRU features are disabled and the cache can grow without bound. If *typed* is True, arguments of different types will be cached separately. For example, f(3.0) and f(3) will be treated as distinct calls with distinct results. Arguments to the cached function must be hashable. View the cache statistics named tuple (hits, misses, maxsize, currsize) with f.cache_info(). Clear the cache and statistics with f.cache_clear(). Access the underlying function with f.__wrapped__. Works for third party tools as well: ...

Check whether an integer is a power of two in Python

To check whether an integer is a power of two, I’ve deployed hacks like this: def is_power_of_two(x: int) -> bool: return x > 0 and hex(x)[-1] in ("0", "2", "4", "8") While this works1, I’ve never liked explaining the pattern matching hack that’s going on here. Today, I came across this tweet2 by Raymond Hettinger where he proposed an elegant solution to the problem. Here’s how it goes: def is_power_of_two(x: int) -> bool: return x > 0 and x.bit_count() == 1 This is neat as there’s no hack and it uses a mathematical invariant to check whether an integer is a power of 2 or not. Also, it’s a tad bit faster. ...

Uniform error response in Django Rest Framework

Django Rest Framework exposes a neat hook to customize the response payload of your API when errors occur. I was going through Microsoft’s REST API guideline1 and wanted to make the error response of my APIs more uniform and somewhat similar to this2. I’ll use a modified version of the quickstart example3 in the DRF docs to show how to achieve that. Also, we’ll need a POST API to demonstrate the changes better. Here’s the same example with the added POST API. Place this code in the project’s urls.py file. ...

Difference between constrained 'TypeVar' and 'Union' in Python

If you want to define a variable that can accept values of multiple possible types, using typing.Union is one way of doing that: from typing import Union U = Union[int, str] However, there’s another way you can express a similar concept via constrained TypeVar. You’d do so as follows: from typing import TypeVar T = TypeVar("T", int, str) So, what’s the difference between these two and when to use which? The primary difference is: T’s type needs to be consistent across multiple uses within a given scope, while U’s doesn’t. ...