Caching connection objects in Python

· 2 min

To avoid instantiating multiple DB connections in Python apps, a common approach is to initialize the connection objects in a module once and then import them everywhere. So, you’d do this:

# src.py
import boto3  # pip install boto3
import redis  # pip install redis

dynamo_client = boto3.client("dynamodb")
redis_client = redis.Redis()

However, this adds import-time side effects to your module, which can turn out to be expensive. In search of a better solution, my first instinct was to reach for functools.lru_cache(None) to immortalize the connection objects in memory. It works like this:
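Here's a minimal sketch with a placeholder standing in for the expensive constructor — swap in the real boto3.client("dynamodb") / redis.Redis() calls:

```python
import functools


@functools.lru_cache(maxsize=None)
def get_client() -> object:
    # Stand-in for an expensive constructor such as
    # boto3.client("dynamodb") or redis.Redis(); the body runs only
    # on the first call, and later calls return the cached object.
    print("Creating client ...")
    return object()


# The constructor runs once; both names point to the same object.
client_a = get_client()
client_b = get_client()
print(client_a is client_b)  # True
```

Since the client is built lazily on first use, merely importing the module no longer opens any connections.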

Create a sub dictionary with O(K) complexity in Python

· 3 min

How would you create a sub dictionary from a dictionary where the keys of the sub-dict are provided as a list?

I was reading a tweet by Ned Batchelder on this today, and it made me realize that I usually solve it with O(DK) complexity, where K is the length of the sub-dict keys and D is the length of the primary dict. Here’s how I usually do that without giving it any thought whatsoever:
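The habitual O(DK) version and the O(K) fix, side by side (the sample data is made up):

```python
d = {"a": 1, "b": 2, "c": 3, "d": 4}
keys = ["a", "c"]

# O(D * K): scans every item of the primary dict and runs a linear
# membership test against the key list for each one.
sub_slow = {k: v for k, v in d.items() if k in keys}

# O(K): looks up only the wanted keys directly.
# Assumes every key in 'keys' exists in 'd'.
sub_fast = {k: d[k] for k in keys}

print(sub_slow == sub_fast == {"a": 1, "c": 3})  # True
```

The second comprehension never touches the primary dict's other D - K items, which is the whole trick.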

Use 'assertIs' to check literal booleans in Python unittest

· 1 min

I used to use unittest’s self.assertTrue / self.assertFalse to check both literal booleans and truthy/falsy values. I committed the same sin while writing tests in Django.

I feel like assertTrue and assertFalse are misnomers. They don’t specifically check literal booleans; they only check truthy and falsy states, respectively.

Consider this example:

# src.py
import unittest


class TestFoo(unittest.TestCase):
    def setUp(self):
        self.true_literal = True
        self.false_literal = False
        self.truthy = [True]
        self.falsy = []

    def test_is_true(self):
        self.assertTrue(self.true_literal)

    def test_is_false(self):
        self.assertFalse(self.false_literal)

    def test_is_truthy(self):
        self.assertTrue(self.truthy)

    def test_is_falsy(self):
        self.assertFalse(self.falsy)


if __name__ == "__main__":
    unittest.main()

In the above snippet, I’ve used assertTrue and assertFalse to check both literal booleans and truthy/falsy values. However, to test the literal boolean values, assertIs works better and is more explicit. Here’s how to do the above test properly:
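A version that uses assertIs for the literal booleans (keeping assertTrue / assertFalse for the truthiness checks) might look like this:

```python
# src.py
import unittest


class TestFoo(unittest.TestCase):
    def setUp(self):
        self.true_literal = True
        self.false_literal = False
        self.truthy = [True]
        self.falsy = []

    def test_is_true(self):
        # Passes only if the value is exactly the singleton True.
        self.assertIs(self.true_literal, True)

    def test_is_false(self):
        self.assertIs(self.false_literal, False)

    def test_is_truthy(self):
        # Truthiness is what assertTrue is actually for.
        self.assertTrue(self.truthy)

    def test_is_falsy(self):
        self.assertFalse(self.falsy)
```

Now a truthy non-boolean like [True] can no longer sneak past a test that was meant to assert the literal True.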

Check whether an integer is a power of two in Python

· 2 min

To check whether an integer is a power of two, I’ve deployed bit-twiddling hacks like this:

def is_power_of_two(x: int) -> bool:
    return x > 0 and x & (x - 1) == 0

While this bit trick works (a power of two has a single set bit, so x & (x - 1) clears it to zero), I’ve never liked explaining the bit manipulation hack that’s going on here.

Today, I came across this tweet by Raymond Hettinger where he proposed an elegant solution to the problem. Here’s how it goes:

def is_power_of_two(x: int) -> bool:
    return x > 0 and x.bit_count() == 1

This is neat as there’s no hack: it leans on the mathematical invariant that a power of two has exactly one set bit in its binary representation. Also, it’s a tad bit faster.
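Note that int.bit_count() only exists from Python 3.10 onward; on older versions, an equivalent sketch counts the set bits via bin():

```python
def is_power_of_two(x: int) -> bool:
    # A power of two has exactly one set bit in its binary form,
    # e.g. bin(8) == "0b1000".
    return x > 0 and bin(x).count("1") == 1


print(is_power_of_two(8))   # True
print(is_power_of_two(18))  # False
print(is_power_of_two(0))   # False
```

bit_count() is just the C-accelerated version of this bin-and-count dance.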

Uniform error response in Django Rest Framework

· 3 min

Django Rest Framework exposes a neat hook to customize the response payload of your API when errors occur. I was going through Microsoft’s REST API guidelines and wanted to make the error responses of my APIs more uniform and somewhat similar to this example.

I’ll use a modified version of the quickstart example in the DRF docs to show how to achieve that. Also, we’ll need a POST API to demonstrate the changes better. Here’s the same example with the added POST API. Place this code in the project’s urls.py file.
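The hook in question is DRF’s EXCEPTION_HANDLER setting. Here’s a sketch of a custom handler — the payload shape and the module path are assumptions loosely modeled on Microsoft’s example, not DRF defaults:

```python
# handlers.py
from rest_framework.views import exception_handler


def api_exception_handler(exc, context):
    # Let DRF build its default error response first.
    response = exception_handler(exc, context)

    if response is not None:
        # Re-wrap the default payload under a single top-level
        # "error" key so every API error has the same envelope.
        response.data = {
            "error": {
                "status_code": response.status_code,
                "message": response.data,
            }
        }
    return response
```

Then point DRF at it in settings.py with REST_FRAMEWORK = {"EXCEPTION_HANDLER": "handlers.api_exception_handler"}.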

Difference between constrained 'TypeVar' and 'Union' in Python

· 2 min

If you want to define a variable that can accept values of multiple possible types, using typing.Union is one way of doing that:

from typing import Union

U = Union[int, str]

However, there’s another way you can express a similar concept via constrained TypeVar. You’d do so as follows:

from typing import TypeVar

T = TypeVar("T", int, str)

So, what’s the difference between these two and when to use which? The primary difference is:

T’s type needs to be consistent across multiple uses within a given scope, while U’s doesn’t.
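To see the difference, here’s a sketch (the function names are made up, and the errors are reported by a type checker such as mypy, not at runtime):

```python
from typing import Tuple, TypeVar, Union

T = TypeVar("T", int, str)
U = Union[int, str]


def pair_t(first: T, second: T) -> Tuple[T, T]:
    # T must resolve to a single type per call: both ints or both strs.
    return (first, second)


def pair_u(first: U, second: U) -> Tuple[U, U]:
    # Each parameter independently accepts int or str.
    return (first, second)


pair_t(1, 2)      # OK: T resolves to int
pair_t("a", "b")  # OK: T resolves to str
# pair_t(1, "b")  # mypy error: T can't be both int and str
pair_u(1, "b")    # OK: the types may mix freely
```

The constrained TypeVar also preserves the resolved type in the return annotation, which Union can’t do.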

Don't wrap instance methods with 'functools.lru_cache' decorator in Python

· 6 min

Recently, I fell into this trap when I wanted to speed up a slow instance method by caching it.

When you decorate an instance method with the functools.lru_cache decorator, instances of the class encapsulating that method never get garbage collected within the lifetime of the process holding them. That’s because the cache lives on the class-level decorator and holds a strong reference to self in every cache key.

Let’s consider this example:

# src.py
import functools
import time
from typing import TypeVar

Number = TypeVar("Number", int, float, complex)


class SlowAdder:
    def __init__(self, delay: int = 1) -> None:
        self.delay = delay

    @functools.lru_cache
    def calculate(self, *args: Number) -> Number:
        time.sleep(self.delay)
        return sum(args)

    def __del__(self) -> None:
        print("Deleting instance ...")


# Create a SlowAdder instance.
slow_adder = SlowAdder(2)

# Measure performance.
start_time = time.perf_counter()
# ----------------------------------------------
result = slow_adder.calculate(1, 2)
# ----------------------------------------------
end_time = time.perf_counter()
print(f"Calculation took {end_time-start_time} seconds, result: {result}.")


start_time = time.perf_counter()
# ----------------------------------------------
result = slow_adder.calculate(1, 2)
# ----------------------------------------------
end_time = time.perf_counter()
print(f"Calculation took {end_time-start_time} seconds, result: {result}.")

Here, I’ve created a simple SlowAdder class that accepts a delay value; it then sleeps for delay seconds and calculates the sum of the inputs in the calculate method. To avoid slow recalculation for the same arguments, the calculate method was wrapped with the lru_cache decorator. The __del__ method notifies us when garbage collection has successfully cleaned up instances of the class.
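One way out of the trap (a sketch, not the only fix) is to build the cached callable per instance in __init__; the cache then belongs to the instance and can be collected with it:

```python
import functools
import time


class SlowAdder:
    def __init__(self, delay: int = 1) -> None:
        self.delay = delay
        # Wrap the bound method here instead of decorating it in the
        # class body; the cache now belongs to this instance alone.
        self.calculate = functools.lru_cache(maxsize=None)(self._calculate)

    def _calculate(self, *args):
        time.sleep(self.delay)
        return sum(args)

    def __del__(self) -> None:
        print("Deleting instance ...")
```

The wrapper and the instance still form a reference cycle, but Python’s cyclic garbage collector can break that, unlike the class-level cache, which keeps every instance alive for the life of the process.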

Cropping texts in Python with 'textwrap.shorten'

· 3 min

Problem

A common interview question that I’ve seen goes as follows:

Write a function to crop a text corpus without breaking any word.

  • The function takes a character limit up to which the text should be trimmed.
  • Make sure that the cropped text doesn’t have any trailing space.
  • Try to maximize the number of words you can pack into the trimmed text.

Your function should look something like this:

def crop(text: str, limit: int) -> str:
    """Crops 'text' upto 'limit' characters."""

    # Crop the text.
    cropped_text = perform_crop()
    return cropped_text

For example, if text looks like this:
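Taking a hypothetical sample text, a sketch built on textwrap.shorten could look like this:

```python
import textwrap


def crop(text: str, limit: int) -> str:
    """Crops 'text' up to 'limit' characters without breaking words."""
    # placeholder="" drops shorten's default " [...]" suffix, so the
    # result is the longest run of whole words fitting in 'limit',
    # with no trailing space. Note: shorten also collapses runs of
    # internal whitespace.
    return textwrap.shorten(text, width=limit, placeholder="")


text = "The quick brown fox jumps over the lazy dog"
print(crop(text, 12))  # "The quick"
```

Adding "brown" would push the result to 15 characters, so shorten stops at the last word that fits.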

Automatic attribute delegation in Python composition

· 3 min

While trying to avoid inheritance in an API that I was working on, I came across this neat trick to perform attribute delegation on composed classes. Let’s say there’s a class called Engine and you want to put an engine instance in a Car. The car has a classic ‘has-a’ relationship with the engine (inheritance usually models ‘is-a’ relationships), so composition makes more sense than inheritance here. Consider this example:
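A sketch of the trick (the class and attribute names are illustrative):

```python
class Engine:
    def __init__(self, power: int = 120) -> None:
        self.power = power

    def start(self) -> str:
        return "vroom"


class Car:
    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def __getattr__(self, name: str):
        # Called only when normal attribute lookup fails, so
        # unknown attributes fall through to the composed Engine.
        return getattr(self.engine, name)


car = Car(Engine())
print(car.start())  # "vroom" — delegated to the engine
print(car.power)    # 120
```

Because __getattr__ only fires on failed lookups, Car’s own attributes (like engine itself) are untouched, and there’s no recursion.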

Don't add extensions to shell executables

· 1 min

I was browsing through the source code of Tom Christie’s typesystem library and discovered that the shell scripts of the project don’t have any extensions attached to them. At first, I found it odd, and then it all started to make sense.

Executable scripts can be written in any language, and users don’t need to care about that.

GitHub uses this scripts-to-rule-them-all pattern successfully to normalize their scripts. According to the pattern, every project should have a folder named scripts with a subset or superset of the following files: