Switching between multiple data streams in a single thread

· 4 min

I was working on a project where I needed to poll multiple data sources and consume the incoming data points in a single thread. In this particular case, the two data streams were coming from two different Redis lists. The correct way to consume them would be to write two separate consumers and spin them up as different processes.

However, in this scenario, I needed a simple way to poll and consume data from one data source, wait for a bit, then poll and consume from another data source, and keep doing this indefinitely. That way I could get away with doing the whole workflow in a single thread without the overhead of managing multiple processes.
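The switching loop can be sketched without Redis; here two deques stand in for the two Redis lists, and itertools.cycle does the round-robin (all names invented):

```python
import itertools
import time
from collections import deque

# Two deques standing in for the two Redis lists.
stream_a = deque([1, 2, 3])
stream_b = deque(["x", "y", "z"])


def poll(stream):
    """Pop one data point if the stream has any, else return None."""
    return stream.popleft() if stream else None


consumed = []

# Visit the sources round-robin in a single thread.
for stream in itertools.cycle((stream_a, stream_b)):
    item = poll(stream)
    if item is not None:
        consumed.append(item)
    if not stream_a and not stream_b:
        break  # for the demo only; the real loop runs indefinitely
    time.sleep(0.01)  # wait a bit before polling the next source

print(consumed)  # [1, 'x', 2, 'y', 3, 'z']
```

In the real workflow the loop would run forever instead of breaking once both sources are drained.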

Skipping the first part of an iterable in Python

· 3 min

Consider this iterable:

it = (1, 2, 3, 0, 4, 5, 6, 7)

Let’s say you want to build another iterable that includes only the elements from the first 0 onward. Usually, I’d do this:

# This returns (0, 4, 5, 6, 7).
from_zero = tuple(elem for idx, elem in enumerate(it) if idx >= it.index(0))

While this is quite terse and does the job, it won’t work with a generator: a generator has no index method, and enumerate would consume it. There’s an even more generic, and terser, way to do the same thing with the itertools.dropwhile function. Here’s how to do it:
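A sketch with dropwhile:

```python
from itertools import dropwhile

it = (1, 2, 3, 0, 4, 5, 6, 7)

# Drop elements while the predicate holds; once it fails at 0,
# everything from that point on is kept.
from_zero = tuple(dropwhile(lambda x: x != 0, it))
print(from_zero)  # (0, 4, 5, 6, 7)
```

Since dropwhile only iterates its input once, the same line works unchanged when it is a generator.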

Colon command in shell scripts

· 2 min

The colon : command is a shell built-in that does nothing and exits successfully, so it can be thought of as an alias for the true command. You can test it by opening a shell and typing a colon at the prompt, like this:

:

If you then inspect the exit code by running echo $?, you’ll see a 0 there, which is exactly what you’d see if you had used the true command.
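Beyond checking the exit code, the no-op behavior makes : handy in a few common idioms (these examples are mine, not from the post):

```shell
# Placeholder where the syntax demands a command but there's nothing to do:
if true; then
    :  # intentionally empty
fi

# Infinite loop, equivalent to 'while true':
while :; do
    break  # break immediately; a real loop would do work here
done

# Expand a parameter purely for its side effect of setting a default:
unset GREETING  # make sure it's unset for the demo
: "${GREETING:=hello}"
echo "$GREETING"  # prints hello
```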

Installing Python on macOS with asdf

· 3 min

I’ve just migrated from Ubuntu to macOS for work and am still in the process of setting up the machine. I’ve been a lifelong Linux user, and this is the first time I’ve picked up an OS that’s not just another flavor of Debian. Primarily, I work with Python, Node.js, and a tiny bit of Go. Previously, any time I had to install these language runtimes, I’d execute a bespoke script that’d install:

Save models with update_fields for better performance in Django

· 3 min

TIL that you can specify update_fields while saving a Django model to generate a leaner underlying SQL query. This yields better performance when updating multiple objects in a tight loop. To test that, I’m opening an IPython shell with the python manage.py shell -i ipython command and creating a few user objects with the following lines:

In [1]: from django.contrib.auth.models import User

In [2]: for i in range(1000):
   ...:     fname, lname = f'foo_{i}', f'bar_{i}'
   ...:     User.objects.create(
   ...:         first_name=fname, last_name=lname, username=fname)
   ...:

Here’s the underlying query Django generates when you’re trying to save a single object:
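Continuing the session above, a save restricted with update_fields might look like this (field values invented):

```
In [3]: user = User.objects.get(username='foo_0')

In [4]: user.last_name = 'baz_0'

In [5]: user.save(update_fields=['last_name'])
```

Without update_fields, the generated UPDATE sets every column of the row; with it, the query only touches last_name, which is what makes the tight-loop case cheaper.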

Returning values from a shell function

· 3 min

TIL that returning a value from a function in Bash doesn’t do what I thought it does. When you call a function that returns some value, instead of handing you the value, Bash sets that value as the exit status of the function call. Consider this example:

#!/usr/bin/bash
# script.sh

return_42() {
    return 42
}

# Call the function and set the return value to a variable.
value=$(return_42)

# Print the return value.
echo $value

I was expecting this to print out 42, but instead it doesn’t print anything to the console. Turns out, a shell function doesn’t return a value when it encounters the return builtin. Rather, return stops the execution of the function and sets the function’s exit status to its numeric argument (or, without one, to the status of the last command that ran).
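So the number after return only ever surfaces as an exit status. To actually hand data back, echo it and capture stdout; a sketch:

```shell
return_42() {
    return 42
}

# The returned number only surfaces as the exit status of the call:
return_42 || status=$?
echo "$status"  # prints 42

# To pass data back, write it to stdout and use command substitution:
get_42() {
    echo 42
}

value=$(get_42)
echo "$value"  # prints 42
```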

Compose multiple levels of fixtures in pytest

· 4 min

While reading the second version of Brian Okken’s pytest book, I came across this neat trick to compose multiple levels of fixtures. Suppose you want to create a fixture that returns some canned data from a database. Now, let’s say that invoking the fixture multiple times is expensive, and to avoid that, you want to run it only once per test session. However, you still want to clear all the database state after each test function runs. Otherwise, a test might inadvertently get coupled to another test that runs before it via the fixture’s shared state. Let’s demonstrate this:
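A minimal sketch of the pattern (names invented; the book’s real example uses an actual database):

```python
import pytest


@pytest.fixture(scope="session")
def db():
    # Expensive setup that should run only once per test session.
    database = {"users": []}  # stand-in for a real database connection
    yield database


@pytest.fixture
def clean_db(db):
    # Function-scoped fixture layered on top of the session-scoped one:
    # hands the same db to each test, then wipes the shared state.
    yield db
    db["users"].clear()


def test_add_alice(clean_db):
    clean_db["users"].append("alice")
    assert clean_db["users"] == ["alice"]


def test_add_bob(clean_db):
    # Passes only because clean_db cleared alice after the previous test.
    clean_db["users"].append("bob")
    assert clean_db["users"] == ["bob"]
```

The expensive work lives in the session-scoped db; the cheap cleanup lives in the function-scoped clean_db that depends on it.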

When to use 'git pull --rebase'

· 2 min

Whenever your local branch diverges from the remote branch, you can’t directly pull from the remote branch and merge it into the local branch. This can happen when, for example:

  • You check out a branch named alice from main to work on a feature.
  • When you’re done, you merge alice back into main.
  • After that, if the remote main has picked up new commits in the meantime, pulling it again will fail because the two histories have diverged.

Reproduce the issue

Create a new branch named alice from main. Run:
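A throwaway setup along these lines, with a local bare repo standing in for the remote (all names invented):

```shell
# A bare repo stands in for the remote.
git init --bare origin.git
git clone origin.git work
cd work
git config user.email demo@example.com
git config user.name demo
git commit --allow-empty -m "initial commit"
git push -u origin HEAD

# Branch off, do some work, and merge it back locally.
git switch -c alice
git commit --allow-empty -m "feature work"
git switch -
git merge --no-edit alice
```

If the remote’s default branch gains new commits before you push, a plain git pull on this state diverges, while git pull --rebase replays your local commits on top of the updated remote instead of creating an extra merge commit.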

Distil git logs attached to a single file

· 2 min

I run git log --oneline to list out the commit logs all the time. It prints out a compact view of the git history. Running the command in this repo gives me this:

d9fad76 Publish blog on safer operator.itemgetter, closes #130
0570997 Merge pull request #129 from rednafi/dependabot/...
6967f73 Bump actions/setup-python from 3 to 4
48c8634 Merge pull request #128 from rednafi/dependabot/pip/mypy-0.961
5b7a7b0 Bump mypy from 0.960 to 0.961

However, there are times when I need to list out the commit logs that only represent the changes made to a particular file. Here’s the command that does exactly that.
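That command is git log --oneline with a pathspec; a throwaway demo (file names invented):

```shell
# Throwaway repo to demonstrate.
git init -q demo
cd demo
git config user.email demo@example.com
git config user.name demo

echo "print('hi')" > app.py
git add app.py && git commit -qm "add app.py"

echo "# notes" > notes.md
git add notes.md && git commit -qm "add notes.md"

# Only the commits that touched app.py; --follow also tracks renames.
git log --oneline --follow -- app.py
```

The second commit never touched app.py, so only the first one shows up in the output.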

Health check a server with 'nohup $(cmd) &'

· 2 min

While working on a project with EdgeDB and FastAPI, I wanted to perform health checks against the FastAPI server in the GitHub CI. This would notify me about the working state of the application. The idea is to:

  • Run the server in the background.
  • Run the commands against the server that’ll denote that the app is in a working state.
  • Perform cleanup.
  • Exit with code 0 if the check is successful, else exit with code 1.

The following shell script demonstrates a similar workflow with a Python HTTP server. This script:
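A sketch of such a script, assuming a made-up port and timing values:

```shell
#!/usr/bin/env bash
set -u

# 1. Run a Python HTTP server in the background, immune to hangups.
nohup python3 -m http.server 8731 >/dev/null 2>&1 &
server_pid=$!

# 2. Poll the server until it responds, giving up after a few seconds.
status=1
for _ in 1 2 3 4 5; do
    if python3 -c "import urllib.request as u; u.urlopen('http://localhost:8731')" 2>/dev/null; then
        status=0
        break
    fi
    sleep 1
done

# 3. Cleanup: stop the background server.
kill "$server_pid" 2>/dev/null

# 4. Exit 0 if the check succeeded, 1 otherwise.
echo "health check status: $status"
exit "$status"
```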