Redowan's Reflections

Escaping the template pattern hellscape in Python

Over the years, I’ve used the template pattern1 across multiple OO languages with varying degrees of success. It was one of the first patterns I learned in the primordial hours of my software engineering career, and for some reason, it just feels like the natural way to tackle many real-world code-sharing problems. Yet, even before I jumped on board with the composition over inheritance2 camp, I couldn’t help but notice how using this particular inheritance technique spawns all sorts of design and maintenance headaches as the codebase starts to grow. ...

Python dependency management redux

One major drawback of Python’s huge ecosystem is the significant variances in workflows among people trying to accomplish different things. This holds true for dependency management as well. Depending on what you’re doing with Python—whether it’s building reusable libraries, writing web apps, or diving into data science and machine learning—your workflow can look completely different from someone else’s. That being said, my usual approach to any development process is to pick a method and give it a shot to see if it works for my specific needs. Once a process works, I usually automate it and rarely revisit it unless something breaks. ...

Implementing a simple traceroute clone in Python

I was watching this amazing lightning talk1 by Karla Burnett and wanted to understand how traceroute works in Unix. Traceroute is a tool that shows the route of a network packet from your computer to another computer on the internet. It also tells you how long it takes for the packet to reach each stop along the way. It’s useful when you want to know more about how your computer connects to other computers on the internet. For example, if you want to visit a website, your computer sends a request to the website’s server, which is another computer that hosts the website. But the request doesn’t go directly from your computer to the server. It has to pass through several other devices, such as routers, that help direct the traffic on the internet. These devices are called hops. Traceroute shows you the list of hops that your request goes through, and how long it takes for each hop to respond. This can help you troubleshoot network problems, such as slow connections or unreachable websites. ...

Bulk request Google search indexing with API

Recently, I purchased a domain for this blog and migrated the content from rednafi.github.io1 to rednafi.com2. This turned out to be a much bigger hassle than I originally thought it’d be, mostly because, despite setting redirection for almost all the URLs from the previous domain to the new one and submitting the new sitemap.xml3 to the Search Console, Google kept indexing the older domain. To make things worse, the search engine selected the previous domain as canonical, and no amount of manual requests were changing the status in the last 30 days. Strangely, I didn’t encounter this issue with Bing, as it reindexed the new site within a week after I submitted the sitemap file via their webmaster panel. ...

Building a CORS proxy with Cloudflare Workers

Cloudflare absolutely nailed the serverless function DX with Cloudflare Workers1. However, I feel like it’s yet to receive widespread popularity like AWS Lambda since as of now, the service only offers a single runtime—JavaScript. But if you can look past that big folly, it’s a delightful piece of tech to work with. I’ve been building small tools with it for a couple of years but never got around to writing about the immense productivity boost it usually gives me whenever I need to quickly build and deploy a self-contained service. ...

Fixed-time job scheduling with UNIX 'at' command

This weekend, I was working on a fun project that required a fixed-time job scheduler to run a curl command at a future timestamp. I was aiming to find the simplest solution that could just get the job done. I’ve also been exploring Google Bard1 recently and wanted to see how it stacks up against other LLM tools like ChatGPT, BingChat, or Anthropic’s Claude in terms of resolving programming queries. ...

Sorting a Django queryset by a custom sequence of an attribute

I needed a way to sort a Django queryset based on a custom sequence of an attribute. Typically, Django allows sorting a queryset by any attribute on the model or related to it in either ascending or descending order. However, what if you need to sort the queryset following a custom sequence of attribute values? Suppose, you’re working with a model called Product where you want to sort the rows of the table based on a list of product ids that are already sorted in a particular order. Here’s how it might look: ...

Periodic readme updates with GitHub Actions

I recently gave my blog1 a fresh new look and decided it was time to spruce up my GitHub profile’s2 landing page as well. GitHub has a special3 way of treating the README.md file of your repo, displaying its content as the landing page for your profile. My goal was to showcase a brief introduction about myself and my work, along with a list of the five most recent articles on my blog. Additionally, I wanted to ensure that the article list stayed up to date. ...

Associative arrays in Bash

One of my favorite pastimes these days is to set BingChat to creative mode, ask it to teach me a trick about topic X, and then write a short blog post about it to reinforce my understanding. Some of the things it comes up with are absolutely delightful. In the spirit of that, I asked it to teach me a Shell trick that I can use to mimic maps or dictionaries in a shell environment. I didn’t even know what I was expecting. ...

Deduplicating iterables while preserving order in Python

Whenever I need to deduplicate the items of an iterable in Python, my usual approach is to create a set from the iterable and then convert it back into a list or tuple. However, this approach doesn’t preserve the original order of the items, which can be a problem if you need to keep the order unscathed. Here’s a naive approach that works: from __future__ import annotations from collections.abc import Iterable # Python >3.9 def dedup(it: Iterable) -> list: seen = set() result = [] for item in it: if item not in seen: seen.add(item) result.append(item) return result it = (2, 1, 3, 4, 66, 0, 1, 1, 1) deduped_it = dedup(it) # Gives you [2, 1, 3, 4, 66, 0] This code snippet defines a function dedup that takes an iterable it as input and returns a new list containing the unique items of the input iterable in their original order. The function uses a set seen to keep track of the items that have already been seen, and a list result to store the unique items. ...

Process substitution in Bash

I needed to compare two large directories with thousands of similarly named PDF files and find the differing filenames between them. In the first pass, this is what I did: Listed out the content of the first directory and saved it in a file: ls dir1 > dir1.txt Did the same for the second directory: ls dir2 > dir2.txt Compared the difference between the two outputs: diff dir1.txt dir2.txt This returned the name of the differing files likes this: ...

Dynamic menu with select statement in Bash

Whenever I need to whip up a quick command line tool, my go-to is usually Python. Python’s CLI solutions tend to be more robust than their Shell counterparts. However, dealing with its portability can sometimes be a hassle, especially when all you want is to distribute a simple script. That’s why while toying around with argparse to create a dynamic menu, I decided to ask ChatGPT if there’s a way to achieve the same using native shell scripting. Delightfully, it introduced me to the dead-simple select command that I probably should’ve known about years ago. But I guess better late than never! Here’s what I was trying to accomplish: ...

Simple terminal text formatting with tput

When writing shell scripts, I’d often resort to using hardcoded ANSI escape codes1 to format text, such as: #!/usr/bin/env bash BOLD="\033[1m" UNBOLD="\033[22m" FG_RED="\033[31m" BG_YELLOW="\033[43m" BG_BLUE="\033[44m" RESET="\033[0m" # Print a message in bold red text on a yellow background. echo -e "${BOLD}${FG_RED}${BG_YELLOW}This is a warning message${RESET}" # Print a message in white text on a blue background. echo -e "${BG_BLUE}This is a debug message${RESET}" This shell snippet above shows how to add text formatting and color to shell script output via ANSI escape codes. It defines a few variables that contain different escape codes for bold, unbold, foreground, and background colors. Then, we echo two log messages with different colors and formatting options. ...

Building a web app to display CSV file stats with ChatGPT & Observable

Whenever I plan to build something, I spend 90% of my time researching and figuring out the idiosyncrasies of the tools that I decide to use for the project. LLM tools like ChatGPT has helped me immensely in that regard. I’m taking on more tangential side projects because they’re no longer as time-consuming as they used to be and provide me with an immense amount of joy and learning opportunities. While LLM interfaces like ChatGPT may hallucinate, confabulate, and confidently give you misleading information, they also allow you to avoid starting from scratch when you decide to work on something. Personally, this benefits me enough to keep language models in my tool belt and use them to churn out more exploratory work at a much faster pace. ...

Pushing real-time updates to clients with Server-Sent Events (SSEs)

In multi-page web applications, a common workflow is where a user: Loads a specific page or clicks on some button that triggers a long-running task. On the server side, a background worker picks up the task and starts processing it asynchronously. The page shouldn’t reload while the task is running. The backend then communicates the status of the long-running task in real-time. Once the task is finished, the client needs to display a success or an error message depending on the final status of the finished task. The de facto tool for handling situations where real-time bidirectional communication is necessary is WebSocket1. However, in the case above, you can see that the communication is mostly unidirectional where the client initiates some action in the server and then the server continuously pushes data to the client during the lifespan of the background job. ...

Tinkering with Unix domain sockets

I’ve always had a vague idea about what Unix domain sockets are from my experience working with Docker for the past couple of years. However, lately, I’m spending more time in embedded edge environments and had to explore Unix domain sockets in a bit more detail. This is a rough documentation of what I’ve explored to gain some insights. The dry definition Unix domain sockets (UDS) are similar to TCP sockets in a way that they allow two processes to communicate with each other, but there are some core differences. While TCP sockets are used for communication over a network, Unix domain sockets are used for communication between processes running on the same computer. ...

Signal handling in a multithreaded socket server

While working on a multithreaded socket server in an embedded environment, I realized that the default behavior of Python’s socketserver.ThreadingTCPServer requires some extra work if you want to shut down the server gracefully in the presence of an interruption signal. The intended behavior here is that whenever any of SIGHUP, SIGINT, SIGTERM, or SIGQUIT signals are sent to the server, it should: Acknowledge the signal and log a message to the output console of the server. Notify all the connected clients that the server is going offline. Give the clients enough time (specified by a timeout parameter) to close the requests. Close all the client requests and then shut down the server after the timeout exceeds. Here’s a quick implementation of a multithreaded echo server and see what happens when you send SIGINT to shut down the server: ...

Switching between multiple data streams in a single thread

I was working on a project where I needed to poll multiple data sources and consume the incoming data points in a single thread. In this particular case, the two data streams were coming from two different Redis lists. The correct way to consume them would be to write two separate consumers and spin them up as different processes. However, in this scenario, I needed a simple way to poll and consume data from one data source, wait for a bit, then poll and consume from another data source, and keep doing this indefinitely. That way I could get away with doing the whole workflow in a single thread without the overhead of managing multiple processes. ...

Skipping the first part of an iterable in Python

Consider this iterable: it = (1, 2, 3, 0, 4, 5, 6, 7) Let’s say you want to build another iterable that includes only the numbers that appear starting from the element 0. Usually, I’d do this: # This returns (0, 4, 5, 6, 7). from_zero = tuple(elem for idx, elem in enumerate(it) if idx >= it.index(0)) While this is quite terse and does the job, it won’t work with a generator. There’s an even more generic and terser way to do the same thing with itertools.dropwhile function. Here’s how to do it: ...

Pausing and resuming a socket server in Python

I needed to write a socket server in Python that would allow me to intermittently pause the server loop for a while, run something else, then get back to the previous request-handling phase; repeating this iteration until the heat death of the universe. Initially, I opted for the low-level socket module to write something quick and dirty. However, the implementation got hairy pretty quickly. While the socket module gives you plenty of control over how you can tune the server’s behavior, writing a server with robust signal and error handling can be quite a bit of boilerplate work. ...