Networking

Writing a circuit breaker in Go

Besides retries, circuit breakers are probably one of the most commonly employed resilience patterns in distributed systems. While writing a retry routine is pretty simple, implementing a circuit breaker needs a little bit of work. I realized that I usually just go for off-the-shelf libraries for circuit breaking and haven’t written one from scratch before. So, this is an attempt to create a sloppy one in Go. I picked Go instead of Python because I didn’t want to deal with sync-async idiosyncrasies or abstract things away under a soup of decorators. ...

Notes on building event-driven systems

I spent the evening watching this incredibly grokkable talk on event-driven services by James Eastham at NDC London 2024. Below is a cleaned-up version of my notes. I highly recommend watching the full talk if you’re interested before reading this distillation. The curse of tightly coupled microservices Microservices often start with HTTP-based request-response communication, which seems straightforward but quickly becomes a pain as systems grow. Coupling—where one service depends on another—creates a few issues. Take the order processing service in a fictional Plant-Based Pizza company. It has to talk to the pickup service, delivery service, kitchen, and loyalty point service. They’re all tied together, so if one fails, the whole system could go down. ...

Protobuffed contracts

People typically associate Google’s Protocol Buffer1 with gRPC2 services, and rightfully so. But things often get confusing when discussing protobufs because the term can mean different things: A binary protocol for efficiently serializing structured data. A language used to specify how this data should be structured. In gRPC services, you usually use both: the protobuf language in proto files defines the service interface, and then the clients use the same proto files to communicate with the services. ...

Crossing the CORS crossroad

Every once in a while, I find myself skimming through the MDN docs to jog my memory on how CORS1 works and which HTTP headers are associated with it. This is particularly true when a frontend app can’t talk to a backend service I manage due to a CORS error2. MDN’s CORS documentation is excellent but can be a bit verbose for someone just looking for a way to quickly troubleshoot and fix the issue at hand. ...

Rate limiting via Nginx

I needed to integrate rate limiting into a relatively small service that complements a monolith I was working on. My initial thought was to apply it at the application layer, as it seemed to be the simplest route. Plus, I didn’t want to muck around with load balancer configurations, and there’s no shortage of libraries that allow me to do this quickly in the app. However, this turned out to be a bad idea. In the event of a DDoS or thundering herd incident, even if the app rejects the influx of inbound requests, the app server workers still have to do a minimal amount of work. ...

Using DNS record to share text data

This morning, while browsing Hacker News, I came across a neat trick1 that allows you to share textual data by leveraging DNS TXT records. It can be useful for sharing a small amount of data in environments that restrict IP but allow DNS queries, or to bypass censorship. To test this out, I opened my domain registrar’s panel and created a new TXT type DNS entry with a base64 encoded message containing the poem A Poison Tree by William Blake. The message can now be queried and decoded with the following shell command: ...

Implementing a simple traceroute clone in Python

I was watching this amazing lightning talk1 by Karla Burnett and wanted to understand how traceroute works in Unix. Traceroute is a tool that shows the route of a network packet from your computer to another computer on the internet. It also tells you how long it takes for the packet to reach each stop along the way. It’s useful when you want to know more about how your computer connects to other computers on the internet. For example, if you want to visit a website, your computer sends a request to the website’s server, which is another computer that hosts the website. But the request doesn’t go directly from your computer to the server. It has to pass through several other devices, such as routers, that help direct the traffic on the internet. These devices are called hops. Traceroute shows you the list of hops that your request goes through, and how long it takes for each hop to respond. This can help you troubleshoot network problems, such as slow connections or unreachable websites. ...

Building a CORS proxy with Cloudflare Workers

Cloudflare absolutely nailed the serverless function DX with Cloudflare Workers1. However, I feel like it’s yet to receive widespread popularity like AWS Lambda since as of now, the service only offers a single runtime—JavaScript. But if you can look past that big folly, it’s a delightful piece of tech to work with. I’ve been building small tools with it for a couple of years but never got around to writing about the immense productivity boost it usually gives me whenever I need to quickly build and deploy a self-contained service. ...

Fixed-time job scheduling with UNIX 'at' command

This weekend, I was working on a fun project that required a fixed-time job scheduler to run a curl command at a future timestamp. I was aiming to find the simplest solution that could just get the job done. I’ve also been exploring Google Bard1 recently and wanted to see how it stacks up against other LLM tools like ChatGPT, BingChat, or Anthropic’s Claude in terms of resolving programming queries. ...

Pushing real-time updates to clients with Server-Sent Events (SSEs)

In multi-page web applications, a common workflow is where a user: Loads a specific page or clicks on some button that triggers a long-running task. On the server side, a background worker picks up the task and starts processing it asynchronously. The page shouldn’t reload while the task is running. The backend then communicates the status of the long-running task in real-time. Once the task is finished, the client needs to display a success or an error message depending on the final status of the finished task. The de facto tool for handling situations where real-time bidirectional communication is necessary is WebSocket1. However, in the case above, you can see that the communication is mostly unidirectional where the client initiates some action in the server and then the server continuously pushes data to the client during the lifespan of the background job. ...

Tinkering with Unix domain sockets

I’ve always had a vague idea about what Unix domain sockets are from my experience working with Docker for the past couple of years. However, lately, I’m spending more time in embedded edge environments and had to explore Unix domain sockets in a bit more detail. This is a rough documentation of what I’ve explored to gain some insights. The dry definition Unix domain sockets (UDS) are similar to TCP sockets in a way that they allow two processes to communicate with each other, but there are some core differences. While TCP sockets are used for communication over a network, Unix domain sockets are used for communication between processes running on the same computer. ...

Signal handling in a multithreaded socket server

While working on a multithreaded socket server in an embedded environment, I realized that the default behavior of Python’s socketserver.ThreadingTCPServer requires some extra work if you want to shut down the server gracefully in the presence of an interruption signal. The intended behavior here is that whenever any of SIGHUP, SIGINT, SIGTERM, or SIGQUIT signals are sent to the server, it should: Acknowledge the signal and log a message to the output console of the server. Notify all the connected clients that the server is going offline. Give the clients enough time (specified by a timeout parameter) to close the requests. Close all the client requests and then shut down the server after the timeout exceeds. Here’s a quick implementation of a multithreaded echo server and see what happens when you send SIGINT to shut down the server: ...

Pausing and resuming a socket server in Python

I needed to write a socket server in Python that would allow me to intermittently pause the server loop for a while, run something else, then get back to the previous request-handling phase; repeating this iteration until the heat death of the universe. Initially, I opted for the low-level socket module to write something quick and dirty. However, the implementation got hairy pretty quickly. While the socket module gives you plenty of control over how you can tune the server’s behavior, writing a server with robust signal and error handling can be quite a bit of boilerplate work. ...

Stream process a CSV file in Python

A common bottleneck for processing large data files is—memory. Downloading the file and loading the entire content is surely the easiest way to go. However, it’s likely that you’ll quickly hit OOM errors. Often time, whenever I have to deal with large data files that need to be downloaded and processed, I prefer to stream the content line by line and use multiple processes to consume them concurrently. For example, say, you have a CSV file containing millions of rows with the following structure: ...