I was working on a DRF POST API endpoint where the consumer is expected to add a URL containing a PDF file and the system would then download the file and save it to an S3 bucket. While this sounds quite straightforward, there’s one big issue. Before I started working on it, the core logic looked like this:
# src.py
from __future__ import annotations

import tempfile
from shutil import copyfileobj
from urllib.request import urlopen


def save_to_s3(src_url: str, dest_url: str) -> None:
    with tempfile.NamedTemporaryFile() as file:
        with urlopen(src_url) as response:
            # This stdlib function saves the content of the
            # response in 'file'.
            copyfileobj(response, file)

        # Logic to save the file in S3.
        _save_to_s3(dest_url)


if __name__ == "__main__":
    save_to_s3(
        "https://citeseerx.ist.psu.edu/viewdoc/download?"
        "doi=10.1.1.92.4846&rep=rep1&type=pdf",
        "https://s3-url.com",
    )
In the above snippet, there's no guardrail on how large the target file can be. A consumer could bring the server to its knees by posting a link to a ginormous file: the server would stay busy downloading it, consuming memory, disk, and bandwidth the whole time.
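One way to add that guardrail is to stream the response in chunks and abort as soon as a size cap is exceeded, rather than trusting the `Content-Length` header (which can be absent or wrong) or buffering the whole body. This is a minimal sketch, not the article's eventual solution; the `MAX_FILE_SIZE` cap and the `FileTooLargeError` name are illustrative assumptions:

```python
from __future__ import annotations

import io

MAX_FILE_SIZE = 5 * 1024 * 1024  # assumed cap: 5 MiB
CHUNK_SIZE = 64 * 1024


class FileTooLargeError(Exception):
    """Raised when the remote file exceeds the allowed size (illustrative)."""


def download_capped(response: io.BufferedIOBase, file: io.BufferedIOBase) -> None:
    """Copy 'response' into 'file' in chunks, aborting past the cap.

    Works with any readable binary stream, including the response
    object returned by urllib.request.urlopen().
    """
    total = 0
    while chunk := response.read(CHUNK_SIZE):
        total += len(chunk)
        if total > MAX_FILE_SIZE:
            # Stop immediately; at most one extra chunk was read.
            raise FileTooLargeError(f"file exceeds {MAX_FILE_SIZE} bytes")
        file.write(chunk)
```

Dropping this in place of the bare `copyfileobj(response, file)` call would bound how much a single request can download, at the cost of reimplementing the chunked copy loop by hand.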