Python logging quirks in AWS Lambda environment

At my workplace, while working on a Lambda function, I noticed that my Python logs weren’t appearing on the corresponding Cloudwatch log dashboard. At first, I thought that the function wasn’t picking up the correct log level from the environment variables. We were using Serverless framework and GitLab CI to deploy the function, so my first line of investigation involved checking for missing environment variables in those config files. However, I quickly realized that the environment variables were being propagated to the Lambda function as expected. So, the issue had to be coming from somewhere else. After perusing through some docs, I discovered from the source code of Lambda Python Runtime Interface Client that AWS Lambda Python runtime pre-configures a logging handler that modifies the format of the log message, and also adds some metadata to the record if available. What’s not pre-configured though is the log level. This means that no matter the type of log message you try to send, it won’t print anything. ...

October 20, 2022

Read a CSV file from s3 without saving it to the disk

I frequently have to write ad-hoc scripts that download a CSV file from AWS S3, do some processing on it, and then create or update objects in the production database using the parsed information from the file. In Python, it’s trivial to download any file from s3 via boto3, and then the file can be read with the csv module from the standard library. However, these scripts are usually run from a separate script server and I prefer not to clutter the server’s disk with random CSV files. Loading the s3 file directly into memory and reading its contents isn’t difficult but the process has some subtleties. I do this often enough to justify documenting the workflow here. ...

June 26, 2022