This repo contains a build.sh script intended to be run in an Amazon Linux Docker container to build numpy, pandas, and scipy for use in AWS Lambda. For more info about how the script works and how to use it, see this blog post on deploying sklearn to Lambda.
An older version of this repo, now archived in the ec2-build-process branch, used an EC2 instance to perform the build and an Ansible playbook to execute it. That version still works, but the new Dockerized version doesn't require you to launch a remote instance.
To build the zipfile, pull the Amazon Linux image and run the build script in it.
$ docker pull amazonlinux:2017.09
$ docker run -v $(pwd):/outputs -it amazonlinux:2017.09 /bin/bash /outputs/build.sh
Note that the script no longer works with the amazonlinux:latest image, so use the one specified above, amazonlinux:2017.09.
That will make a file called venv.zip in the local directory that's around 67 MB.
Once you run this, you'll have a zipfile containing scipy, pandas, and numpy. To use them, add your handler file to the zip, and include the lib directory so it can be used for shared libraries. The minimum viable scipy handler would thus look like:
import os
import ctypes

for d, _, files in os.walk('lib'):
    for f in files:
        if f.endswith('.a'):
            continue
        ctypes.cdll.LoadLibrary(os.path.join(d, f))

import scipy

def handler(event, context):
    # do scipy stuff here
    return {'yay': 'done'}
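Appending a handler like the one above to the archive can be sketched with Python's zipfile module; the add_handler helper and the handler.py name are illustrative choices, not part of build.sh:

```python
import zipfile

def add_handler(archive_path, handler_path, arcname='handler.py'):
    # Open the existing archive in append mode and add the handler file
    # under the name the Lambda runtime will import it by (arcname).
    with zipfile.ZipFile(archive_path, 'a', zipfile.ZIP_DEFLATED) as zf:
        zf.write(handler_path, arcname)

# Usage: add_handler('venv.zip', 'handler.py')
```

The plain `zip` CLI works just as well; the Python version is handy if you script the packaging step.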
To add extra packages to the build, create a requirements.txt file alongside build.sh in this repo. All packages listed there will be installed in addition to scipy, pandas, numpy, and related dependencies.
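For example, a minimal requirements.txt with one pinned package per line (these names and versions are placeholders, not packages the build requires):

```
requests==2.19.1
joblib==0.12.5
```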
This script was edited to allow us to import our private repos, such as amper-core. This repository must include a private_key.txt file containing an SSH private key; here is info on how to generate an SSH key and how to add it to your GitHub profile. Docker uses the key in private_key.txt to access our private repo.
The build.sh script was also updated to use Python 3.6 and to install git, so that the packages in requirements.txt can be installed.
With just compression and stripped binaries, the full sklearn stack weighs in at 65 MB, and could probably be reduced further by:
- Pre-compiling all .pyc files and deleting their source
- Removing test files
- Removing documentation
For my purposes, 39 MB is sufficiently small; if you have any improvements to share, pull requests or issues are welcome.
Optimizations 2 and 3 above were completed by following this link. Optimization 1 was attempted, but it broke the packages.
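Optimizations 2 and 3 amount to pruning test and documentation directories from the unpacked package tree before zipping. A minimal sketch (the slim_package name and the exact set of directory names to prune are assumptions, adjust for your packages):

```python
import os
import shutil

# Directory names to delete; this set is an assumption, extend as needed.
PRUNE = {'tests', 'test', 'doc', 'docs'}

def slim_package(root):
    """Remove test and documentation directories (optimizations 2 and 3)
    from an unpacked site-packages tree rooted at `root`."""
    removed = []
    for dirpath, dirnames, _ in os.walk(root):
        for name in list(dirnames):
            if name in PRUNE:
                target = os.path.join(dirpath, name)
                shutil.rmtree(target)
                dirnames.remove(name)  # don't descend into deleted dirs
                removed.append(target)
    return removed
```

Run this against the virtualenv's site-packages directory before creating venv.zip.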
This project is MIT licensed; for license info on the numpy, scipy, and sklearn packages, see their respective sites. The full text of the MIT license is in LICENSE.txt.
- venv-3-1-18.zip - package for state-gen lambda function in covfefe (numpy, scipy, requirements)
- venv-4-23-18.zip - package for auto tuner (no longer used)
- venv-5-32-18.zip - package for cycles-gen lambda function in covfefe (numpy, scipy, pandas, requirements)
- venv-8-2-18.zip - package for cycles-gen lambda function in covfefe (numpy, scipy, pandas, scikit-learn, requirements)