AWS Lambda with boto3

Background

AWS Lambda has been around since 2015 and has become the de facto standard for implementing serverless architectures. It offers low cost, built-in scalability and concurrency, and removes the need to provision and maintain servers.

It provides a simple way to implement Function as a Service (FaaS) in languages such as Python, Node.js, Go, Java, and many more.

To make it easy to integrate with AWS services from Python, AWS provides an SDK called boto3. It lets Python applications work with S3, DynamoDB, SQS, and many other services. Within Lambda functions, boto3 has become the standard way to talk to these services for storing, retrieving, and deleting data.

In this article, we will look at boto3's key features and how to use them to build a Lambda function.

Boto3 Key Features

Resource APIs

Resource APIs provide resource objects and collections to access attributes and perform actions, hiding the low-level network calls. Resources represent an object-oriented interface to AWS. A resource is created by calling the resource() method of the default session and passing in an AWS service name. For example:

import boto3

sqs = boto3.resource('sqs')
s3 = boto3.resource('s3')

Every resource instance has attributes and methods that are split up into identifiers, attributes, actions, references, sub-resources, and collections.

Resources can also be split into service resources (like sqs, s3, or ec2) and individual resources (like sqs.Queue or s3.Bucket). Service resources do not have identifiers or attributes; otherwise, the two share the same components.

Identifiers

An identifier is a unique value used by a resource instance to call actions.

Resources must have at least one identifier, except for the service resources (e.g. sqs or s3).

For example:

# S3 Object (bucket_name and key are identifiers)
obj = s3.Object(bucket_name='boto3', key='test.py')

Actions

An action is a method that makes a service call. Actions may return a low-level response, a list of new resource instances or a new resource instance. For example:

messages = queue.receive_messages()
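
To put this in context, here is a minimal sketch (the queue name is purely an example and assumes the queue already exists) showing the different return types described above:

import boto3

sqs = boto3.resource('sqs')

# Returns an existing Queue resource instance (hypothetical queue name)
queue = sqs.get_queue_by_name(QueueName='test-queue')

# Returns a low-level response (a plain dict)
response = queue.send_message(MessageBody='hello from boto3')

# Returns a list of new Message resource instances
messages = queue.receive_messages(MaxNumberOfMessages=1)
for message in messages:
    print(message.body)
    message.delete()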

References

A reference is similar to an attribute: it may be None or a related resource instance. The resource instance does not share identifiers with its referenced resource, so it is not a strict parent-to-child relationship. For example:

instance.subnet
instance.vpc
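
As a minimal sketch of where such an instance object comes from (the instance ID below is purely illustrative), the EC2 Instance resource exposes its subnet and VPC as references:

import boto3

ec2 = boto3.resource('ec2')
instance = ec2.Instance('i-0123456789abcdef0')  # hypothetical instance ID

# References to related resource instances (each may be None)
print(instance.subnet)
print(instance.vpc)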

Sub-resources

A sub-resource is similar to a reference. The only difference is that it is a related class rather than an instance. When we instantiate a sub-resource, it shares identifiers with its parent. It is a strict parent-child relationship.

queue = sqs.Queue(url='...')
message = queue.Message(receipt_handle='...')

Collections

A collection provides an iterable interface to a group of resources, making it easy to loop over all items of a given type. For example:

sqs = boto3.resource('sqs')
for queue in sqs.queues.all():
    print(queue.url)

Waiters

A waiter is similar to an action. A waiter polls the status of a resource to check whether it has reached a particular state. Once the resource reaches the desired state, the waiter returns; otherwise, it keeps polling until the state is reached or a failure occurs. For example, we can create a bucket and use a waiter to wait until it is ready before retrieving objects:

bucket.wait_until_exists()
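
A minimal sketch of that flow, assuming a hypothetical bucket name and the default us-east-1 region (other regions require a CreateBucketConfiguration):

import boto3

s3 = boto3.resource('s3')

# Create the bucket (hypothetical name) and block until it actually exists
bucket = s3.create_bucket(Bucket='boto3-waiter-demo-bucket')
bucket.wait_until_exists()
print('Bucket is ready to use')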

Service-specific High-level Features

Boto3 comes with several other service-specific features, such as automatic multi-part transfers for Amazon S3 and simplified query conditions for DynamoDB.
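
Here is a brief sketch of both features (the bucket and file names are only placeholders): upload_file transparently switches to multipart transfers for large files, and the conditions module builds DynamoDB query expressions without hand-written strings:

import boto3
from boto3.dynamodb.conditions import Key

# S3: upload_file automatically performs multipart transfers for large files
s3 = boto3.client('s3')
s3.upload_file('large_file.zip', 'my-example-bucket', 'large_file.zip')

# DynamoDB: simplified query conditions instead of raw expression strings
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('customer')
response = table.query(KeyConditionExpression=Key('customerId').eq('xy100'))
print(response['Items'])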

Serverless Application with Lambda and Boto3

Now that we have an idea of what boto3 is and what features it provides, let’s build a simple serverless application with Lambda and boto3.

The use case: when a file is uploaded to an S3 bucket, a Lambda function is triggered that reads the file and stores its contents in a DynamoDB table. The architecture will look like this:

Serverless App with Lambda and Boto3

We will use Python for this use case and take advantage of the boto3 SDK to speed up our development work.

Step 1: Create a JSON File

Let’s first create a small JSON file with some sample customer data. This is the file we will upload to the S3 bucket. Let’s name it data.json.

#data.json

{
    "customerId": "xy100",
    "firstName": "Tom",
    "lastName": "Alter",
    "status": "active",
    "isPremium": true
}

Step 2: Create S3 Bucket

Now, let’s create an S3 bucket where the JSON file will be uploaded. Let’s name it boto3customer. We have created the bucket with the default settings for this example:

create s3 bucket
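
If you prefer to do this programmatically, a minimal boto3 sketch (assuming the default us-east-1 region; other regions need a LocationConstraint) would be:

import boto3

s3 = boto3.client('s3')
# Create the bucket with default settings (bucket names must be globally unique)
s3.create_bucket(Bucket='boto3customer')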

Step 3: Create DynamoDB Table

Let’s create a DynamoDB table (customer) into which we will load the JSON file, with customerId as the partition key. We need to ensure that our data.json file contains this field when inserting into the table; otherwise, DynamoDB will complain about the missing key.

create dynamodb table
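
The same table can also be created with boto3; a minimal sketch using on-demand billing (an assumption, the console default may differ) looks like this:

import boto3

dynamodb = boto3.client('dynamodb')
# Create the customer table with customerId as the partition (hash) key
dynamodb.create_table(
    TableName='customer',
    KeySchema=[{'AttributeName': 'customerId', 'KeyType': 'HASH'}],
    AttributeDefinitions=[{'AttributeName': 'customerId', 'AttributeType': 'S'}],
    BillingMode='PAY_PER_REQUEST'
)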

Step 4: Create Lambda Function

Here, we first need to create an IAM role with access to CloudWatch Logs, S3, and DynamoDB so the function can interact with these services. Then, we will write code using boto3 to download the file, parse it, and save it into the customer DynamoDB table. Finally, we will create a trigger connecting the S3 bucket to Lambda so that, as soon as we push a file to the bucket, the Lambda function picks it up.

Let’s first create the IAM role. The role needs at least read access to S3, write access to DynamoDB, and full access to CloudWatch Logs so that every event can be logged:

create iam role
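
As a rough sketch of creating an equivalent role with boto3 (the role name mirrors the one selected below; the AWS managed policies used here are broader than strictly necessary and are only an illustration):

import json
import boto3

iam = boto3.client('iam')

# Trust policy allowing the Lambda service to assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

iam.create_role(
    RoleName='LambdaS3DyanamoDB',
    AssumeRolePolicyDocument=json.dumps(trust_policy)
)

# Attach managed policies covering S3 read, DynamoDB write, and CloudWatch Logs
for policy_arn in [
    'arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess',
    'arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess',
    'arn:aws:iam::aws:policy/CloudWatchLogsFullAccess',
]:
    iam.attach_role_policy(RoleName='LambdaS3DyanamoDB', PolicyArn=policy_arn)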

Now, create a function. Give it a unique name and select Python 3.7 as the runtime:

create a lambda function

Now, select the LambdaS3DyanamoDB role we created in the earlier step and hit the Create function button:

create lambda function

Now follow the steps below for the Lambda function:

  • Write the Python code using the boto3 resource API to load the service instance objects.
  • The event object passes in the metadata of the file (S3 bucket and file name); a minimal sample event is shown after the code snippet.
  • Then, using the s3_client action methods, load the S3 file data into a JSON object.
  • Parse the JSON data and save it into the DynamoDB table (customer).

#Code snippet

import json
import urllib.parse

import boto3

dynamodb = boto3.resource('dynamodb')
s3_client = boto3.client('s3')

table = dynamodb.Table('customer')

def lambda_handler(event, context):
    # Retrieve file information from the S3 event
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    # Object keys arrive URL-encoded in the event notification, so decode them
    s3_file_name = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Load the file content from S3
    json_object = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    json_file_reader = json_object['Body'].read()
    json_dict = json.loads(json_file_reader)

    # Save the data in the DynamoDB table
    table.put_item(Item=json_dict)
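
To see how the handler consumes the event object, here is a minimal hand-crafted S3 event for a quick local test (it contains only the fields the handler reads, and running it still requires AWS credentials plus the real bucket and table):

# Minimal fake S3 event for invoking lambda_handler locally
sample_event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'boto3customer'},
            'object': {'key': 'data.json'}
        }
    }]
}

lambda_handler(sample_event, None)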

Next, create an S3 trigger (a boto3 sketch of the same configuration follows the screenshot below):

  • Select the bucket name created in the earlier step.
  • Select the event type “All object create events”.
  • Enter .json as the suffix.
  • Click the Add button.

create s3 trigger
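
For completeness, the same notification can be configured with boto3. Here is a rough sketch (the account ID and region in the function ARN are placeholders, and S3 must first be granted permission to invoke the function):

import boto3

lambda_client = boto3.client('lambda')
s3 = boto3.client('s3')

# Placeholder ARN of the customerupdate function
function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:customerupdate'

# Allow S3 to invoke the Lambda function
lambda_client.add_permission(
    FunctionName='customerupdate',
    StatementId='s3-invoke-customerupdate',
    Action='lambda:InvokeFunction',
    Principal='s3.amazonaws.com',
    SourceArn='arn:aws:s3:::boto3customer'
)

# Trigger the function for every object created with a .json suffix
s3.put_bucket_notification_configuration(
    Bucket='boto3customer',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': function_arn,
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {'Key': {'FilterRules': [{'Name': 'suffix', 'Value': '.json'}]}}
        }]
    }
)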

The Lambda function is now ready with all the configuration and setup in place.

Test the Lambda Function

Let’s test the customerupdate Lambda function.

  • First, we upload the data.json file to the S3 bucket boto3customer (see the sketch after this list).
  • As soon as the file lands in the bucket, it triggers the customerupdate function.
  • The function receives the file’s metadata through the event object and loads the file content using the boto3 APIs.
  • It then saves the content to the customer table in DynamoDB.
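
A minimal sketch of that upload and a follow-up check (run from any machine with AWS credentials; the names match the resources created above):

import boto3

# Upload the sample file; this fires the S3 trigger and invokes customerupdate
s3 = boto3.client('s3')
s3.upload_file('data.json', 'boto3customer', 'data.json')

# After a few seconds, the item should appear in the customer table
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('customer')
response = table.get_item(Key={'customerId': 'xy100'})
print(response.get('Item'))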

test lambda function

We can see that the file content got saved in the DynamoDB table.

This completes our implementation of a serverless application with the Python runtime using the boto3 library.

Summary

AWS Lambda is a widely used service for implementing serverless architectures. It supports many languages, including Node.js, Python, and Java. Python has been gaining traction among developers, and the boto3 SDK makes it easier to develop Lambda functions that interact with other AWS services.