
Better Lambda Performance with Lumigo and the Serverless Framework

Lambda is the glue that holds serverless architectures together. Before its release, it often felt like a matter of luck whether AWS would let you connect one service directly to another. If it didn’t, you had to spin up a VM or a container to transform the events from one service into a form your target service could handle.

Because Lambda was easier to set up, people assumed that any code they deployed on it would run faster and cheaper than on other compute services. But the truth is, there’s no free lunch. Lambda functions are easier to set up and manage than VMs or containers, but Lambda is still a professional service that requires understanding to get the most out of it.

The nice thing is, while there are steps you need to take for optimization, they usually follow common patterns you can learn quickly!

This article will cover these optimization steps for an example project built with the Serverless framework. If you simply dumped your code into a Lambda and hoped for the best, here’s an excellent place to start improving.

Cold Start Optimizations

First off, the code examples I use come from this GitHub repository.

The first optimizations target the steps that happen during a cold start.

When a Lambda function hasn’t been invoked yet because you just deployed it, or when it has been idle for too long, it will go through a cold start. This means the Lambda service will load the code from S3 and spin up a new micro VM to execute your code.

If your function is invoked often, it will get a regular, or warm, start. This means the Lambda service doesn’t have to load code or start the VM; it just has to route the next event to the already-loaded function. While a cold start can take many seconds, a warm start can be as short as 20 milliseconds.
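
To make the difference visible, here is a minimal sketch of how a function can tell the two kinds of start apart. It relies on the fact that code at module scope runs only when a new execution environment is created, while the handler runs on every invocation. The names isColdStart and initializedAt are made up for this illustration and don’t come from the example repository.

```javascript
// Module scope: runs once per execution environment, i.e. during a cold start.
let isColdStart = true;
const initializedAt = Date.now();

// Handler: runs on every invocation, cold or warm.
// In a real Lambda file this function would be exported as `handler`.
const handler = async (event) => {
  const wasCold = isColdStart;
  isColdStart = false; // every later invocation in this environment is warm
  return {
    coldStart: wasCold,
    environmentAgeMs: Date.now() - initializedAt,
  };
};
```

Logging the coldStart flag is a cheap way to see how often your functions actually go through a cold start before you invest time in optimizing it.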

Splitting Dependencies

The more code Lambda has to load on a start, the longer the start will take. So, you should split all your files in a way that allows you to import only what a function needs.

Let’s take these two Lambda function handler files as an example:

The do-appsync.js file imports the utils module:

import {appsync, jimp} from "../lib/utils"

export const handler = async () => {

  appsync()

  jimp()

}

The write-batch.js file imports that same module:

import * as DynamoDB from "../lib/DynamoDB"

import {initSecret} from "../lib/utils"

export const handler = async (items) => {

  const secret = await initSecret('top-secret','TOP_SECRET')

  if(secret !== "top-secret message")

    return console.log("bad secret")

  for (const item of items)

    await DynamoDB.createItem(process.env.USERS_TABLE, item)

}

Now, if we look at the utils.js file, it becomes obvious why this is a problem:

import Jimp from 'jimp'

import AWSAppSyncClient from "aws-appsync"

const AWS = require('aws-sdk')

const ssm = new AWS.SSM({region: 'us-east-1'})

export const appsync = () => AWSAppSyncClient.name

export const jimp = () => Jimp.name

export const initSecret =

  async (parameterStoreName, envVarName) => {

    const Parameter = await ssm.getParameter({

      Name: parameterStoreName,

      WithDecryption: true,

    }).promise()

    process.env[envVarName] = Parameter.Parameter?.Value

    return Parameter.Parameter?.Value

  }

Every utility function is defined in this one file, so our Lambda functions can’t import just the code they need. They have to load all of it, including Jimp, the AppSync client, and the AWS SDK, and then call the one function they actually want. This gets even worse when your utility functions use large third-party libraries.

If you have utility functions that are used in all of your Lambda functions, this approach would be perfectly reasonable; otherwise, you should put every utility function in its own file. 

Create a utils directory instead of a utils.js file, and one file for every utility function.

In our example, we end up with four utility files: three split out of utils.js, plus the DynamoDB helper, which already had its own file in the lib directory.

The lib/dynamodb.js file:

import AWS from "aws-sdk"

const DocumentClient =

  new AWS.DynamoDB.DocumentClient({apiVersion: "2012-08-10"})

export const createItem = async (tableName, item) => {

  await DocumentClient.put({

    TableName: tableName,

    Item: item,

  }).promise();

  return item;

};

The lib/utils/jimp.js file:

import Jimp from "jimp"

export default () => Jimp.name

The lib/utils/appsync.js file:

import AWSAppSyncClient from "aws-appsync"

export default () => AWSAppSyncClient.name

The lib/utils/initSecret.js file:

const AWS = require("aws-sdk")

const ssm = new AWS.SSM({region: "us-east-1"})

export default async (parameterStoreName, envVarName) => {

  const Parameter = await ssm.getParameter({

    Name: parameterStoreName,

    WithDecryption: true,

  }).promise()

  process.env[envVarName] = Parameter.Parameter?.Value

  return Parameter.Parameter?.Value

}

The do-appsync.js function will then look like this:

import appsync from "../lib/utils/appsync"

import jimp from "../lib/utils/jimp"

export const handler = async () => {

    appsync()

    jimp()

}

And the write-batch.js function will look like this:

import * as DynamoDB from "../lib/dynamodb"

import initSecret from "../lib/utils/initSecret"

export const handler = async (items) => {

  const secret = await initSecret('top-secret','TOP_SECRET')

  if(secret !== "top-secret message")

    return console.log("bad secret")

  for (const item of items)

    await DynamoDB.createItem(process.env.USERS_TABLE, item)
}

Now, the Lambda functions only load the code they need.

Optimizing Imports

Importing a module’s root or index.js file is usually how we start when adding a new package to a project. But for many packages, these files just re-export everything from other files. That is convenient for discovering functions, but it pulls in far more code than you need.

You should always check if a package you’re using allows the direct import of submodules, especially if it’s a big package like the AWS SDK.

In our example, the SSM and DynamoDB submodules of the AWS SDK are such a case. We don’t have to import the whole SDK to use them.

So, let’s look at the changes to submodules in our utility files.

The lib/utils/initSecret.js file:

import SSM from "aws-sdk/clients/ssm"

const ssm = new SSM({region: "us-east-1"})

export default async (parameterStoreName, envVarName) => {

  const Parameter = await ssm.getParameter({

    Name: parameterStoreName,

    WithDecryption: true,

  }).promise()

  process.env[envVarName] = Parameter.Parameter?.Value

  return Parameter.Parameter?.Value

}

The lib/dynamodb.js file:

import DynamoDB from "aws-sdk/clients/dynamodb"

const DocumentClient =

  new DynamoDB.DocumentClient({apiVersion: "2012-08-10"})

export const createItem = async (tableName, item) => {

  await DocumentClient.put({

    TableName: tableName,

    Item: item,

  }).promise()

  return item

}

We don’t need any changes in our Lambda function handlers, but the code they import is now much smaller.

Bundling Code

The next step I’ll show you is bundling up code. While bundling usually isn’t a step you would take on a backend, it can help with Lambda functions. Again, on a cold start, the code is loaded into the VM by the Lambda service. The less code Lambda needs to load, the quicker the cold start. 

The Serverless Framework can help here through the serverless-bundle plugin. After installing the plugin as a development dependency, you just need to add a few lines to the serverless.yml file at the top, right under service: lambdaborghini:

package:

  individually: true

plugins:

  - serverless-bundle

This builds a separate code bundle for every function you deploy.

You might know bundling from frontend JavaScript deployments. During the build, variable and function names are shortened, and things like comments are removed from the code. While this might not give you as many gains as the last two optimization steps, it can still shave off a few hundred milliseconds from a cold start.

General Optimizations

The next optimizations will affect all invocations of your Lambda functions, not just the cold start ones.

Investigating function invocations in the Lumigo Live-Tail timeline can help you find the functions that are long-running and invoked often. If you’re on a tight budget, these are the first functions you should optimize, and the timeline might even show you which of the following techniques is most appropriate.

Moving Init Code Outside of the Function Body

If you have connections or resources your function can reuse in all of its invocations, it’s a good idea to move their initialization outside of the handler function’s body. That way, the initialization runs only once, at cold start, and its result is reused by all following invocations in the same execution environment.

The write-batch.js function will look like this:

import * as DynamoDB from "../lib/dynamodb"

import initSecret from "../lib/utils/initSecret"

const promise = initSecret("top-secret", "TOP_SECRET")

export const handler = async (items) => {

  return promise.then(async (secret) => {

    if(secret !== "top-secret message")

      return console.log("bad secret")

    for (const item of items)

      await DynamoDB.createItem(process.env.USERS_TABLE, item)

  })

}

Here, the secret is only initialized once outside of the function body and then reused for all following invocations.
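
The effect can be checked with a small sketch that runs outside of Lambda. Here, fetchSecret is a hypothetical stand-in for initSecret that counts how often it is called, so no AWS credentials are needed. The handler returns strings instead of writing to DynamoDB to keep the example self-contained.

```javascript
// Hypothetical stand-in for initSecret: counts its calls and resolves
// after a short delay, like a request to Parameter Store would.
let fetchCount = 0;
const fetchSecret = async () => {
  fetchCount += 1;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return "top-secret message";
};

// Started once per execution environment; the promise caches the result.
const secretPromise = fetchSecret();

// In a real Lambda file this function would be exported as `handler`.
const handler = async (items) => {
  // After the first invocation the promise is already settled,
  // so this await no longer costs a network round trip.
  const secret = await secretPromise;
  if (secret !== "top-secret message") return "bad secret";
  return `processed ${items.length} items`;
};
```

One thing to keep in mind with this pattern: if the initialization promise rejects, every later invocation in that execution environment sees the same rejection, so you may want to handle the error and retry the initialization inside the handler.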

Sending Requests in Parallel

If your Lambda function sends multiple requests that don’t depend on each other’s responses, you can send them in parallel. While your JavaScript code runs on a single thread, network I/O is non-blocking, so several requests can be in flight at the same time. When the responses come back, they are handed to your code on that single thread, one after another. Promises are your friends here.

The write-batch.js function will look like this:

import * as DynamoDB from "../lib/dynamodb"

import initSecret from "../lib/utils/initSecret"

const promise = initSecret("top-secret", "TOP_SECRET")

export const handler = async (items) => {

  return promise.then(async (secret) => {

    if (secret !== "top-secret message")

      return console.log("bad secret")

    const itemPromises = items.map((i) =>

      DynamoDB.createItem(process.env.USERS_TABLE, i)

    )

    await Promise.all(itemPromises)

  })

}

The main idea here is that I don’t want to wait for every single request before sending the next one, but I do still want to wait until all requests have finished.

First, we map over all items and call the createItem method of DynamoDB. But instead of using it with await, we store each of the promises the method returns into the itemPromises array. This way, the JavaScript engine won’t await the success of a request before moving on to the next.

Later, we use Promise.all. It takes an array of promises and returns a single promise that fulfills once all of them have fulfilled, or rejects as soon as one of them rejects. We await that single promise so that the Lambda function keeps running until every request is finished.
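
To see the difference between the two approaches without calling DynamoDB, the following sketch uses a hypothetical fake writer that simulates request latency with a timer and records how many calls are in flight at the same time. makeFakeWriter, writeSequentially, and writeInParallel are names invented for this illustration.

```javascript
// Hypothetical stand-in for the DynamoDB helper: each call takes `delayMs`
// and the writer tracks the highest number of simultaneous calls.
const makeFakeWriter = (delayMs = 20) => {
  const stats = { inFlight: 0, maxInFlight: 0 };
  const createItem = async (tableName, item) => {
    stats.inFlight += 1;
    stats.maxInFlight = Math.max(stats.maxInFlight, stats.inFlight);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    stats.inFlight -= 1;
    return item;
  };
  return { createItem, stats };
};

// One request at a time: each await pauses the loop until the response arrives.
const writeSequentially = async (writer, tableName, items) => {
  for (const item of items) await writer.createItem(tableName, item);
};

// All requests are started first and then awaited together.
const writeInParallel = async (writer, tableName, items) => {
  await Promise.all(items.map((item) => writer.createItem(tableName, item)));
};
```

With five items, the sequential version never has more than one request in flight, while the parallel version has all five in flight at once, so its total duration is roughly that of the slowest single request. For very large arrays, you may want to cap the number of simultaneous requests to avoid throttling.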

Batching Items

If you have to send multiple items to a service, check if it supports batch processing. This way, you can send multiple items at once and eliminate excessive round-trip times.

In our example, the lib/dynamodb module exports more than just the createItem function. It also comes with a batchWrite function that takes an array of items.

If we use the batchWrite function, the write-batch.js function will look like this:

import * as DynamoDB from "../lib/dynamodb"

import initSecret from "../lib/utils/initSecret"

const promise = initSecret("top-secret", "TOP_SECRET")

export const handler = async (items) => {

  return promise.then(async (secret) => {

    if (secret !== "top-secret message")

      return console.log("bad secret")

    await DynamoDB.batchWrite(process.env.USERS_TABLE, items)

  })

}

With this change, we send a single request to the DynamoDB service, which splits it up into individual item writes on the server side.
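
One caveat: the DynamoDB BatchWriteItem operation accepts at most 25 put or delete requests per call. Whether the repository’s batchWrite helper already splits larger arrays is not shown here, so the following is only a sketch of how chunking could look. The helper names chunk and writeInBatches are made up for this example, and the batchWrite function is passed in as a parameter so that nothing is assumed about its implementation beyond the (tableName, items) arguments used above.

```javascript
// BatchWriteItem accepts at most 25 put or delete requests per call.
const MAX_BATCH_SIZE = 25;

// Split an array into consecutive chunks of at most `size` elements.
const chunk = (items, size = MAX_BATCH_SIZE) => {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
};

// Send the items batch by batch. `batchWrite` is injected (an assumption
// of this sketch) and is called once per chunk.
const writeInBatches = async (batchWrite, tableName, items) => {
  for (const batch of chunk(items)) {
    await batchWrite(tableName, batch);
  }
};
```

For 60 items, this results in three requests with 25, 25, and 10 items instead of 60 single writes. A production version would also need to check the response for unprocessed items and retry them, since a batch write can succeed only partially.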

Summary

Writing fast Lambda functions isn’t hard, but you have to be familiar with a few tricks. Each of these optimizations can be applied right after you get your function to do its intended task. It also pays off to dive a bit into the Serverless Framework configuration: its bundling capability can shave off the last milliseconds.

And finally, to find problematic Lambda functions, you can use Lumigo’s observability features. They show you which Lambda functions are used most, so you can focus on refactoring the functions that will profit the most from the effort.
