AWS Lambda Performance Optimization


How to optimize AWS Lambda performance


Lambda is a managed AWS service that promises to take care of server infrastructure for you. However, it still requires careful design to get the best performance out of the computation capabilities it provides, and avoid latency and service disruption for users.

This is part of our comprehensive guide to performance testing in a cloud native world.

12 Quick Tips for AWS Lambda Performance Optimization 

Before we get into the details of AWS Lambda Optimization, here are 12 quick tips you can immediately use to improve your performance:

  1. Language – prefer interpreted languages like Node.js or Python over languages like Java and C# if cold start time is affecting user experience. Keep in mind that Java and other compiled languages perform better than interpreted ones for subsequent requests.
  2. Java start time – if you have to go with Java, cold starts will be a bigger problem due to the long startup time of Java applications. Use Provisioned Concurrency to address the cold start time issue.
  3. Framework – prefer Spring Cloud Functions rather than the Spring Boot web framework. 
  4. Network – use the default network environment unless you need a VPC resource with private IP. This is because setting up ENIs takes significant time and adds to the cold start time. However, note that AWS has significantly improved the way ENIs connect to Lambda environments with customer VPCs.
  5. Dependencies – remove all unnecessary dependencies that are not required to run the function. Keep only the ones that are required at runtime.
  6. Variables – use Global/Static variables and singleton objects, because these remain alive until the container goes down. The advantage is that any subsequent call does not need to reinitialize these variables or objects.
  7. Database connections – define DB connections at the global level so that they can be reused for subsequent invocations. AWS has recently released RDS Proxy in preview, which should make applications more scalable and resilient to database failures.
  8. DNS resolution – if your Lambda is running in a VPC and is calling an AWS resource, avoid DNS resolution, because it takes significant time. For example, if your Lambda function accesses an Amazon RDS DB instance in your VPC, launch the instance with the non-publicly accessible option.
  9. Dependency injection – if you are using Java, prefer simpler IoC dependency injections like Dagger, rather than full Spring Framework DI.
  10. JAR files – if you are using Java, put your dependency .jar files in a separate /lib directory, not together with function code. This speeds up the package unpacking process.
  11. Minification – if you are using Node.js, you can use minification and/or uglification of the code to reduce the size of the package. This will reduce the time it takes to download the package significantly. In some cases, package size may reduce from 10MB to 1MB.
  12. SDK library – most blog posts and documentation say that the Lambda execution environment already has AWS SDK for Node.js and Python, and you shouldn’t add them in your dependency. However, this SDK library will be upgraded with the latest patches regularly and may impact your Lambda behavior. So it’s preferable to have your own dependency management.
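Tips 6 and 7 can be illustrated with a minimal Python sketch. It assumes a handler named handler and uses an in-memory sqlite3 database purely as a stand-in for a real database client; the helper get_connection and the counter _init_count are hypothetical names introduced for illustration.

```python
import sqlite3

# Module-level (global) state: created once per execution environment,
# then reused by every warm invocation that lands on that environment.
_connection = None
_init_count = 0  # illustration only: counts how often a connection is opened


def get_connection():
    """Return the shared connection, creating it on first use only."""
    global _connection, _init_count
    if _connection is None:
        # Stand-in for an expensive connect call to a real database.
        _connection = sqlite3.connect(":memory:")
        _init_count += 1
    return _connection


def handler(event, context):
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"result": row[0], "connections_opened": _init_count}
```

Calling handler twice in the same process opens the connection only once, which is the behavior a warm execution environment would show.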

Setting Up AWS Lambda Performance Monitoring 

In order to optimize Lambda performance, you must have performance monitoring in place.  It is critical to measure and understand the behavior of functions during invocation. These metrics will help to fine-tune configuration and get the best performance out of these functions. 

CloudWatch, by default, logs details of each Lambda execution in LogGroup/LogStream. Using the log details, CloudWatch displays the dashboards for various metrics like “number of requests”, “execution duration”, “error rate” and many more. These metrics can be used to create custom alarms.
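As a rough sketch of what can be pulled out of those logs, the snippet below parses the REPORT line that Lambda writes at the end of each invocation. The sample line and its values are made up for illustration, and parse_report is a hypothetical helper; the Init Duration field normally appears only on cold starts.

```python
# Hypothetical sample in the shape of a Lambda REPORT log line (tab-separated).
SAMPLE = (
    "REPORT RequestId: 00000000-0000-0000-0000-000000000000\t"
    "Duration: 110.25 ms\tBilled Duration: 200 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 87 MB\tInit Duration: 312.40 ms"
)


def parse_report(line):
    """Return the numeric fields of a REPORT line, keyed by field name."""
    fields = {}
    for part in line.strip().split("\t"):
        key, sep, value = part.partition(": ")
        if not sep or key.startswith("REPORT"):
            continue  # skip the request ID and any empty trailing segment
        number, _, _unit = value.partition(" ")
        fields[key] = float(number)
    return fields


report = parse_report(SAMPLE)
# Share of the configured memory that the invocation actually used.
memory_used_ratio = report["Max Memory Used"] / report["Memory Size"]
# An "Init Duration" field indicates that this invocation was a cold start.
was_cold_start = "Init Duration" in report
```

Aggregating these fields over many invocations gives the memory usage and duration figures needed to judge whether a function is over- or under-provisioned.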

CloudWatch only shows details for Lambda functions, but what if we need to know how the downstream AWS services (e.g. DynamoDB, S3) are doing for each Lambda invocation? X-Ray can be useful for viewing a function’s execution together with its downstream services. This helps to track and debug issues between a Lambda function and other services.

Note: Adding X-Ray in your Node.js code package adds almost 6MB and will also add to the compute time of your function execution. It is recommended to use it only when it is needed to debug an issue, then remove it. 


AWS Lambda Performance Tuning In-Depth

AWS Lambda Limits – Tuning Memory and CPU

AWS Lambda provides memory ranges from 128 MB to 3,008 MB in 64 MB increments. Although you only specify the RAM, AWS allocates CPU power to the function in linear proportion to it. At 1,792 MB, a function gets the equivalent of one full vCPU (one vCPU-second of credits per second).

If you have a single-threaded app, you shouldn’t select more than 1.8 GB RAM, as it cannot make use of the additional CPU and the cost will increase. Conversely, if you have selected less than 1.8 GB RAM, multi-threading CPU-bound code won’t reduce Lambda execution time, because the function has less than one full vCPU to share between threads.
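The proportional relationship can be written down as a small estimate. This is a sketch based only on the figures quoted in this article (one vCPU at 1,792 MB and a ceiling of two cores); estimated_vcpus is a hypothetical helper, not an AWS API.

```python
FULL_VCPU_MB = 1792  # memory size this article associates with one full vCPU
MAX_VCPUS = 2        # core ceiling quoted in this article


def estimated_vcpus(memory_mb):
    """Estimate the CPU share for a memory setting, assuming linear scaling."""
    return min(memory_mb / FULL_VCPU_MB, MAX_VCPUS)


# A 896 MB function gets roughly half a vCPU under this model.
half_share = estimated_vcpus(896)
```

Under this model, a single-threaded function gains nothing from settings where the estimate exceeds 1.0.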

Lambda is billed in 100-ms increments. For that reason, when allocating memory to a function, you need to consider that choosing the smallest RAM may reduce the memory cost but increase latency. The longer execution time of the function may also outweigh the cost savings.

How can you balance memory and Lambda execution time?

As we know, Lambda cost depends on both memory allocation and execution time. If we need to reduce the Lambda execution time, we can try increasing memory (and by extension, CPU) to process it faster. However, increasing the memory past a certain limit won’t improve the execution time, as AWS currently offers a maximum of two CPU cores.

If your application leans more towards computation logic (i.e. it’s CPU-centric), increasing the memory makes sense as it will reduce the execution time drastically and save on cost per execution.

Also, it’s worth paying attention to the fact that AWS charges for Lambda execution in increments of 100ms. So, if the average execution time for your function is 110ms, it will charge you for 200ms. So increasing memory and bringing execution time down to below 100ms can deliver a worthwhile cost saving. 
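The rounding effect is easy to check with a little arithmetic. The sketch below assumes the 100-ms billing increment described above; the price argument is a placeholder in relative units, not an actual AWS rate, and both function names are hypothetical.

```python
import math

BILLING_INCREMENT_MS = 100  # billing granularity described in this article


def billed_ms(duration_ms):
    """Round a duration up to the next billing increment."""
    return math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS


def invocation_cost(duration_ms, memory_mb, price_per_gb_second):
    """Cost of one invocation: billed GB-seconds times the unit price."""
    gb_seconds = (memory_mb / 1024) * (billed_ms(duration_ms) / 1000)
    return gb_seconds * price_per_gb_second


# Placeholder price of 1.0 so the results read as relative GB-seconds.
small = invocation_cost(110, 512, 1.0)   # 110 ms is billed as 200 ms
large = invocation_cost(95, 1024, 1.0)   # 95 ms is billed as 100 ms
```

In this made-up comparison, doubling the memory costs the same per invocation because the billed duration halves, while the caller sees a faster response.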

Cost per invocation is charged at 100-ms increments

There are a few open-source tools available that claim to help you find the best power configuration. However, monitoring memory usage and execution time (through CloudWatch Logs, X-Ray or a commercial tool like Lumigo) is a better option. You can then adjust configurations accordingly. Increasing or decreasing the memory of even a small number of your functions can make a big difference in overall AWS cost.

Learn more in our detailed guide to AWS Lambda Limits


Understanding the Cold Start Time Issue

When we invoke a Lambda function for the first time, it downloads the code from S3, downloads all the dependencies, creates a container and starts the application before it executes the code. This whole duration (except the execution of code) is the cold start time.
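One common way to see which invocations paid this cost is a module-level flag, since module-level code runs only while a new execution environment is initialized. A minimal sketch, assuming a Python handler named handler; the returned field names are made up for illustration.

```python
import time

# Module-level code runs once, while the execution environment initializes.
_ENV_STARTED_AT = time.monotonic()
_is_cold = True


def handler(event, context):
    global _is_cold
    cold = _is_cold
    _is_cold = False  # every later call in this environment is warm
    return {
        "cold_start": cold,
        "environment_age_s": round(time.monotonic() - _ENV_STARTED_AT, 3),
    }
```

Logging the cold_start value per invocation makes it possible to count how often cold starts actually happen before deciding whether they need addressing.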

Cold start time as a proportion of total execution time

A cold start can account for a large share of total execution time, and can noticeably increase the latency of your Lambda functions.

To address this issue, AWS came up with a feature called provisioned concurrency, which can warm up the Lambda execution environments in advance. Environments will be available for applications to do immediate code execution, with no need to wait for functions to start up.

Learn more in our detailed guide to Lambda cold start performance

Reserving Lambda Concurrency to Address Cold Start Performance Problems

AWS Lambda handles scalability for you. It creates a new execution environment to handle concurrent requests. So why should you be concerned? The truth is that nothing comes with infinite resources. Similarly, when optimizing Lambda performance you need to consider concurrency execution limits:

Account-level – By default, it is 1,000 per region across all functions.

Function level – By default, the “Unreserved Account Concurrency limit” option will be selected when we create a new function. That means it can potentially use all of the available concurrency at account level (1,000 minus the concurrency used by other functions). However, this is not best practice: if one function takes up the entire limit of the account, other functions may be impacted by throttling errors. That’s why it is recommended to always configure “reserved concurrency”, supporting a bulkhead pattern.

Setting AWS reserved concurrency

Source: AWS

Note: AWS will always keep an unreserved concurrency pool with a minimum of 100 concurrent executions to process the requests of functions that don’t have any specific limit set up. So in practice, with the default account limit, you will only be able to allocate up to 900 for reserved concurrency.

When designing concurrency in Lambda, you should always consider the limitations of other integrated services like DynamoDB or RDS. We need to adjust the concurrency limit for a function based on the maximum number of connections these services can handle.
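Both constraints come down to simple arithmetic, sketched below. The defaults are the figures quoted in this article; the function names, the example reservations and the 80 percent headroom are assumptions made for illustration.

```python
ACCOUNT_LIMIT = 1000      # default per-region account concurrency, per the text
UNRESERVED_MINIMUM = 100  # pool AWS keeps unreserved, per the text


def reservable_remaining(existing_reservations, account_limit=ACCOUNT_LIMIT):
    """Concurrency still available to reserve, given current reservations."""
    reserved = sum(existing_reservations.values())
    return account_limit - UNRESERVED_MINIMUM - reserved


def concurrency_for_database(max_connections, connections_per_invocation=1,
                             headroom_percent=80):
    """Cap concurrency so the function cannot exhaust database connections."""
    usable = max_connections * headroom_percent // 100
    return usable // connections_per_invocation


# Hypothetical account with two functions that already reserve concurrency.
remaining = reservable_remaining({"orders": 300, "billing": 200})
# Hypothetical database that accepts at most 100 connections.
db_cap = concurrency_for_database(100)
```

The reserved concurrency for a database-backed function would then be the smaller of the two values.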

Concurrency is a good way to handle a large volume of requests. However, a sharp spike in traffic will still hurt application performance, because each new execution environment incurs cold start time, which means higher response latency while the environments are being created.

AWS launched Provisioned Concurrency for Lambda at re:Invent 2019 to handle these types of use cases. It provides options to provision the execution environment in advance when creating a function. It can also be auto-scaled based on CloudWatch metrics or scheduled for a particular time or day depending on requirements.

As an official partner for the launch, Lumigo now provides a range of provisioned concurrency metrics and alerts.

Read more in our detailed guide to provisioned concurrency


In this article, we’ve explored various aspects of Lambda function performance. Serverless computing offers numerous advantages to development teams ready to embrace the shift in mindset that it requires, but to get the most out of it we need to ensure the right balance between performance and cost. Following the best practices laid out here will go a long way to achieving that balance, but monitoring is critical to understanding the behavior of your serverless application in order to fine-tune performance.
