Efi Merdler-Kravitz | Jan 07 2020

How to optimize AWS Lambda performance



AWS Lambda has become the most widely used deployment pattern for serverless applications. It allows developers to set aside worrying about server provisioning, maintenance, idle capacity management and scaling, and instead to focus solely on writing business logic. 

But that’s not entirely true. Because while Lambda is a self-managed AWS service, it still requires careful design to get the best performance out of the computation capabilities it provides. 

In this article, we'll walk through the various parameters to consider when tuning AWS Lambda performance.


Lambda Computing Resources – Memory & CPU

AWS Lambda provides memory ranging from 128 MB to 3,008 MB in 64 MB increments. Although we only specify the RAM, AWS allocates a linearly proportional amount of CPU power to the function. When the allocated memory reaches 1,792 MB, the function receives the equivalent of one full vCPU (one vCPU-second of credits per second).

If your code is single-threaded, you shouldn't select more than 1.8 GB of RAM, as the function cannot make use of the additional CPU and the cost will only increase. Conversely, if your code is multi-threaded and CPU-bound, selecting less than 1.8 GB of RAM means the extra threads won't help reduce execution time, since there is less than one full vCPU to share.
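To make the proportionality concrete, here is a back-of-the-envelope sketch in Node.js (our own approximation based on the numbers above, not an official AWS formula):

// Rough approximation: CPU share scales linearly with memory,
// reaching roughly one full vCPU at 1,792 MB.
const FULL_VCPU_MEMORY_MB = 1792;

function approxVcpuShare(memoryMb) {
  return memoryMb / FULL_VCPU_MEMORY_MB;
}

console.log(approxVcpuShare(128).toFixed(2));  // ~0.07 vCPU
console.log(approxVcpuShare(3008).toFixed(2)); // ~1.68 vCPU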

Lambda billing is accurate to 100 ms increments. For that reason, when allocating memory to a function, keep in mind that choosing the smallest RAM may reduce the memory cost but increase latency, and the longer execution time may outweigh the cost savings.

Balancing Memory and Execution Time

As we know, Lambda cost depends on both memory allocation and execution time. If we need to reduce a Lambda function's execution time, we can try increasing memory (and, by extension, CPU) to process the work faster. However, past a certain limit, increasing the memory for a function won't improve execution time, as AWS currently offers a maximum of two vCPU cores.

If your application leans more towards computation logic (i.e. it’s CPU-centric), increasing the memory makes sense as it will reduce the execution time drastically and save on cost per execution.

Also, it's worth paying attention to the fact that AWS charges for Lambda execution in increments of 100 ms. If the average execution time for your function is 110 ms, you will be charged for 200 ms. Increasing memory and bringing the execution time below 100 ms can therefore deliver a worthwhile cost saving.

Cost per invocation is charged at 100-ms increments
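As a rough sketch of this tradeoff (the per-GB-second price below is illustrative; check current AWS pricing):

// Hypothetical per-invocation cost with 100 ms rounding.
const PRICE_PER_GB_SECOND = 0.0000166667; // illustrative rate

function invocationCost(memoryMb, durationMs) {
  const billedMs = Math.ceil(durationMs / 100) * 100; // round up to 100 ms
  return (memoryMb / 1024) * (billedMs / 1000) * PRICE_PER_GB_SECOND;
}

console.log(invocationCost(1024, 110)); // 110 ms is billed as 200 ms
console.log(invocationCost(1536, 90));  // more memory, but billed as 100 ms and cheaper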

There are a few open source tools available that claim to help you find the best power configuration. However, monitoring memory usage and execution time – through CloudWatch Logs, X-Ray, or a commercial tool like Lumigo – is a better option, as you can then adjust configurations accordingly. Tuning the memory of even a small number of your functions can make a big difference in overall AWS cost.


Setting Reserved Concurrency Limits

AWS Lambda handles scalability for you: it creates a new execution environment for each concurrent request. So why should you be concerned? Because nothing comes with infinite resources, and when optimizing Lambda performance you need to consider the concurrency execution limits:

Account level – By default, the limit is 1,000 concurrent executions per region, across all functions.

Function level – By default, the “Unreserved Account Concurrency” option is selected when you create a new function, meaning it can potentially use all of the concurrency available at the account level (1,000 minus the concurrency used by other functions). This is not best practice: if one function takes up the entire account limit, other functions may be hit by throttling errors. That's why it is recommended to always configure reserved concurrency, supporting a bulkhead pattern.

Setting AWS reserved concurrency (Source: AWS)

Note – AWS always keeps an unreserved concurrency pool of at least 100 concurrent executions to process requests from functions that don't have a specific limit set. So, in practice, you will only be able to allocate up to 900 to reserved concurrency.
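Reserved concurrency can be set in the console, as above, or programmatically. A minimal sketch using the AWS SDK for JavaScript (v2); the function name and limit are placeholders:

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Reserve 50 concurrent executions for this function (values are examples).
lambda.putFunctionConcurrency({
  FunctionName: 'my-function',
  ReservedConcurrentExecutions: 50,
}).promise()
  .then(() => console.log('Reserved concurrency configured'))
  .catch(console.error);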

When designing concurrency in Lambda, you should always consider the limitations of other integrated services like DynamoDB or RDS, and adjust a function's concurrency limit based on the maximum number of connections those services can handle.

Concurrency is a good way to handle a large volume of requests. However, a sharp spike in traffic will hurt application performance, because creating new execution environments incurs cold start time, and that may mean higher response latency while the spike is absorbed.

AWS launched Provisioned Concurrency for Lambda at re:Invent 2019 to handle these types of use cases. It lets you provision execution environments in advance, and the provisioned capacity can also be auto-scaled based on CloudWatch metrics or scheduled for particular times or days, depending on requirements.
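A minimal sketch of configuring it with the AWS SDK for JavaScript (v2); the function name and alias are placeholders, and provisioned concurrency must target a published version or alias:

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();

// Pre-warm 10 execution environments for the 'live' alias (values are examples).
lambda.putProvisionedConcurrencyConfig({
  FunctionName: 'my-function',
  Qualifier: 'live', // alias or published version
  ProvisionedConcurrentExecutions: 10,
}).promise()
  .then((res) => console.log(res.Status))
  .catch(console.error);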

As an official partner for the launch, Lumigo now provides a range of provisioned concurrency metrics and alerts to help identify problematic cold starts and provision concurrency where it’s needed.

Cold Start Considerations

When we invoke a Lambda function for the first time, it downloads the code from S3, downloads all the dependencies, creates a container, and starts the application before it executes the code. This whole duration (everything except the execution of the code itself) is the cold start time.

Cold start time as a proportion of total execution time

A cold start can account for a significant proportion of total execution time, so AWS has been working to address the issue. Provisioned concurrency (see above) warms up Lambda execution environments ahead of time, so they are available for immediate code execution without spending time on a cold start.
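You can observe cold starts in your own functions: code at module scope runs once per new execution environment, so a flag set there distinguishes cold invocations from warm ones. A minimal Node.js sketch:

// Module scope runs once per execution environment (the "cold" part).
const initStartedAt = Date.now();
let coldStart = true;

exports.handler = async (event) => {
  if (coldStart) {
    console.log(`Init to first invocation: ${Date.now() - initStartedAt} ms`);
    coldStart = false;
  }
  return { statusCode: 200 };
};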


13 Quick Tips for AWS Lambda Optimization

1. Choose interpreted languages like Node.js or Python over languages like Java and C# if cold start time is affecting user experience.

2. Keep in mind that Java and other compiled languages perform better than interpreted ones for subsequent requests.

3. If you have to go with Java, you can use Provisioned Concurrency to address the cold start time issue. 

4. Go for Spring Cloud Function rather than the Spring Boot web framework.

5. Use the default network environment unless you need a VPC resource with private IP. This is because setting up ENIs takes significant time and adds to the cold start time. It’s worth noting, though, that a September 2019 release saw AWS significantly improve the way ENIs are assigned to connect Lambda execution environments with customer VPCs using the Hyperplane platform.

6. Remove all unnecessary dependencies that are not required to run the function. Keep only the ones that are required at runtime.

7. Use global/static variables and singleton objects, as these remain alive until the container goes down. Subsequent calls then don't need to reinitialize them.

8. Define your DB connections at the global level so they can be reused across invocations, as in the sketch below. AWS has recently released RDS Proxy in preview, which should make applications more scalable and resilient to database failures.
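A minimal Node.js sketch of tips 7 and 8; the pg client and the connection details are placeholders for whatever your function actually uses:

// Declared at module scope: initialized once per execution environment
// and reused by every warm invocation.
const { Pool } = require('pg'); // placeholder DB client
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

exports.handler = async (event) => {
  // Reuses the pooled connection instead of reconnecting on every call.
  const { rows } = await pool.query('SELECT 1');
  return rows;
};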

9. If your Lambda in a VPC is calling an AWS resource, avoid DNS resolution as it takes significant time. For example, if your Lambda function accesses an Amazon RDS DB instance in your VPC, launch the instance with the non-publicly accessible option.

10. If you are using Java, it is preferable to use a simpler IoC dependency injection framework like Dagger rather than the Spring Framework.

11. If you are using Java, it is recommended to put your dependency .jar files in a separate /lib directory rather than alongside your function code. This speeds up the package unpacking process.

12. If you are using Node.js, you can use minification and/or uglification of the code to reduce the size of the package, which significantly reduces the time it takes to download the package. In some cases, this can shrink a package from 10 MB to 1 MB.

Minification removes all the spaces/newline characters and comments.

Uglification renames variables to shorter names and otherwise obfuscates/simplifies the code.

var organizationname = "xyz"
var bigArray = [1,2,3,4,5,6]

//write some code
for(var index = 0; index < 6; index++){
  console.log(bigArray[index]);
}

After minification:

var organizationname = "xyz", bigArray = [1,2,3,4,5,6]
for(var index = 0; index < 6; index++) console.log(bigArray[index]);

After uglification:

for(var o="xyz",a=[1,2,3,4,5,6],e=0;e<6;e++)console.log(a[e])

13. You will notice that most blog posts and documentation mention that the Lambda execution environment already includes the AWS SDK for Node.js and Python, and that you shouldn't add it as a dependency. That does improve performance, but there's a catch: AWS regularly upgrades the built-in SDK with the latest patches, which may change your Lambda's behavior, so it's preferable to manage the dependency yourself.


Monitoring 

By now, it should be clear that Lambda function performance depends on a variety of criteria. So, Lambda performance monitoring is of utmost importance, in order to measure and understand the behavior of functions during invocation. These metrics will help to fine-tune configuration and get the best performance out of these functions. 

CloudWatch, by default, logs details of each Lambda execution in a LogGroup/LogStream. Using these logs, CloudWatch displays dashboards for various metrics like number of requests, execution duration, error rate, and many more. These metrics can be used to create custom alarms.
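For instance, a minimal sketch of a custom alarm on a function's error count, using the AWS SDK for JavaScript (v2); the names and thresholds are illustrative:

const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

// Alarm when the function reports more than 5 errors within a minute.
cloudwatch.putMetricAlarm({
  AlarmName: 'my-function-errors',
  Namespace: 'AWS/Lambda',
  MetricName: 'Errors',
  Dimensions: [{ Name: 'FunctionName', Value: 'my-function' }],
  Statistic: 'Sum',
  Period: 60,
  EvaluationPeriods: 1,
  Threshold: 5,
  ComparisonOperator: 'GreaterThanThreshold',
}).promise()
  .then(() => console.log('Alarm created'))
  .catch(console.error);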

CloudWatch only shows details for the Lambda functions themselves, but what if we need to know how the downstream AWS services (e.g. DynamoDB, S3) are doing for each Lambda invocation? X-Ray can be used to view a function's execution together with downstream services, which helps to track and debug Lambda function issues involving other services.
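In Node.js, wrapping the AWS SDK with the X-Ray SDK is enough to trace downstream AWS calls (a minimal sketch; the table name is a placeholder, and active tracing must be enabled on the function):

const AWSXRay = require('aws-xray-sdk-core');
const AWS = AWSXRay.captureAWS(require('aws-sdk')); // traces all AWS SDK calls

const dynamodb = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
  // This call now appears as a subsegment in the function's X-Ray trace.
  const res = await dynamodb.get({ TableName: 'my-table', Key: { id: event.id } }).promise();
  return res.Item;
};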

Note: Adding X-Ray to your Node.js code package adds almost 6 MB, and it will also add to the compute time of your function execution. It is recommended to use it only when needed to debug an issue, then remove it.

Summary

In this article, we’ve explored various aspects of Lambda function performance. Serverless computing offers numerous advantages to development teams ready to embrace the shift in mindset that it requires, but to get the most out of it we need to ensure the right balance between performance and cost. Following the best practices laid out here will go a long way to achieving that balance, but monitoring is critical to understanding the behavior of your serverless application in order to fine-tune performance. 
Run serverless with confidence! Start monitoring with Lumigo today - try it free
