Performance Testing in a Cloud Native World


What is Performance Testing?

Performance testing is the process of evaluating computer systems to see how quickly they respond to user requests. Performance testing can uncover issues that negatively impact the user experience, and provide insights on how to fix them.

Traditionally, performance testing focused on client-server systems deployed on-premises. Its goal was to build servers that could withstand peaks in application load and still deliver satisfactory performance.

In today’s cloud native world, performance testing has many new meanings—organizations are testing the performance of cloud computing systems, serverless applications, containerized architectures, and web applications.


Types of Performance Testing

While the computing world has changed dramatically over the past two decades, the same types of performance tests are still run today in cloud native environments.


| Test Type | Description |
|---|---|
| Baseline test | A baseline test is a performance test that runs a system under normal expected load. This test provides a benchmark to help you identify performance issues under irregular conditions. Also, if you run the test when a service is new, you can get rid of any obvious errors and start assessing real performance. |
| Load test | The traditional load test evaluates a system to see how it performs with higher than normal load. |
| Stress test | A stress test is a test that pushes your system to its limits, checking when it crashes, and whether it does so gracefully. During testing, load on the system is gradually increased until it reaches the point of failure. This is sometimes referred to as a “stepped” test because the increase in load is done gradually. |
| Spike test | A spike test evaluates a system with normal load, followed by a sudden jump to peak-level traffic (for example, jumping from 1,000 to 10,000 concurrent users). Many systems can crash due to sudden spikes in load. This test shows how your system reacts to spikes in traffic or transaction volume. |
| Soak test | A soak test (or durability test) is performed under conditions that can have a cumulative effect on system performance. The test reveals performance problems that can occur due to long-term stress on the system, such as memory leaks, resource leaks, data corruption and other factors that can degrade performance over time. A common guideline is to run soak tests at around 80% of maximum capacity. |
| Volume test | A volume test is similar to a stress test or load test, but instead of testing application loads, it tests the amount of data being processed, which can also have an impact on application performance. The test can have two variants: either data is incrementally added to the databases, or a large amount of data is populated before the test, and then the system is evaluated. In general, normal use of an application constantly increases database size, so you should perform volume testing on a regular basis. |
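
The test types above differ mainly in the shape of the load they apply over time. As a rough illustration, the sketch below expresses each shape as a list of stages. The function names, the stage format, and the 80% default for soak tests are assumptions made for this example and are not tied to any particular load testing tool.

```python
from typing import List, Tuple

# A stage is (duration in seconds, number of concurrent users).
Stage = Tuple[int, int]


def baseline(users: int, duration_s: int) -> List[Stage]:
    """Constant load at the normal expected level."""
    return [(duration_s, users)]


def stress(start_users: int, step_users: int, steps: int, step_duration_s: int) -> List[Stage]:
    """Stepped load that keeps increasing; run it until the system fails."""
    return [(step_duration_s, start_users + i * step_users) for i in range(steps)]


def spike(normal_users: int, peak_users: int, normal_s: int, spike_s: int) -> List[Stage]:
    """Normal load, a sudden jump to peak load, then back to normal."""
    return [(normal_s, normal_users), (spike_s, peak_users), (normal_s, normal_users)]


def soak(max_capacity_users: int, duration_s: int, fraction: float = 0.8) -> List[Stage]:
    """Long-running load at a fraction of maximum capacity (0.8 is an assumed default)."""
    return [(duration_s, round(max_capacity_users * fraction))]


def total_duration(stages: List[Stage]) -> int:
    """Total length of a profile in seconds."""
    return sum(duration for duration, _ in stages)
```

For example, `spike(1000, 10000, 300, 60)` describes five minutes at 1,000 users, one minute at 10,000 users, and five more minutes at 1,000 users.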

Performance Testing vs. Real User Monitoring

Performance testing and real user monitoring (RUM) are two techniques for evaluating software application performance. Performance testing is a proactive approach that simulates user behavior and load scenarios in a controlled environment before the software is released. This type of testing is essential for identifying potential performance issues, ensuring that the application can handle expected loads and that any potential bottlenecks are addressed.

RUM offers a reactive analysis, capturing and assessing how actual users interact with the application in real-world conditions. This method is invaluable for understanding the application’s performance across different devices, networks, and locations, providing a comprehensive view of user experience post-deployment. RUM allows developers to detect issues that may not have been evident during the initial testing phases, including how real-world variables such as network latency and device performance impact user satisfaction.

The two methodologies serve complementary roles. Performance testing helps in preparing the application for the market by identifying and resolving issues under simulated conditions, while RUM provides ongoing insights into the application’s performance in the hands of real users, offering data-driven feedback that can guide future optimizations.

Performance Testing Process for a Cloud Native Environment

Performance testing is an old discipline, and many of the old principles are still relevant today. Here is a performance testing process that can be used to test modern cloud native environments, such as cloud systems, serverless, and containerized applications.

  1. Create similar environments for dev, test, and production—in line with DevOps principles, the system under test should be as similar as possible, whether it runs on a developer’s laptop, in a testing environment or in production. This will make performance testing easier and more consistent.
  2. Identify test environments and tools—determine which tools are available for testing, preferring tools that can be used across development, testing, and production environments. Prefer tools that can run in a cloud native environment and support your workloads—and at the same time, are not limited to monitoring one cloud environment.
  3. Define key performance indicators (KPIs) and performance goals—determine the key metrics for your application’s success. These may include page load time, latency, or throughput. Consult with stakeholders in your organization to understand what levels of these KPIs will make users happy, and which levels are unacceptable.
  4. Build automated tests—create tests that simulate real activity and test your application, both under normal and unexpected conditions. Tests should be lightweight, repeatable, and able to run as part of CI/CD workflows. Basic performance “smoke tests” can run as part of every build, while more extensive tests can run during acceptance testing phases.
  5. Prepare the test environment—in a DevOps pipeline, test environments are created automatically. Collaborate with DevOps engineers to ensure that the test environment is suitable for performance testing, and has the necessary tools to run realistic tests. Understand the cost impact of performance testing, which may require scaling up systems to accommodate higher loads.
  6. Run performance tests during every iteration—make sure performance testing is an integral part of every build and every release of your system.
  7. Monitor results and tune the application—analyze test results in every iteration, identify bottlenecks and suggest changes that can improve performance. These can be part of the plan for the next development iteration.
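
Steps 3, 4, and 6 above can be combined into a simple pass/fail gate in a CI/CD pipeline. The following is a minimal sketch that assumes latency samples (in milliseconds) have already been collected by a test run. The function names and the goal values in the usage note are hypothetical.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_kpis(latencies_ms: List[float], goals_ms: Dict[float, float]) -> List[str]:
    """Compare observed latency percentiles with the agreed goals.

    goals_ms maps a percentile (for example 95) to the highest
    acceptable latency in milliseconds. Returns one message per
    violated goal; an empty list means the build passes.
    """
    failures = []
    for pct, limit in goals_ms.items():
        observed = percentile(latencies_ms, pct)
        if observed > limit:
            failures.append(f"p{pct:g} = {observed:g} ms exceeds goal of {limit:g} ms")
    return failures
```

A pipeline step could call `check_kpis(samples, {95: 200, 99: 500})` and fail the build when the returned list is not empty.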

Performance Testing in the New IT Environment: Cloud, Serverless, Containers and Apps

In the rest of this article, we’ll explain how to monitor and improve performance in modern computing environments:

  • Cloud performance testing—improving performance in public and private clouds, including AWS, Azure and Google Cloud
  • Serverless performance testing—understanding serverless monitoring challenges, tuning AWS Lambda functions, and debugging serverless applications
  • Kubernetes and container performance testing—key performance metrics for containerized applications, and how to monitor Kubernetes clusters
  • Web and mobile performance testing—how to ensure that modern web and mobile applications provide a great experience for users

Cloud Performance Testing

Cloud computing is changing the way end users deploy, monitor, and use applications. The cloud provides a large, elastic pool of computing, storage, and network resources, so you can scale applications as needed.

However, even though cloud applications are able to dynamically scale and adapt to load, it is more important than ever to measure their performance. Performance issues can be complex to detect and resolve in a cloud environment, and can have a major impact on users.

In the cloud, performance testing focuses on changing the number of concurrent users accessing the application, trying different load profiles, and measuring different performance indicators, such as throughput and latency. Testing can be done at the virtual machine level, at a service level, and at an entire application level.
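
To make the idea of varying concurrent users and measuring throughput and latency concrete, here is a minimal sketch using only the Python standard library. `fake_request` is a hypothetical stand-in for a real call to the system under test, and the worker pool models a fixed number of users who each send one request at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def fake_request() -> None:
    """Hypothetical stand-in for a request to the system under test."""
    time.sleep(0.01)


def timed_call(request: Callable[[], None]) -> float:
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def run_load(request: Callable[[], None], concurrent_users: int,
             requests_per_user: int) -> Dict[str, float]:
    """Send requests from a pool of concurrent workers and summarize the run."""
    total = concurrent_users * requests_per_user
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(lambda _: timed_call(request), range(total)))
    elapsed = time.perf_counter() - started
    return {
        "requests": total,
        "throughput_rps": total / elapsed,
        "mean_latency_ms": 1000 * sum(latencies) / total,
        "max_latency_ms": 1000 * max(latencies),
    }
```

Calling `run_load(fake_request, concurrent_users=n, requests_per_user=20)` for increasing values of `n` gives a simple load profile. A real test would replace `fake_request` with calls to a virtual machine, a service, or the whole application.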

The table below shows performance testing types commonly used in a cloud environment:

| Type of Performance Test | How it is Used in a Cloud Environment |
|---|---|
| Stress test | Checks the responsiveness and stability of cloud infrastructure under very high loads. |
| Load test | Verifies that the system performs well when used by a normal number of concurrent users. |
| Browser testing | Ensures compatibility and performance of user-facing systems across browsers and devices. |
| Latency testing | Measures the time it takes to receive a response to a request made to a cloud-based service or API. |
| Targeted infrastructure test | Isolates each component or layer of your application and tests if it can deliver the required performance. |
| Failover test | Checks the system’s ability to replace a failed component under high load conditions with minimal service disruption. |
| Capacity test | Defines a benchmark of the maximum traffic or load a cloud system can effectively handle at a given scale. |
| Soak test | Measures the performance of a cloud system under a given load over a long period of time, as a realistic test of a production environment. |

AWS Monitoring

Amazon Web Services (AWS) is the most popular public cloud platform. While there are hundreds of AWS services, some of the most common metrics monitored on AWS are:

  • EC2 instance CPU and disk utilization
  • Application logs
  • Load balancer logs
  • Virtual Private Cloud (VPC) flow logs

AWS provides several monitoring services, including:

  • AWS CloudTrail—tracks the activity of users and services, and API usage, across all AWS services. CloudTrail stores event logs of any action performed in an AWS user account, providing visibility into resource activity.
  • Amazon CloudWatch—provides dashboards and analytics capabilities including anomaly detection, incident response automation, support for troubleshooting, and operational data.

Learn more in the detailed guide to AWS monitoring.

Azure Monitoring

Microsoft Azure is another market-leading public cloud offering. Common performance metrics on Azure include:

  • Availability and response rate of Azure services
  • Network performance
  • Storage capacity and scalability on Azure storage services
  • Processing capacity on Azure VMs, including availability, utilization and responsiveness

Azure provides a number of first-party monitoring services:

  • Azure Monitor—collects performance metrics, diagnostic and activity logs from all Azure services, and provides dashboards to visualize the data.
  • Azure Advisor—scans cloud configurations and provides recommendations for improving availability, performance, security, and cost.
  • Service Health—checks for critical maintenance issues on customer applications and services.
  • Network Watcher—monitors network performance for Virtual Networks (VNet), virtual machines, and application gateways.
  • Resource Health—diagnoses and provides support for problems in Azure services.

Learn more in the detailed guide to Azure monitoring.

Google Cloud Monitoring

Google Cloud Platform is Google’s public cloud service. It offers Google Cloud Monitoring, a service that provides data about the performance, availability, and health of your cloud applications.

Google Cloud Monitoring is based on three foundations:

  • Metrics—the Google Cloud Console and Cloud Monitoring API give you access to over 1,500 cloud monitoring metrics (see the full metrics list). You can also create your own custom metrics.
  • Alerts—inform your team about problems with cloud applications. Alerts can be managed via Google Cloud Console, Cloud Monitoring API, and Cloud SDK.
  • Dashboards—you can create dashboards to visualize Google Cloud metrics using the Google Cloud Console, or programmatically via the dashboards API endpoint.

Serverless Performance

Serverless computing is a way to design and deliver software with no knowledge of the underlying infrastructure—computing resources are provided as an automated, scalable cloud service.

In a traditional data center, or when using an infrastructure as a service (IaaS) model in the cloud, the server’s computing resources are fixed, and paid for regardless of the amount of computing work the server performs. In serverless computing, billing is performed only when the client’s code is actually running.

Serverless computing does not eliminate servers, but its purpose is to remove computing resource considerations from the software design and development process.

However, the serverless model raises significant challenges with regard to performance testing and application debugging:

  • No access to the server or operating system, making it difficult to identify and diagnose performance issues
  • A highly dynamic environment, with short-lived serverless functions running for minutes or seconds
  • A highly distributed environment with complex dependencies
  • Concurrency and throttling are handled by the cloud provider, and can have a major impact on end users, while they are difficult to predict

The main impact of these challenges is low observability over what is running and how workloads are performing.

AWS Lambda Performance Tuning

AWS Lambda is the most popular serverless platform today, with 96% market share according to the BMC State of Serverless report. Therefore, understanding how to diagnose and tune performance issues in Lambda is top of mind for any serverless practitioner.

We’ll briefly cover four ways to improve performance in AWS Lambda.

  1. Memory and CPU utilization

AWS Lambda allocates memory to serverless functions, ranging from 128 MB to 10,240 MB in 1 MB increments. At the same time, CPU power is allocated to the function in direct proportion to the amount of memory. At 1,769 MB, a function has the equivalent of one full vCPU of processing capacity.

If your application is single-threaded, make sure never to use more than 1.8 GB of RAM, because when additional vCPUs are added, you will pay more but the app will not be able to make use of them. Conversely, if you use less than 1.8 GB of RAM, and the application is multi-threaded, you will not be able to make use of multithreading to improve performance.

  2. Balance between memory and execution time

AWS Lambda costs depend on memory allocation and execution time. Increasing memory (and thus CPU) raises the price per unit of time, but it can also speed up processing, so a higher memory setting does not always cost more. However, AWS caps the number of vCPU cores available per function (up to 6 at the time of writing), so beyond a certain point, increasing memory will not reduce execution time.

If your application is CPU intensive, increasing memory can significantly reduce execution time and cost per execution.
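
The trade-off can be checked with simple arithmetic, because the compute charge is proportional to memory multiplied by execution time. The sketch below assumes a price per GB-second that should be verified against current AWS pricing for your region and architecture, and it ignores the separate per-request charge. The measurements in the usage note are hypothetical.

```python
from typing import Dict

# Assumed compute price in USD per GB-second; verify against current AWS pricing.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb: int, duration_ms: float,
                    price_per_gb_second: float = PRICE_PER_GB_SECOND) -> float:
    """Compute charge for one invocation (the per-request fee is not included)."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second


def cheapest_setting(measured_duration_ms: Dict[int, float]) -> int:
    """Given measured durations per memory setting, return the cheapest setting."""
    return min(measured_duration_ms,
               key=lambda mb: invocation_cost(mb, measured_duration_ms[mb]))
```

With hypothetical measurements of `{512: 2000, 1024: 900, 2048: 700}` (memory in MB mapped to duration in ms), the 1,024 MB setting uses 0.9 GB-seconds per invocation compared to 1.0 at 512 MB and 1.4 at 2,048 MB, so `cheapest_setting` returns `1024`. In this example the larger setting is both faster and cheaper than 512 MB, while doubling memory again costs more than it saves.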

  3. Set a reserved concurrency limit

AWS Lambda handles scalability automatically. But there are limits to the number of concurrent requests it allows. When optimizing Lambda performance, you should consider limiting concurrent executions.

There are two ways to limit concurrency in AWS Lambda:

  • Account level—by default, Lambda accounts allow 1,000 concurrent functions to run
  • Function level—by default, any function can use the full account-level limit of 1,000 concurrent executions. However, you should limit each of them to a reasonable amount of your total available concurrency. Otherwise other functions will receive throttling errors.
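
A reasonable per-function limit can be estimated from traffic: the concurrency a function needs is roughly its average request rate multiplied by its average duration. The sketch below applies that estimate to a set of functions and checks the total against the account limit. The function names and workload figures are hypothetical.

```python
import math
from typing import Dict, Tuple


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Estimate concurrent executions as request rate multiplied by average duration."""
    return math.ceil(requests_per_second * avg_duration_s)


def plan_reserved_concurrency(account_limit: int,
                              workloads: Dict[str, Tuple[float, float]]) -> Dict[str, int]:
    """Map each function name to an estimated concurrency limit.

    workloads maps a function name to (requests per second, average
    duration in seconds). Raises ValueError if the estimates do not
    fit within the account-level limit.
    """
    plan = {name: required_concurrency(rps, duration)
            for name, (rps, duration) in workloads.items()}
    total = sum(plan.values())
    if total > account_limit:
        raise ValueError(f"estimated concurrency {total} exceeds account limit {account_limit}")
    return plan
```

For example, a function that receives 40 requests per second and runs for 0.25 seconds on average needs about 10 concurrent executions. Real traffic is bursty, so it is usually worth adding headroom on top of these averages.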

  4. Manage cold starts

A unique problem in serverless environments is cold starts—when a function is invoked, it takes time for it to power up and start serving user requests, and in the interim, users may experience latency.

AWS has introduced the concept of provisioned concurrency, which can help resolve this problem. It lets you provision a preset amount of concurrency in Lambda, ensuring that you have an appropriate number of function instances running, ready to serve user requests with no delay. You can change provisioned concurrency automatically using CloudWatch metrics, or by defining a schedule based on known application loads.

Learn more in our detailed guide to AWS Lambda performance

Serverless Monitoring

Serverless systems are a “black box”—engineers do not have knowledge about their inner workings. Serverless function code is only executed during the request, and it is not known where exactly it is executed on the hardware.

However, it is possible to monitor and resolve issues in serverless environments. It just requires looking at different metrics than you are used to in a traditional server-based environment.

What Should We Be Monitoring?

Push data

This is the traditional style of monitoring, where an agent running on a server collects data and saves it to a central server. In serverless, you don’t have access to the server, so monitoring should be built into the serverless function itself. This means that the developer has to include the monitoring library as part of their code, and initialize it while the process is running.

Push data monitoring in a serverless environment can help you collect data like function execution time, memory usage, user experience, function payloads, network performance and database performance.
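
As an illustration of building monitoring into the function itself, the sketch below wraps a handler in a decorator that measures execution time and emits one structured record per invocation. The `monitored` decorator, the record fields, and the example handler are assumptions for this example. A real monitoring library would typically send the record to a collector instead of printing it.

```python
import functools
import json
import time


def monitored(handler, emit=print):
    """Wrap a serverless handler so every invocation emits a JSON record.

    emit receives the record as a JSON string; it defaults to print,
    which writes to the function's log output.
    """
    @functools.wraps(handler)
    def wrapper(event, context):
        start = time.perf_counter()
        outcome = "ok"
        try:
            return handler(event, context)
        except Exception:
            outcome = "error"
            raise  # the platform still sees the original exception
        finally:
            emit(json.dumps({
                "function": handler.__name__,
                "outcome": outcome,
                "duration_ms": round(1000 * (time.perf_counter() - start), 3),
            }))
    return wrapper


@monitored
def handler(event, context):
    """Hypothetical function code; only the decorator is monitoring-specific."""
    return {"statusCode": 200, "body": event.get("name", "world")}
```

Because the record is emitted in a `finally` block, failed invocations are reported as well as successful ones.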

Pull data

Pull data monitoring is a new type of monitoring, in which services are built to report metrics on their own. Because services are temporary and not running statically, it can be effective to build telemetry functions into the service, and have it report essential metrics. This requires careful planning, because there can be multiple services running in diverse locations at different times.

4 Key Serverless Monitoring Metrics

Here are some of the key metrics you should measure for serverless functions:

  1. Calls—the number of times function code was executed, including successful executions and executions with functional errors.
  2. Errors—number of times a function did not execute successfully. Errors include exceptions thrown in your code, and exceptions thrown by the serverless platform, mainly related to timeouts and configuration errors.
  3. Throttle—number of functions that did not run due to throttling. Serverless platforms impose a limit on concurrency. If the maximum number of functions allowed is already running, additional invocations are not allowed until there is available capacity. This is different from an error because the function failed to run due to a resource limitation.
  4. Duration—the time it takes for function code to process the event. For the first event handled by the function instance, this includes the initialization time.
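
The four metrics above can be derived from a list of invocation records. The sketch below assumes a hypothetical `Invocation` record with a status and a duration, and treats throttled invocations separately because their code never ran.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Invocation:
    status: str                           # "ok", "error", or "throttled"
    duration_ms: Optional[float] = None   # None when the function never ran


def summarize(invocations: Iterable[Invocation]) -> Dict[str, float]:
    """Aggregate calls, errors, throttles, and average duration."""
    calls = errors = throttles = 0
    durations = []
    for invocation in invocations:
        if invocation.status == "throttled":
            throttles += 1        # rejected before the code executed
            continue
        calls += 1                # executed, whether it succeeded or failed
        if invocation.status == "error":
            errors += 1
        if invocation.duration_ms is not None:
            durations.append(invocation.duration_ms)
    average = sum(durations) / len(durations) if durations else 0.0
    return {"calls": calls, "errors": errors,
            "throttles": throttles, "avg_duration_ms": average}
```

In practice an average hides slow outliers such as cold starts, so percentiles are often tracked alongside it.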

Learn more in our detailed guide to serverless monitoring

Serverless Debugging

Serverless applications are highly fragmented. You will not always have the ability to run these components locally. Distributed architecture can cause problems in many areas of the stack. The ability to drill down into code is critical to quickly fixing defects.

Remote debugging is difficult in serverless applications, because you do not have access to the server and operating system. It can also incur costs, because you’ll need to spin up instances of a function in order to test them. Developers often don’t understand where the problem is and why it happened.

In order to effectively debug serverless applications, it is essential to have dedicated tools that can provide information about what is happening in the environment and where the problems lie. In particular, it is important to get access to stack traces of serverless functions that incurred errors, to be able to debug and resolve production issues. Serverless observability tools have been developed to address these challenges.

Learn more in our detailed guide to serverless debugging

Serverless Observability Tools

The following tools can help you monitor and debug serverless functions in the most popular serverless platform, AWS Lambda.

AWS CloudWatch

The primary source of information about how AWS applications work is AWS CloudWatch Logs. This also applies to Lambda functions. By default, Lambda functions send data to CloudWatch, and CloudWatch creates a LogGroup for each function. Each LogGroup has multiple log streams, which contain the logs generated for each Lambda function instance.

Log streams contain log events. Click on an event to see more information about the specific Lambda function that was called. This is the most basic way to debug issues in a Lambda function.
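
Among the log events for each invocation, Lambda writes a REPORT line that summarizes duration and memory use. The sketch below parses such a line with a regular expression. The field layout is an assumption based on the commonly seen format and may differ between runtimes, and the request ID in the usage note is a placeholder.

```python
import re
from typing import Dict, Optional, Union

# Assumed layout of a Lambda REPORT line; fields are separated by whitespace or tabs.
REPORT_PATTERN = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_mb>\d+) MB"
    r"(?:\s+Init Duration: (?P<init_ms>[\d.]+) ms)?"
)


def parse_report(line: str) -> Optional[Dict[str, Union[str, float]]]:
    """Return the fields of a REPORT line, or None if the line does not match.

    init_ms is only present when the line includes an Init Duration,
    which indicates a cold start.
    """
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    fields = {key: value for key, value in match.groupdict().items() if value is not None}
    return {key: (value if key == "request_id" else float(value))
            for key, value in fields.items()}
```

For a line such as `REPORT RequestId: example-id Duration: 102.25 ms Billed Duration: 103 ms Memory Size: 128 MB Max Memory Used: 50 MB`, the function returns the numeric fields as floats, which makes it easy to spot functions running close to their memory setting.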

AWS X-Ray

CloudWatch log events are useful, but limited in the information they provide about serverless issues. AWS built X-Ray to answer more complex questions like:

  • Which production issues are resulting in performance problems
  • Which specific integrations are resulting in application issues

X-Ray creates a mapping of application components, and enables end-to-end tracking of requests. It can be used to debug applications both during application development and in production.


Lumigo

The Lumigo platform is the leading monitoring and debugging platform for serverless and microservices applications. It deploys in minutes, with no code changes, and enables:

  • Removing performance bottlenecks—see which services run sequentially or in parallel, and automatically identify your worst latency offenders.
  • One-click distributed tracing—lets developers effortlessly find and fix issues in serverless and microservices environments.
  • Visual debugging—virtual stack trace of all services participating in the transaction. Everything is displayed in a visual map that can be searched and filtered.
  • Serverless-specific alerts—predictive analytics identifies and alerts on issues before they impact application performance or costs.


Kubernetes and Container Performance

Containers are a crucial part of the cloud native environment, and a foundation of most DevOps environments. They are a lightweight encapsulation of software and configuration, which makes it easy to deploy applications and IT systems in an automated and repeatable manner. Docker is the de-facto standard for container engines, and Kubernetes is the most popular orchestration tool used to manage large numbers of containers.

According to the 2020 CNCF Survey, 92% of cloud native users run containers in production, and 83% of them use Kubernetes in production. 23% of organizations have over 5,000 containers, and 12% say they have over 50 Kubernetes clusters running in production—indicating growing enterprise use of containers. In production and large enterprise deployments, performance considerations become critical.

Container Performance Considerations

One of the important things to understand is that containers are not virtual machines. A virtual machine runs as a software representation of a computer that is independent of the physical host, but containers depend on the host’s operating system, kernel and file system.

This means that, for example, a virtual machine can be hosted on a computer running Windows, while its workloads run Linux, or vice versa. On the other hand, containers running on a Linux host must run Linux, because they share the resources of the underlying operating system kernel.

A container running on a host appears to be completely isolated from other containers. But internally, all containers running on a particular host use that host’s kernel and file system. This shared usage of host resources has a profound effect on performance optimization—which only gets more complicated when you run orchestrators like Kubernetes.

Learn more in our detailed guide to containerized architecture

Key Kubernetes Performance Metrics

Broadly speaking, a Kubernetes cluster has two types of nodes:

  • Master nodes—running infrastructure components such as cluster configuration and management
  • Worker nodes—running the containers that make up your workloads

Containers are organized into pods, which are deployed on physical or virtual hosts called nodes. For both types of nodes, the following three metrics are crucial for evaluating performance as part of a Kubernetes deployment.

1. Memory Utilization

Monitoring memory usage at the pod and node level can provide valuable insight into cluster performance and the ability to successfully run workloads. Containers whose memory usage exceeds their configured limit are terminated. Also, if a node is running low on available memory, the kubelet flags the node as being under memory pressure and starts evicting pods to reclaim resources.

2. Disk Utilization

Like memory, disk space is a critical resource for each container. So if kubelet detects that the root volume is running out of disk space, pod scheduling issues can occur. On specific nodes, when available disk space goes below a certain threshold, the node is flagged as having “disk pressure”, and kubelet may also reclaim node level resources.

You also need to track the usage level of storage volumes used by Kubernetes pods. Storage volumes provide persistent storage, which survives even after a specific container or pod shuts down. This allows you to predict problems at the application or service level.

3. CPU Utilization

Tracking the amount of CPU used by pods and nodes, compared to the configured requests and limits, can provide valuable insights into cluster performance. If there are insufficient CPU resources available at the node level, the node will throttle the CPU resources available to each pod, which can cause performance issues.
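
Comparing usage with configured limits requires converting Kubernetes quantity strings, such as `500m` for CPU or `128Mi` for memory, into numbers. The sketch below handles only a small subset of the quantity syntax (millicores and the Ki, Mi, and Gi suffixes). The function names and the 90% warning threshold are assumptions for this example.

```python
from typing import List

# Binary suffixes handled by this sketch; Kubernetes accepts more than these.
_MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity to cores, for example '500m' to 0.5."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity to bytes, for example '128Mi' to 134217728."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


def limit_warnings(cpu_used: str, cpu_limit: str, mem_used: str, mem_limit: str,
                   threshold: float = 0.9) -> List[str]:
    """Return warnings for resources at or above threshold of their limit."""
    warnings = []
    cpu_ratio = parse_cpu(cpu_used) / parse_cpu(cpu_limit)
    mem_ratio = parse_memory(mem_used) / parse_memory(mem_limit)
    if cpu_ratio >= threshold:
        warnings.append(f"CPU at {cpu_ratio:.0%} of limit (throttling risk)")
    if mem_ratio >= threshold:
        warnings.append(f"memory at {mem_ratio:.0%} of limit (termination risk)")
    return warnings
```

The usage values would come from a metrics source such as cAdvisor or Prometheus, and the limits from the pod specification.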

Kubernetes Performance Best Practices

Here are a few best practices you can use to improve performance of applications running on Kubernetes:

  • Deploy clusters close to customers—the geographic location of Kubernetes nodes has a major effect on latency experienced by clients. Leverage cloud resources to move clusters physically near to end users.
  • Carefully select persistent storage resources—on storage devices defined as Kubernetes storage volumes, prefer SSD drives, and if possible, use NVMe for heavy workloads. When using cloud services, opt for premium service tiers that provide better performance and higher IOPS.
  • Optimize your images—Kubernetes runs large quantities of the same images. It is critical to optimize images to ensure they are lightweight and run quickly and effectively. Any redundant code or software in an image can be multiplied thousands of times in a Kubernetes environment.
  • Optimize the etcd cluster—etcd is a critical component that stores configuration for Kubernetes clusters. Monitor etcd clusters carefully to ensure they are healthy and have sufficient resources. etcd should be colocated, or at least connected with a fast network connection, to the kube-apiserver that serves requests from pods.

Learn more in the detailed guide to Kubernetes in production

Kubernetes Monitoring

Here are a few best practices and considerations for monitoring Kubernetes deployments. You should establish careful monitoring at both the cluster and pod level.

Kubernetes Cluster Monitoring

The purpose of cluster monitoring is to monitor the health of the entire Kubernetes cluster. As an administrator, you need to know if all nodes in the cluster are functioning normally, the workload capacity they are running, the number of applications running on each node, and the resource utilization of the entire cluster.

  • Node resource utilization—important metrics include network bandwidth, disk usage, CPU and memory usage. You can use these metrics to determine whether to increase or decrease the number and size of nodes in the cluster.
  • Number of nodes—the current number of nodes available is an important metric to follow. This will give you an idea of cloud costs of the Kubernetes deployment, and what scale of workloads you can leverage the cluster for.
  • Running pods—the number of running pods indicates whether there are enough nodes available and whether they can handle the entire workload in case of node failures.
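
One way to act on these metrics is to check whether the cluster would still have enough capacity if its largest node failed. The sketch below is a deliberately simplified model that compares total requested CPU with the capacity remaining after one node is lost. It ignores memory, pod placement constraints, and fragmentation across nodes, and the node names in the test data are hypothetical.

```python
from typing import Dict


def survives_node_failure(node_cpu_capacity: Dict[str, float],
                          total_requested_cpu: float) -> bool:
    """Return True if requested CPU still fits after losing the largest node.

    node_cpu_capacity maps a node name to its allocatable CPU cores.
    A cluster with fewer than two nodes cannot tolerate a node failure.
    """
    if len(node_cpu_capacity) < 2:
        return False
    capacities = list(node_cpu_capacity.values())
    remaining = sum(capacities) - max(capacities)
    return total_requested_cpu <= remaining
```

With three nodes of 4 cores each, workloads requesting 7 cores in total would still fit after one node fails, while workloads requesting 9 cores would not.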

Kubernetes Pod Monitoring

You can use Kubernetes metrics to monitor how specific pods and their workloads are behaving. Pay special attention to:

  • The number of instances of each type of pod—if the number is lower than expected, the cluster may have run out of resources
  • Deployment progress—transition of instances from old to new
  • Health checks
  • Network data reported through pod network services

Container metrics are primarily provided by the cAdvisor utility which comes with Kubernetes. For more extensive monitoring capabilities, a common choice is Prometheus, an open source monitoring tool built for cloud native environments.


Application Performance

Application performance is a broad term that refers to the efficiency and speed of any software application. It’s about how fast the application responds to user requests, how smoothly it runs, and how well it accomplishes its primary task. This is a critical aspect of software development because it directly affects the user experience. If an application is slow or frequently crashes, users will quickly abandon it in favor of a better-performing alternative.

Application performance is not just about speed. It’s a holistic measure of an application’s efficiency, reliability, and overall user satisfaction. It’s a fundamental quality that every developer should strive to achieve in their software.

Learn more in the detailed guide to application performance monitoring

Performance Tuning in Popular Programming Languages

Every programming language has its unique performance issues and optimization techniques. We’ll briefly review performance tuning in a few popular languages: Java, Golang, Python, JavaScript, and PHP.

Java Performance

Java is a versatile and widely-used programming language known for its “write once, run anywhere” capability. However, it’s often criticized for its slower performance compared to languages like C or C++. Here are a few ways to improve Java application performance.

Firstly, always use the latest version of Java. Each new version brings performance improvements and optimizations that can significantly speed up your application. Secondly, use appropriate data structures. The right data structure for the right job can greatly enhance performance. For example, using an ArrayList instead of a LinkedList when frequent random access is needed can improve speed.

Additionally, avoid creating unnecessary objects. Object creation and garbage collection can be costly in terms of performance. So, reuse objects when possible and nullify them when they’re no longer needed.

Learn more in the detailed guide to Java performance

Golang Performance

Golang, also known as Go, is lauded for its simplicity and efficiency. It’s built with performance in mind, but there are still ways to optimize your Golang applications.

One way is by understanding and effectively using Goroutines. Goroutines are lightweight threads managed by the Go runtime. They’re cheap to create and can significantly boost performance when used for concurrent tasks.

Another tip is to use built-in functions and packages whenever possible. Go has a rich standard library full of optimized packages, so take full advantage of them. Lastly, minimize garbage collection by reducing heap allocations. This can be achieved by reusing objects and avoiding unnecessary pointer usage.

Learn more in the detailed guide to Golang performance

Python Performance

Python is known for its readability and ease of use, not for its speed. However, with the right practices, Python performance can be significantly improved.

Using built-in functions and libraries is one of the key ways to boost Python performance. In CPython, many of these are implemented in C, making them much faster than equivalent pure-Python code. Also, consider using list comprehensions instead of traditional loops for better speed.
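
The difference is easy to measure with the standard `timeit` module. The sketch below compares an explicit loop with a list comprehension, and a manual sum with the built-in `sum`. The function names are chosen for this example, and the actual timings depend on the Python version and hardware, so none are shown here.

```python
import timeit
from typing import Dict, Iterable, List


def squares_loop(n: int) -> List[int]:
    """Build a list of squares with an explicit loop and append."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n: int) -> List[int]:
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


def total_loop(values: Iterable[int]) -> int:
    """Sum values with an explicit loop instead of the built-in sum()."""
    total = 0
    for value in values:
        total += value
    return total


def compare(n: int = 10_000, repeats: int = 50) -> Dict[str, float]:
    """Time each variant; returns elapsed seconds for the given number of repeats."""
    data = list(range(n))
    return {
        "squares_loop": timeit.timeit(lambda: squares_loop(n), number=repeats),
        "squares_comprehension": timeit.timeit(lambda: squares_comprehension(n), number=repeats),
        "total_loop": timeit.timeit(lambda: total_loop(data), number=repeats),
        "builtin_sum": timeit.timeit(lambda: sum(data), number=repeats),
    }
```

Running `compare()` on your own interpreter shows how large the gap is for your workload, which is more reliable than assuming a fixed speedup.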

Furthermore, consider running your code on PyPy, an alternative Python implementation with a JIT (Just-In-Time) compiler. A JIT compiler can significantly speed up long-running Python code by compiling frequently executed parts into machine code at runtime.

Learn more in the detailed guide to optimizing Python

JavaScript Performance

JavaScript is the backbone of modern web development. It’s used in both front-end and back-end development, making its performance crucial. Here are some tips to optimize JavaScript performance.

Firstly, use the ‘use strict’ directive. This enables strict mode, which can catch common coding mistakes and “unsafe” actions, thus preventing potential performance hits.

Secondly, minimize DOM manipulations. DOM operations are expensive, so try to batch your changes and update the DOM as few times as possible. Also, avoid using global variables as they can slow down lookups.

PHP Performance

PHP is a popular server-side scripting language. It’s simple to use but needs careful handling to optimize performance. Here are some PHP performance tips.

Firstly, use the latest PHP version. Each new version comes with performance improvements and new features that can boost speed. Secondly, use native PHP functions whenever possible. These functions are faster and more efficient than custom code.

Also, consider using a PHP accelerator like OPcache. PHP accelerators can significantly improve performance by caching precompiled script bytecode, thereby reducing the need for PHP to load and parse scripts with each request.

Web and Mobile Performance Testing

Importance of Web Application Performance Testing

Web application developers today can automatically push their code through build, test, and deploy, but are not always sure how the code will perform in production. Web application performance testing is the solution to this problem, and should be an important part of your testing strategy.

A poorly performing website or web application will do worse on SEO and subsequently get less traffic than competitors. It is also likely to have lower engagement, conversion, and revenue. Effective website and web application performance testing can improve all these metrics by ensuring the development team pays consistent attention to performance.


Website Performance Testing Tools

Here are a few tools web application developers can use to test and improve performance on an ongoing basis.

PageSpeed Insights

PageSpeed Insights by Google checks several on-page and back-end factors of a web page, and reports on their effect on page load time. It provides a performance score for desktop and mobile, shows which elements have the biggest impact on page load time, and provides suggestions for improvement.


GTmetrix

GTmetrix, a free tool which is based on the Google Lighthouse performance benchmark, provides five different reports showcasing website performance:

  • PageSpeed
  • YSlow
  • Waterfall breakdown
  • Video showing page load
  • History of page performance

GTmetrix helps visualize page performance and lets you set up alerts that notify you of performance issues. It also provides extensive support for testing performance on mobile devices.


Pingdom

Pingdom is a commercial offering that monitors website uptime and page speed. It also provides real user monitoring (RUM), which shows how actual visitors experience your web pages, and synthetic session monitoring. Pingdom has a global infrastructure with testing servers in 100 countries. It sends notifications by email and SMS, and integrates with collaboration and alerting tools like Slack and PagerDuty.


WebPageTest

WebPageTest is a free tool that lets you test your web pages from 40 locations, using 25 browsers. It performs in-depth performance tests covering topics like:

  • Bandwidth usage
  • Compression and caching
  • Use of CDN
  • DNS lookups
  • User experience optimization

Cloudinary Website Speed Test

Website Speed Test analyzes a website’s images, which are responsible for a large percentage of load time on most web pages. It inspects image format, fit, compression, and quality options, and provides suggestions for optimizing images, which can have a dramatic impact on page load time. The image analysis tool is integrated with WebPageTest.

Datadog RUM and APM

Datadog Real User Monitoring (RUM) is a commercial tool that offers a view into the real-time activity and experience of web and mobile application users. It offers the following key capabilities:

  • Performance tracking: Tracking metrics like web page load times, user actions, network requests, and the execution of front-end code.
  • Error management: Monitoring bugs and errors over time across different versions, helping teams to prioritize and resolve them efficiently.
  • Dashboards for data analysis: Automatically collecting and presenting data on user journeys, application performance, network requests, and errors.

Learn more in the detailed guide to real user monitoring (RUM)

Datadog’s Application Performance Monitoring (APM) provides visibility into an application’s operations, from front-end user interactions down to backend services and database queries. It provides the following capabilities:

  • Tracing and telemetry correlation: Performs distributed tracing at the code level, capturing data from browser and mobile applications to backend services and databases.
  • Problem identification and resolution: Helps identify the source of issues, allowing teams to identify root causes, with visibility into execution time and resource consumption of individual methods and code lines.
  • Resource monitoring: Monitors resource utilization and supports optimization of resources, which in turn, improves application performance.
  • Security monitoring: Detects code vulnerabilities and potential threats in real time, assisting teams in preemptively addressing them. 

Learn more in the detailed guide to Datadog APM

Website Performance Best Practices: The Basics

Here are several simple best practices that can improve performance for your web pages.

Enable Compression

Use Gzip to compress CSS, HTML, and JavaScript files that exceed 150 bytes. However, don’t use Gzip for image files. Instead, use image formats that have compression built in, and set the appropriate compression ratios to reduce file size while retaining quality.
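The rule above can be sketched in a few lines using Python’s standard gzip module. The helper name maybe_compress and the list of compressible content types are assumptions made for this illustration. In practice, compression is usually enabled in the web server or CDN configuration rather than in application code.

```python
import gzip
from typing import Tuple

MIN_SIZE_BYTES = 150  # threshold taken from the guideline above

# Assumed set of text-based types worth compressing; images are deliberately absent.
COMPRESSIBLE_TYPES = {"text/css", "text/html", "application/javascript"}


def maybe_compress(body: bytes, content_type: str) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed) for a response body."""
    if content_type not in COMPRESSIBLE_TYPES or len(body) <= MIN_SIZE_BYTES:
        return body, False
    compressed = gzip.compress(body)
    # Keep the original if compression did not actually make it smaller.
    if len(compressed) >= len(body):
        return body, False
    return compressed, True


if __name__ == "__main__":
    css = b"body { margin: 0; padding: 0; }\n" * 50
    payload, was_compressed = maybe_compress(css, "text/css")
    print(f"{len(css)} bytes -> {len(payload)} bytes (compressed: {was_compressed})")
```

When a compressed payload is sent, the response also needs a Content-Encoding: gzip header so the browser knows to decompress it.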

Minify CSS, JavaScript, and HTML

Removing unnecessary characters such as whitespace, comments, and formatting, along with unused code, can significantly speed up your page. To shrink text files even further, use minification and uglification tools such as cssnano and UglifyJS.
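To make the idea concrete, here is a deliberately naive CSS minifier that strips comments and redundant whitespace. It is a toy written for this article, not how cssnano or UglifyJS work internally, and it can break stylesheets that contain strings or selectors where spaces are significant. Use a real minifier in production.

```python
import re


def minify_css(css: str) -> str:
    """Naive CSS minifier: strips comments and redundant whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)          # trim around punctuation
    css = css.replace(";}", "}")                           # drop last semicolon in a block
    return css.strip()


if __name__ == "__main__":
    source = """
    /* Main layout */
    body {
        margin: 0;
        padding: 0;
    }

    h1, h2 {
        color: #333;
    }
    """
    minified = minify_css(source)
    print(minified)
    print(f"{len(source)} characters -> {len(minified)} characters")
```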

Reduce Redirects

Whenever a page is redirected to another page, the visitor faces a delay waiting for the HTTP request/response cycle to complete. It is not uncommon to see web pages redirected over three times, and each redirect adds a delay for the user. Redirect loops and errors can also result in a broken user experience.
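One way to audit redirects is to follow each chain and count the hops. The sketch below works on a plain dictionary of source-to-target URLs, which is an assumption for illustration. In a real audit, that mapping might come from a site crawl or your server’s redirect rules. The function and exception names are hypothetical.

```python
from typing import Dict, List


class RedirectLoopError(Exception):
    """Raised when a redirect chain revisits a URL it has already passed."""


def redirect_chain(start: str, redirects: Dict[str, str]) -> List[str]:
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise RedirectLoopError(" -> ".join(chain + [current]))
        chain.append(current)
        seen.add(current)
    return chain


if __name__ == "__main__":
    # Hypothetical redirect rules for an example site.
    rules = {
        "http://example.com": "https://example.com",
        "https://example.com": "https://www.example.com",
        "https://www.example.com": "https://www.example.com/home",
    }
    chain = redirect_chain("http://example.com", rules)
    print(f"{len(chain) - 1} redirects: " + " -> ".join(chain))
```

Pointing links and redirect rules straight at the last URL in the chain collapses multiple hops into one.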

Improve Server Response Time

Server response time is influenced by the amount of traffic received, the resources used by each page, software running on the server, and the hosting solution used. To speed up server response time, find and fix performance bottlenecks such as delayed database queries, slow routing, and insufficient memory or CPU resources on the server. Aim for a server response time of under 200ms.
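A simple way to track the 200ms target is to collect response time samples and check a high percentile against the budget. Using the 95th percentile is an assumption made here, since the guideline above only states the target. The sample values and function names are invented for the example.

```python
import math
from typing import Sequence

RESPONSE_BUDGET_MS = 200.0  # target taken from the guideline above


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def within_budget(samples: Sequence[float], pct: float = 95.0,
                  budget_ms: float = RESPONSE_BUDGET_MS) -> bool:
    """True if the chosen percentile of response times is under the budget."""
    return percentile(samples, pct) < budget_ms


if __name__ == "__main__":
    # Hypothetical response times in milliseconds.
    measured = [80, 95, 110, 120, 130, 140, 150, 160, 450, 900]
    print("p95:", percentile(measured, 95), "ms")
    print("within budget:", within_budget(measured))
```

A percentile is used rather than the average because a few very slow requests can hide behind a healthy-looking mean.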

Optimize Images

Resize images to the required size before using them on a web page, and provide several versions of images for responsive designs. Ensure image files are compressed for the web. Use CSS sprites to combine images like buttons and icons into one large image—the file then loads immediately with fewer HTTP requests, and the web page shows only the relevant portions. This can significantly reduce page load time.

Learn more in the detailed guide to image optimization

Advanced Website Performance Optimization

Modern Image and Video Formats

Use next-generation image and video formats, which can provide much better compression ratios with higher quality:

  • WebP: created by Google, preserves image quality with a much higher compression rate than traditional formats like JPG. Supports both lossy and lossless compression.
  • JPEG 2000: an updated version of JPEG with enhanced lossless compression. Also supports video.
  • AVIF: an image format derived from the AV1 video format, providing lossy and lossless compression that can produce files up to 10X smaller than comparable JPEG files.
  • JPEG XL: a newer JPEG format based on Google’s Pik format and Cloudinary’s Free Universal Image Format (FUIF). Compresses image files to as little as a third of the size of traditional JPEG.

Learn more in the detailed guide to next-generation image formats

Image and Video Optimization

You can optimize website images and videos by selecting the most appropriate format and compression, and delivering media files efficiently using content delivery networks (CDN).

There are many CMS plugins and tools available that can help automate image and video optimization. These plugins let you automatically convert images to the most appropriate format, and apply the best quality parameters to reduce file size while retaining quality.

Learn more in the detailed guide to video optimization

Lazy-Load Images

Lazy loading is a common and effective technique that loads images only when the website visitor needs to see them.

Typically, the technique detects the user’s viewport, and loads images as the user scrolls down the page and sees them. This can significantly reduce the amount of data loaded to the user’s browser when they initially visit a page, and conserve bandwidth because images that are never viewed by the user do not need to be downloaded.
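The viewport check described above can be modeled in a few lines. In a browser this logic normally runs in JavaScript (for example with the IntersectionObserver API) or is delegated to the browser with the loading="lazy" attribute on img elements. The Python below only models the decision of which images to fetch, and all names and pixel values in it are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageImage:
    src: str
    top: int       # distance from the top of the page, in pixels
    height: int    # rendered height, in pixels
    loaded: bool = False


def images_to_load(images: List[PageImage], scroll_y: int,
                   viewport_height: int, margin: int = 0) -> List[PageImage]:
    """Return not-yet-loaded images that overlap the visible area."""
    visible_top = scroll_y - margin
    visible_bottom = scroll_y + viewport_height + margin
    return [
        img for img in images
        if not img.loaded
        and img.top < visible_bottom
        and img.top + img.height > visible_top
    ]


def on_scroll(images: List[PageImage], scroll_y: int,
              viewport_height: int, margin: int = 0) -> List[str]:
    """Mark newly visible images as loaded and return their sources."""
    newly_visible = images_to_load(images, scroll_y, viewport_height, margin)
    for img in newly_visible:
        img.loaded = True
    return [img.src for img in newly_visible]


if __name__ == "__main__":
    page = [
        PageImage("hero.jpg", top=0, height=400),
        PageImage("mid.jpg", top=1200, height=400),
        PageImage("footer.jpg", top=3000, height=400),
    ]
    print(on_scroll(page, scroll_y=0, viewport_height=800))
    print(on_scroll(page, scroll_y=600, viewport_height=800))
```

A positive margin starts loading images slightly before they scroll into view, which helps avoid visible pop-in.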

Learn more in the detailed guide to lazy loading

See Additional Guides on Key Performance Testing Topics

Lumigo, together with several partner websites, has authored a large repository of content that can help you learn about many aspects of performance testing for cloud native and web applications. Check out the articles below for objective, concise reviews of key performance testing topics.

Serverless Monitoring

Authored by Lumigo

Learn how to monitor serverless applications in production, making them observable and easy to maintain and troubleshoot.

See top articles in our serverless monitoring guide:

  • How to Monitor Lambda with CloudWatch Metrics
  • Lambda Logs: a Complete Guide
  • Using CloudWatch Logs for Lambda Monitoring

Serverless Debugging

Authored by Lumigo

Learn how to debug serverless applications. Understand the differences between debugging for monolithic and microservices apps, and understand serverless testing.

See top articles in our serverless debugging guide:

  • Serverless Testing: Adapt or Cry
  • AWS Lambda Unit Testing
  • AWS Serverless Timeouts: What are They and How to Fix and Prevent Them

AWS Lambda Performance

Authored by Lumigo

Learn how to optimize AWS Lambda performance, overcoming challenges like short-running functions, timeouts, and cold starts.

See top articles in our AWS Lambda performance guide:

  • AWS Lambda Timeout Best Practices
  • How to Improve AWS Lambda Cold Start Performance
  • Understanding AWS Lambda Concurrency

Kubernetes Monitoring

Authored by Lumigo

AWS Monitoring

Authored by NetApp

Learn to monitor workloads on AWS using first-party and third-party tools, and discover best practices for performance and cost optimization.

See top articles in the AWS monitoring guide:

  • Cloudwatch Log Insights: Ultimate Quick Start Guide
  • Monitoring the Costs of Underutilized EBS Volumes

Image Optimization

Authored by Cloudinary

Learn how to optimize images and use compression, quality settings, CDN and other techniques to dramatically reduce page load time.

See top articles in the image optimization guide:

  • PHP Image Compression, Resize & Optimization
  • Three Popular and Efficient Ways to Loading Images
  • Python Image Optimization and Transformation

Image Formats

Authored by Cloudinary

Discover next-generation image formats that can help you improve web performance. New formats like WebP and JPEG-XR deliver high quality with improved compression.

See top articles in our image formats guide: 

  • The Great JPEG 2000 Debate: Analyzing the Pros and Cons to Widespread Adoption
  • Adopting the WebP Image Format for Android on Websites Or Native Apps
  • Optimizing Animated GIFs With Lossy Compression

Optimizing Python

Authored by Granulate 

Video Optimization

Authored by Cloudinary

Learn how to optimize video content for higher performance and improved user experience, by using the latest compression and streaming technology, and automatically adjusting videos to user requirements.

See top articles in our video optimization guide:

  • Tips for Retaining Audience Through Engaging Videos
  • Product Videos 101: What Makes Them Great?
  • Automated Generation of Intelligent Video Previews on Cloudinary’s Dynamic Video Platform

Website Performance

Authored by Cloudinary

Lambda Performance

Authored by Lumigo

Application Performance Monitoring

Authored by Granulate

Java Performance

Authored by Granulate

Golang Performance

Authored by Granulate

Azure Monitoring

Authored by NetApp


Authored by EMQX

Prometheus Monitoring

Authored by Tigera

Datadog APM

Authored by Coralogix

Real User Monitoring

Authored by Coralogix

Additional Performance Testing Resources

Below are additional articles that can help you learn about performance testing topics.
