Performance Testing in a Cloud Native World

What is Performance Testing?

Performance testing is the process of evaluating computer systems to see how quickly they respond to user requests. Performance testing can uncover issues that negatively impact the user experience, and provide insights on how to fix them.

Traditionally, performance testing focused on client-server systems deployed on-premises. Its goal was to build servers that could withstand peaks in application load and still deliver satisfactory performance.

In today’s cloud native world, performance testing has many new meanings—organizations are testing the performance of cloud computing systems, serverless applications, containerized architectures, and web applications.

Types of Performance Testing

While the computing world has changed dramatically over the past two decades, the same types of performance tests are still run today in cloud native environments.

Baseline Test

A baseline test is a performance test that runs a system under normal expected load.

This test provides a benchmark to help you identify performance issues under irregular conditions. Also, if you run the test when a service is new, you can get rid of any obvious errors and start assessing real performance.

Load Test

The traditional load test evaluates a system to see how it performs with higher than normal load.

Stress Test

A stress test pushes your system to its limits, checking when it crashes, and whether it does so gracefully.

During testing, load on the system is gradually increased until it reaches the point of failure. This is sometimes referred to as a “stepped” test because the increase in load is done gradually.

Spike Test

A spike test evaluates a system with normal load, followed by a sudden jump to peak-level traffic (for example, jumping from 1,000 to 10,000 concurrent users). Many systems can crash due to sudden spikes in load. This test shows how your system reacts to spikes in traffic or transaction volume.

Soak Test

A soak test (or durability test) is performed under conditions that can have a cumulative effect on system performance.

The test reveals performance problems that can occur due to long-term stress on the system, such as memory leaks, resource leaks, data corruption and other factors that can degrade performance over time.

A common rule of thumb is to run soak tests at around 80% of maximum capacity.

Volume Test

A volume test is similar to a stress test or load test, but instead of testing application loads, it tests the amount of data being processed, which can also have an impact on application performance.

The test can have two variants: either data is incrementally added to the databases, or a large amount of data is populated before the test, and then the system is evaluated.

In general, normal use of an application constantly increases database size, so you should perform volume testing on a regular basis.
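The load shapes behind these test types can be expressed as simple per-minute user counts. The following is a minimal sketch, not a load testing tool: the function names and the one-minute granularity are assumptions made for illustration, and the 80% soak level follows the figure mentioned above.

```python
from typing import List


def baseline_profile(normal_users: int, minutes: int) -> List[int]:
    """Constant load at the normal expected level."""
    return [normal_users] * minutes


def stepped_stress_profile(start_users: int, step: int, steps: int,
                           minutes_per_step: int) -> List[int]:
    """Load that rises in equal steps, as in a "stepped" stress test."""
    profile: List[int] = []
    for i in range(steps):
        profile.extend([start_users + i * step] * minutes_per_step)
    return profile


def spike_profile(normal_users: int, peak_users: int, minutes_before: int,
                  spike_minutes: int, minutes_after: int) -> List[int]:
    """Normal load, a sudden jump to peak load, then back to normal."""
    return ([normal_users] * minutes_before
            + [peak_users] * spike_minutes
            + [normal_users] * minutes_after)


def soak_profile(max_capacity_users: int, minutes: int,
                 fraction: float = 0.8) -> List[int]:
    """Sustained load at a fraction of maximum capacity (assumed 80% default)."""
    return [round(max_capacity_users * fraction)] * minutes
```

A profile such as `spike_profile(1000, 10000, 10, 2, 10)` could then be fed to whichever load generator you use, one entry per minute.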

Performance Testing Process for a Cloud Native Environment

Performance testing is an old discipline, and many of the old principles are still relevant today. Here is a performance testing process that can be used to test modern cloud native environments, such as cloud systems, serverless, and containerized applications.

  1. Create similar environments for dev, test, and production—in line with DevOps principles, the system under test should be as similar as possible, whether it runs on a developer’s laptop, in a testing environment or in production. This will make performance testing easier and more consistent.
  2. Identify test environments and tools—determine which tools are available for testing, preferring tools that can be used across development, testing, and production environments. Prefer tools that can run in a cloud native environment and support your workloads—and at the same time, are not limited to monitoring one cloud environment.
  3. Define key performance indicators (KPIs) and performance goals—determine what are the key metrics for your application’s success. These may include page load time, latency, or throughput. Consult with stakeholders in your organization to understand what levels of these KPIs will make users happy, and which levels are unacceptable.
  4. Build automated tests—create tests that simulate real activity and test your application, both under normal and unexpected conditions. Tests should be lightweight, repeatable, and able to run as part of CI/CD workflows. Basic performance “smoke tests” can run as part of every build, while more extensive tests can run during acceptance testing phases.
  5. Prepare the test environment—in a DevOps pipeline, test environments are created automatically. Collaborate with DevOps engineers to ensure that the test environment is suitable for performance testing, and has the necessary tools to run realistic tests. Understand the cost impact of performance testing, which may require scaling up systems to accommodate higher loads.
  6. Run performance tests during every iteration—make sure performance testing is an integral part of every build and every release of your system.
  7. Monitor results and tune the application—analyze test results in every iteration, identify bottlenecks and suggest changes that can improve performance. These can be part of the plan for the next development iteration.
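Steps 3 and 4 can be sketched as a small, repeatable KPI check that a CI/CD job might run as a performance smoke test. This is an illustration under assumptions: the KPI names, the nearest-rank percentile method, and the thresholds are hypothetical choices, not values prescribed by this article.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_kpis(latencies_ms: List[float], duration_s: float,
                  max_p95_ms: float, min_throughput_rps: float) -> Dict[str, object]:
    """Compare measured latency and throughput against agreed KPI goals."""
    p95 = percentile(latencies_ms, 95)
    throughput = len(latencies_ms) / duration_s
    return {
        "p95_ms": p95,
        "throughput_rps": throughput,
        # The build passes only if both KPI goals are met.
        "passed": p95 <= max_p95_ms and throughput >= min_throughput_rps,
    }
```

A pipeline step could fail the build whenever `evaluate_kpis(...)["passed"]` is false, which keeps the check lightweight enough to run on every iteration.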

Performance Testing in the New IT Environment: Cloud, Serverless, Containers and Apps

In the rest of this article, we’ll explain how to monitor and improve performance in modern computing environments:

  • Cloud performance testing: improving performance in public and private clouds, including AWS, Azure and Google Cloud
  • Serverless performance testing: understanding serverless monitoring challenges, tuning AWS Lambda functions, and debugging serverless applications
  • Kubernetes and container performance testing: key performance metrics for containerized applications, and how to monitor Kubernetes clusters
  • Web and mobile performance testing: how to ensure that modern web and mobile applications provide a great experience for users

Cloud Performance Testing

Cloud computing is changing the way end users deploy, monitor, and use applications. The cloud provides a virtually unlimited pool of computing, storage, and network resources, so you can scale applications as needed.

However, even though cloud applications are able to dynamically scale and adapt to load, it is more important than ever to measure their performance. Performance issues can be complex to detect and resolve in a cloud environment, and can have a major impact on users.

In the cloud, performance testing focuses on changing the number of concurrent users accessing the application, trying different load profiles, and measuring different performance indicators, such as throughput and latency. Testing can be done at the virtual machine level, at a service level, and at an entire application level.

The table below shows performance testing types commonly used in a cloud environment:

Type of Performance Test | How it is Used in a Cloud Environment
Stress test | Checks the responsiveness and stability of cloud infrastructure under very high loads.
Load test | Verifies that the system performs well when used by a normal number of concurrent users.
Browser testing | Ensures compatibility and performance of user-facing systems across browsers and devices.
Latency testing | Measures the time it takes to receive a response to a request made to a cloud-based service or API.
Targeted infrastructure test | Isolates each component or layer of your application and tests if it can deliver the required performance.
Failover test | Checks the system’s ability to replace a failed component under high load conditions with minimal service disruption.
Capacity test | Defines a benchmark of the maximum traffic or load a cloud system can effectively handle at a given scale.
Soak test | Measures the performance of a cloud system under a given load over a long period of time, as a realistic test of a production environment.
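Varying the number of concurrent users while measuring latency and throughput can be sketched with a thread pool. This is a simplified illustration, not a production load generator: `fake_request` is a hypothetical stand-in for a real call to the system under test, and worker threads only approximate independent users.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def timed_call(request_fn: Callable[[], None]) -> float:
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def run_load(request_fn: Callable[[], None], concurrent_users: int,
             requests_per_user: int) -> Dict[str, float]:
    """Issue requests from a pool of worker threads and summarize the results."""
    total = concurrent_users * requests_per_user
    wall_start = time.perf_counter()
    # One worker thread per simulated user; requests are shared across workers.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(lambda _: timed_call(request_fn), range(total)))
    wall_time = time.perf_counter() - wall_start
    return {
        "requests": total,
        "mean_latency_s": sum(latencies) / total,
        "max_latency_s": max(latencies),
        "throughput_rps": total / wall_time,
    }


def fake_request() -> None:
    """Hypothetical stand-in for an HTTP call to the system under test."""
    time.sleep(0.001)
```

Running `run_load(fake_request, 10, 20)` and then `run_load(fake_request, 100, 20)` gives two data points for comparing how latency and throughput change as concurrency grows.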

Cloud Performance and Storage Tiering

Storage tiering is a commonly-used concept in the cloud. It involves moving data between storage services or storage classes at different stages of its lifecycle. For example:

  1. While workloads are running in the cloud, data is stored in a fast SSD disk drive directly connected to a compute instance.
  2. After the workload finishes working on the data, a snapshot of the disk drive is saved to low cost object storage.
  3. For a certain period, the snapshot is stored in a higher-cost storage tier which enables fast access.
  4. After that period, the snapshot is moved to a lower-cost archive storage tier which enables access with higher latency.

In the cloud, each storage service or service tier has its own availability and performance criteria and its own SLAs. When monitoring performance, you must be aware of these criteria and monitor each tier according to them. For example, if you experience high latency on an SSD disk or fast-access object storage, it should raise an alert – but if there is high latency when retrieving data from an archive, this is normal.
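Tier-aware alerting of this kind can be sketched as a lookup of per-tier thresholds. The tier names and millisecond values below are hypothetical placeholders chosen for illustration; real thresholds should come from the SLAs of the storage services you actually use.

```python
from typing import Dict

# Hypothetical per-tier latency thresholds in milliseconds (not provider SLAs).
LATENCY_THRESHOLD_MS: Dict[str, float] = {
    "ssd": 10.0,
    "object_fast": 200.0,
    "archive": float("inf"),  # high retrieval latency is expected for archives
}


def should_alert(tier: str, observed_latency_ms: float) -> bool:
    """Return True when latency exceeds what is normal for the given tier."""
    if tier not in LATENCY_THRESHOLD_MS:
        raise KeyError(f"unknown storage tier: {tier}")
    return observed_latency_ms > LATENCY_THRESHOLD_MS[tier]
```

With these assumed values, 50 ms on the SSD tier raises an alert, while an hour-long archive retrieval does not.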

Learn more in the detailed guide to storage tiering

Cloud Performance and Cost Monitoring

In the cloud, performance and cost are directly related. If a cloud resource has unsatisfactory performance, there are usually several ways to improve performance by provisioning more resources:

  • If a compute instance is underperforming, you can change it to a more powerful instance size, or scale out by adding more compute instances
  • If storage is underperforming, you can switch to a higher service tier or a faster service – for example, cloud file storage services have premium tiers supporting very low latency
  • If networking is underperforming, you can usually upgrade to network-optimized compute instances and increase the network bandwidth available to your cloud services

In the cloud, performance and cost are two sides of the same coin. In many cases, to remediate performance issues, you will need to increase costs. This makes it important to be aware of cloud cost parameters, the relative cost of different cloud services, current utilization, and the budgets and thresholds set by the organization.

Learn more in the detailed guide to cloud cost

AWS Monitoring

Amazon Web Services (AWS) is the most popular public cloud platform. While there are hundreds of AWS services, some of the most common metrics monitored on AWS are:

  • EC2 instance CPU and disk utilization
  • Application logs
  • Load balancer logs
  • Virtual Private Cloud (VPC) flow logs

AWS provides several monitoring services, including:

  • AWS CloudTrail: tracks user, service, and API activity across AWS services. CloudTrail stores event logs of actions performed in an AWS account, providing visibility into resource activity.
  • AWS CloudWatch: provides dashboards and analytics capabilities, including anomaly detection, incident response automation, troubleshooting support, and operational data.

Learn more in the detailed guide to AWS monitoring.

Azure Monitoring

Microsoft Azure is another market-leading public cloud offering. Common performance metrics on Azure include:

  • Availability and response rate of Azure services
  • Network performance
  • Storage capacity and scalability on Azure storage services
  • Processing capacity on Azure VMs, including availability, utilization and responsiveness

Azure provides a number of first-party monitoring services:

  • Azure Monitor—collects performance metrics, diagnostic and activity logs from all Azure services, and provides dashboards to visualize the data.
  • Azure Advisor—scans cloud configurations and provides recommendations for improving availability, performance, security, and cost.
  • Service Health—checks for critical maintenance issues on customer applications and services.
  • Network Watcher—monitors network performance for Virtual Networks (VNet), virtual machines, and application gateways.
  • Resource Health—diagnoses and provides support for problems in Azure services.

Learn more in the detailed guide to Azure monitoring.

Google Cloud Monitoring

Google Cloud Platform is Google’s public cloud service. It offers Google Cloud Monitoring, a service that provides data about the performance, availability, and health of your cloud applications.

Google Cloud Monitoring is based on three foundations:

  • Metrics: the Google Cloud Console and Cloud Monitoring API give you access to over 1,500 cloud monitoring metrics (see the full metrics list). You can also create your own custom metrics.
  • Alerts: inform your team about problems with cloud applications. Alerts can be managed via Google Cloud Console, Cloud Monitoring API, and Cloud SDK.
  • Dashboards: you can create dashboards to visualize Google Cloud metrics using the Google Cloud Console, or programmatically via the dashboards API endpoint.

Serverless Performance

Serverless computing is a way to design and deliver software with no knowledge of the underlying infrastructure—computing resources are provided as an automated, scalable cloud service.

In a traditional data center, or when using an infrastructure as a service (IaaS) model in the cloud, the server’s computing resources are fixed, and paid for regardless of the amount of computing work the server performs. In serverless computing, billing is performed only when the client’s code is actually running.

Serverless computing does not eliminate servers, but its purpose is to remove computing resource considerations from the software design and development process.

However, the serverless model raises significant challenges with regard to performance testing and application debugging:

  • No access to the server or operating system, making it difficult to identify and diagnose performance issues
  • A highly dynamic environment, with short-lived serverless functions running for minutes or seconds
  • A highly distributed environment with complex dependencies
  • Concurrency and throttling are handled by the cloud provider, and can have a major impact on end users, while they are difficult to predict

The main impact of these challenges is low observability over what is running and how workloads are performing.

AWS Lambda Performance Tuning

AWS Lambda is the most popular serverless platform today, with 96% market share according to the BMC State of Serverless report. Therefore, understanding how to diagnose and tune performance issues in Lambda is top of mind for any serverless practitioner.

We’ll briefly cover four ways to improve performance in AWS Lambda.

  1. Memory and CPU utilization

AWS Lambda allocates memory to serverless functions, from 128 MB up to 10,240 MB (the upper limit was 3,008 MB until late 2020). CPU power is allocated to the function in direct proportion to the amount of memory. At roughly 1.8 GB of memory (1,769 MB according to current AWS documentation), the function has the equivalent of one full vCPU, and larger allocations add further vCPUs.

If your application is single-threaded, there is little point in allocating more than about 1.8 GB of RAM unless the function needs the memory itself, because you will pay more for additional vCPUs that the application cannot use. Conversely, if the application is multi-threaded and you allocate less than 1.8 GB, it will not be able to use multithreading to improve performance.

  2. Balance between memory and execution time

AWS Lambda cost depends on memory allocation and execution time. Increasing memory (and thus CPU) raises the price per unit of execution time, but it can also shorten execution time, so the two effects can offset each other. However, the number of vCPU cores available to a function is capped (6 at the maximum memory setting), so beyond a certain point, increasing memory will not reduce execution time.

If your application is CPU intensive, increasing memory can significantly reduce execution time and cost per execution.

  3. Set a reserved concurrency limit

AWS Lambda handles scalability automatically. But there are limits to the number of concurrent requests it allows. When optimizing Lambda performance, you should consider limiting concurrent executions.

Concurrency limits apply at two levels in AWS Lambda:

  • Account level: by default, an AWS account allows 1,000 concurrent executions per Region
  • Function level: by default, any function can use the full account-level limit of 1,000 concurrent executions. However, you should set a reserved concurrency for each function that caps it at a reasonable share of your total available concurrency. Otherwise, other functions will receive throttling errors.

  4. Manage cold starts

A unique problem in serverless environments is cold starts—when a function is invoked, it takes time for it to power up and start serving user requests, and in the interim, users may experience latency.

AWS has introduced the concept of provisioned concurrency, which can help resolve this problem. It lets you provision a preset amount of concurrency in Lambda, ensuring that you have an appropriate number of function instances running, ready to serve user requests with no delay. You can change provisioned concurrency automatically using CloudWatch metrics, or by defining a schedule based on known application loads.
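The balance between memory and execution time described above comes down to simple arithmetic: compute cost scales with memory multiplied by duration. The sketch below assumes a pay-per-GB-second model and ignores per-request charges; the price constant is an illustrative placeholder, so check current AWS pricing before relying on it.

```python
# Illustrative price per GB-second (assumption; check current AWS pricing).
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb: int, duration_ms: float,
                    price_per_gb_second: float = PRICE_PER_GB_SECOND) -> float:
    """Compute cost of one invocation from memory size and execution time."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second


def cheaper_config(config_a: tuple, config_b: tuple) -> tuple:
    """Return whichever (memory_mb, duration_ms) pair costs less per invocation."""
    return min(config_a, config_b, key=lambda c: invocation_cost(c[0], c[1]))
```

For example, a CPU-bound function that takes 1,000 ms at 512 MB and 400 ms at 1,024 MB is both faster and cheaper at the larger setting, while one that only improves to 500 ms costs the same either way.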

Learn more in our detailed guide to AWS Lambda performance

Serverless Monitoring

Serverless systems are a “black box”—engineers do not have knowledge about their inner workings. Serverless function code is only executed during the request, and it is not known where exactly it is executed on the hardware.

However, it is possible to monitor and resolve issues in serverless. It just requires looking at different metrics than you are used to in a traditional server-based environment.

What Should We Be Monitoring?

Push data

This is the traditional style of monitoring, where an agent running on a server collects data and saves it to a central server. In serverless, you don’t have access to the server, so monitoring should be built into the serverless function itself. This means that the developer has to include the monitoring library as part of their code, and initialize it while the process is running.

Push data monitoring in a serverless environment can help you collect data such as function execution time, memory usage, user experience data, function payloads, network performance, and database performance.
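Building monitoring into the function itself can be sketched as a decorator around the handler. This is a minimal illustration, not any vendor's library: the `monitored` decorator, the `EMITTED` in-memory sink, and the sample `handler` are all hypothetical names, and a real setup would send records to a monitoring backend instead of a list.

```python
import functools
import json
import time
from typing import Any, Callable, List

# In-memory sink standing in for a real monitoring backend (assumption).
EMITTED: List[str] = []


def monitored(handler: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a function handler so every invocation pushes a metrics record."""
    @functools.wraps(handler)
    def wrapper(event: dict, context: Any = None) -> Any:
        start = time.perf_counter()
        status = "ok"
        try:
            return handler(event, context)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs on success and on failure, so errors are recorded too.
            record = {
                "function": handler.__name__,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000,
            }
            EMITTED.append(json.dumps(record))
    return wrapper


@monitored
def handler(event: dict, context: Any = None) -> dict:
    """Hypothetical function body: echoes a message from the event."""
    return {"echo": event["message"]}
```

Because the record is written in the `finally` clause, both successful and failed invocations report their duration.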

Pull data

Pull data monitoring is a new type of monitoring, in which services are built to report metrics on their own. Because services are temporary and not running statically, it can be effective to build telemetry functions into the service, and have it report essential metrics. This requires careful planning, because there can be multiple services running in diverse locations at different times.

4 Key Serverless Monitoring Metrics

Here are some of the key metrics you should measure for serverless functions:

  1. Calls—the number of times function code was executed, including successful executions and executions with functional errors.
  2. Errors—number of times a function did not execute successfully. Errors include exceptions thrown in your code, and exceptions thrown by the serverless platform, mainly related to timeouts and configuration errors.
  3. Throttle—number of functions that did not run due to throttling. Serverless platforms impose a limit on concurrency. If the maximum number of functions allowed are already running, additional invocations are not allowed, until there is available capacity. This is different from an error because the function failed to run due to a resource limitation.
  4. Duration—the time it takes for function code to process the event. For the first event handled by the function instance, this includes the initialization time.
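The four metrics above can be derived from a list of invocation records. The sketch below assumes a hypothetical `Invocation` record with an outcome and a duration; real platforms expose these values through their own metrics services under different names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Invocation:
    outcome: str  # "success", "error", or "throttled"
    duration_ms: Optional[float] = None  # None when the function never ran


def summarize(invocations: List[Invocation]) -> Dict[str, float]:
    """Aggregate calls, errors, throttles, and average duration."""
    # Throttled invocations never executed, so they do not count as calls.
    executed = [i for i in invocations if i.outcome != "throttled"]
    durations = [i.duration_ms for i in executed if i.duration_ms is not None]
    return {
        "calls": len(executed),
        "errors": sum(1 for i in executed if i.outcome == "error"),
        "throttles": sum(1 for i in invocations if i.outcome == "throttled"),
        "avg_duration_ms": sum(durations) / len(durations) if durations else 0.0,
    }
```

Keeping throttles separate from errors mirrors the distinction made above: a throttled invocation failed because of a resource limit, not because of a fault in the code.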

Learn more in our detailed guide to serverless monitoring

Serverless Debugging

Serverless applications are highly fragmented. You will not always have the ability to run these components locally. Distributed architecture can cause problems in many areas of the stack. The ability to drill down into code is critical to quickly fixing defects.

Remote debugging is difficult in serverless applications, because you do not have access to the server and operating system. It can also incur costs, because you’ll need to spin up instances of a function in order to test them. Developers often don’t understand where the problem is and why it happened.

In order to effectively debug serverless applications, it is essential to have dedicated tools that can provide information about what is happening in the environment and where the problems lie. In particular, it is important to get access to stack traces of serverless functions that incurred errors, to be able to debug and resolve production issues. Serverless observability tools have been developed to address these challenges.

Learn more in our detailed guide to serverless debugging

Serverless Observability Tools

The following tools can help you monitor and debug serverless functions in the most popular serverless platform, AWS Lambda.

AWS CloudWatch

The primary source of information about how AWS applications work is AWS CloudWatch Logs. This also applies to Lambda functions. By default, Lambda functions send data to CloudWatch, and CloudWatch creates a LogGroup for each function. Each LogGroup has multiple log streams, which contain the logs generated for each Lambda function instance.

Log streams contain log events. Click on an event to see more information about the specific Lambda function that was called. This is the most basic way to debug issues in a Lambda function.

AWS X-Ray

CloudWatch log events are useful, but limited in the information they provide about serverless issues. AWS built X-Ray to answer more complex questions like:

  • Which production issues are resulting in performance problems
  • Which specific integrations are resulting in application issues

X-Ray creates a mapping of application components, and enables end-to-end tracking of requests. It can be used to debug applications both during application development and in production.

Lumigo

The Lumigo platform is a monitoring and debugging platform for serverless and microservices applications. It deploys in minutes, with no code changes, and enables:

  • Removing performance bottlenecks—see which services run sequentially or in parallel, and automatically identify your worst latency offenders. 
  • One-click distributed tracing—lets developers effortlessly find and fix issues in serverless and microservices environments.
  • Visual debugging—virtual stack trace of all services participating in the transaction. Everything is displayed in a visual map that can be searched and filtered.
  • Serverless-specific alerts—predictive analytics identifies and alerts on issues before they impact application performance or costs.

Learn more about the Lumigo platform and get started free!

Kubernetes and Container Performance

Containers are a crucial part of the cloud native environment, and a foundation of most DevOps environments. They are a lightweight encapsulation of software and configuration, which makes it easy to deploy applications and IT systems in an automated and repeatable manner. Docker is the de-facto standard for container engines, and Kubernetes is the most popular orchestration tool used to manage large numbers of containers.

According to the 2020 CNCF Survey, 92% of cloud native users run containers in production, and 83% of them use Kubernetes in production. 23% of organizations have over 5,000 containers, and 12% say they have over 50 Kubernetes clusters running in production—indicating growing enterprise use of containers. In production and large enterprise deployments, performance considerations become critical.

Container Performance Considerations

One of the important things to understand is that containers are not virtual machines. A virtual machine runs as a software representation of a computer that is independent of the physical host, but containers depend on the host’s operating system, kernel and file system.

This means that, for example, a virtual machine can be hosted on a computer running Windows, while its workloads run Linux, or vice versa. On the other hand, containers running on a Linux host must run Linux, because they share the resources of the underlying operating system kernel.

A container running on a host appears to be completely isolated from other containers. But internally, all containers running on a particular host use that host’s kernel and file system. This shared usage of host resources has a profound effect on performance optimization—which only gets more complicated when you run orchestrators like Kubernetes.

Learn more in our detailed guide to containerized architecture

Key Kubernetes Performance Metrics

Broadly speaking, a Kubernetes cluster has two types of nodes:

  • Master (control plane) nodes: run infrastructure components such as cluster configuration and management
  • Worker nodes: run the containers that make up your workloads

Containers are organized into pods, which are scheduled onto nodes (physical or virtual machines). For both types of nodes, the following three metrics are crucial for evaluating performance as part of a Kubernetes deployment.

1. Memory Utilization

Monitoring memory usage at the pod and node level can provide valuable insight into cluster performance and the ability to successfully run workloads. Containers whose memory usage exceeds their configured limit are terminated. Also, if a node is running out of available memory, the kubelet marks the node as being under memory pressure and starts evicting pods to reclaim resources.

2. Disk Utilization

Like memory, disk space is a critical resource for each container. So if kubelet detects that the root volume is running out of disk space, pod scheduling issues can occur. On specific nodes, when available disk space goes below a certain threshold, the node is flagged as having “disk pressure”, and kubelet may also reclaim node level resources.

You also need to track the usage level of storage volumes used by Kubernetes pods. Storage volumes provide persistent storage, which survives even after a specific container or pod shuts down. This allows you to predict problems at the application or service level.

3. CPU Utilization

Tracking the amount of CPU used by pods and nodes, compared to the configured requests and limits, can provide valuable insights into cluster performance. If there are insufficient CPU resources available at the node level, the node will throttle the CPU resources available to each pod, which can cause performance issues.
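Comparing usage against configured limits first requires converting Kubernetes resource quantities such as `500m` or `128Mi` into plain numbers. The following sketch handles only the common suffixes and is an assumption-laden simplification; the helper names are hypothetical, and the official quantity format supports more notations than shown here.

```python
# Common memory suffixes only; two-character binary suffixes are checked first.
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3,
}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity to cores: '500m' is 0.5, '2' is 2.0."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '128Mi' or '1G' to bytes."""
    for suffix in ("Ki", "Mi", "Gi", "k", "M", "G"):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * _MEMORY_SUFFIXES[suffix]
    return int(quantity)  # plain number of bytes


def utilization(usage: float, limit: float) -> float:
    """Fraction of the configured limit currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return usage / limit
```

A pod using `450m` of CPU against a `500m` limit is at 90% of its limit, which is the kind of signal that can indicate a risk of throttling.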

Kubernetes Performance Best Practices

Here are a few best practices you can use to improve performance of applications running on Kubernetes:

  • Deploy clusters close to customers: the geographic location of Kubernetes nodes has a major effect on latency experienced by clients. Leverage cloud resources to move clusters physically near to end users.
  • Carefully select persistent storage resources: on storage devices defined as Kubernetes storage volumes, prefer SSD drives, and if possible, use NVMe for heavy workloads. When using cloud services, opt for premium service tiers that provide better performance and higher IOPS.
  • Optimize your images: Kubernetes runs large quantities of the same images. It is critical to optimize images to ensure they are lightweight and run quickly and effectively. Any redundant code or software in an image can be multiplied thousands of times in a Kubernetes environment.
  • Optimize the etcd cluster: etcd is a critical component that stores configuration for Kubernetes clusters. Monitor etcd clusters carefully to ensure they are healthy and have sufficient resources. etcd should be colocated with, or at least connected by a fast network connection to, the kube-apiserver that serves requests from pods.

Learn more in the detailed guide to Kubernetes in production

Kubernetes Monitoring

Here are a few best practices and considerations for monitoring Kubernetes deployments. You should establish careful monitoring at both the cluster and pod level.

Kubernetes Cluster Monitoring

The purpose of cluster monitoring is to monitor the health of the entire Kubernetes cluster. As an administrator, you need to know if all nodes in the cluster are functioning normally, the workload capacity they are running, the number of applications running on each node, and the resource utilization of the entire cluster.

  • Node resource utilization—important metrics include network bandwidth, disk usage, CPU and memory usage. You can use these metrics to determine whether to increase or decrease the number and size of nodes in the cluster.
  • Number of nodes—the current number of nodes available is an important metric to follow. This will give you an idea of cloud costs of the Kubernetes deployment, and what scale of workloads you can leverage the cluster for.
  • Running pods—the number of running pods indicates whether there are enough nodes available and whether they can handle the entire workload in case of node failures.
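The question of whether the cluster can absorb a node failure can be sketched as a simple capacity check. This is a deliberate simplification: it measures capacity as a pod count per node, which is an assumption for illustration, whereas real scheduling also depends on CPU, memory, and placement constraints.

```python
from typing import List


def survives_node_failure(node_pod_capacity: List[int], running_pods: int) -> bool:
    """True if the remaining nodes can host all pods after losing the largest node."""
    if not node_pod_capacity:
        return False
    # Worst case: the node with the most capacity is the one that fails.
    remaining = sum(node_pod_capacity) - max(node_pod_capacity)
    return running_pods <= remaining
```

With three nodes of 30 pods each, 50 running pods still fit after one node is lost, while 70 do not.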

Kubernetes Pod Monitoring

You can use Kubernetes metrics to monitor how specific pods and their workloads are behaving. Pay special attention to:

  • The number of instances of each type of pod—if the number is small, the cluster may have run out of resources
  • Deployment progress—transition of instances from old to new
  • Health checks
  • Network data reported through pod network services

Container metrics are primarily provided by cAdvisor, which is built into the kubelet. For more extensive monitoring capabilities, a common choice is Prometheus, an open source monitoring tool built for cloud native environments.

Learn more in our detailed guide to Kubernetes monitoring tools

Web and Mobile Performance Testing

Importance of Web Application Performance Testing

Web application developers today can automatically push their code through build, test, and deploy, but are not always sure how the code will perform in production. Web application performance testing is the solution to this problem, and should be an important part of your testing strategy.

A poorly performing website or web application will also do worse on SEO and subsequently get less traffic than competitors, and is likely to have lower engagement, lower conversion, and lower revenues. Effective website and web application performance testing can improve all these metrics by ensuring the development team pays consistent attention to performance.

Learn more in the detailed guide to web performance

Website Performance Testing Tools

Here are a few tools web application developers can use to test and improve performance on an ongoing basis.

PageSpeed Insights

PageSpeed Insights by Google checks several on-page and back-end factors of a web page, and reports on their effect on page load time. It provides a performance score for desktop and mobile, shows which elements have the biggest impact on page load time, and provides suggestions for improvement.

GTmetrix

GTmetrix, a free tool which is based on the Google Lighthouse performance benchmark, provides five different reports showcasing website performance:

  • PageSpeed
  • YSlow
  • Waterfall breakdown
  • Video showing page load
  • History of page performance

GTmetrix helps visualize page performance and lets you set up alerts to notify about performance issues. It also provides extensive support for testing performance on mobile devices.

Pingdom

Pingdom is a commercial offering that provides website uptime monitoring, page speed monitoring, real user monitoring (RUM) showing how actual visitors experience your web pages, and synthetic session monitoring. Pingdom has a global infrastructure with testing servers in 100 countries. It sends notifications by email and SMS, and integrates with collaboration and alerting tools like Slack and PagerDuty.

WebPageTest

WebPageTest is a free tool that lets you test your web pages from 40 locations, using 25 browsers. It performs in-depth performance tests covering topics like:

  • Bandwidth usage
  • Compression and caching
  • Use of CDN
  • DNS lookups
  • User experience optimization

Cloudinary Website Speed Test

Website Speed Test analyzes a website’s images, which are responsible for a large percentage of load time on most web pages. It inspects image format, fit, compression, and quality options, and provides suggestions for optimizing images, which can have a dramatic impact on page load time. The image analysis tool is integrated with WebPageTest.

Website Performance Best Practices: The Basics

Here are several simple best practices that can improve performance for your web pages.

Enable Compression

Use GZip to reduce the size of CSS, HTML and JavaScript files that exceed 150 bytes. However, don’t use GZip for image files. Instead, use image formats that enable compression, and set the appropriate compression ratios to reduce file size while retaining quality.
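As a minimal sketch of this rule, the function below compresses a response body with Python's standard gzip module only when the content is text-based and larger than 150 bytes. The list of compressible content types and the function name are assumptions made for illustration; in practice this decision is usually made by the web server or CDN configuration rather than application code.

```python
import gzip
from typing import Tuple

MIN_SIZE_BYTES = 150  # threshold taken from the guideline above

# Assumed set of text-based types worth compressing (illustrative, not exhaustive).
TEXT_TYPES = ("application/javascript", "application/json")


def maybe_gzip(body: bytes, content_type: str) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed) for a response body."""
    compressible = content_type.startswith("text/") or content_type in TEXT_TYPES
    if not compressible or len(body) <= MIN_SIZE_BYTES:
        return body, False
    packed = gzip.compress(body)
    # Keep the compressed form only if it is actually smaller.
    if len(packed) >= len(body):
        return body, False
    return packed, True


css = b"body { margin: 0; padding: 0; }\n" * 50
payload, compressed = maybe_gzip(css, "text/css")
print(len(css), len(payload), compressed)
```

Image types fall through to the uncompressed branch, which matches the advice to rely on the image format's own compression instead.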

Minify CSS, JavaScript, and HTML

Optimizing your code by removing spaces, commas and other unwanted characters, as well as comments, formatting, and unused code, can significantly speed up your page. To shrink text files even further, use minification and uglification tools such as cssnano and UglifyJS.
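To make the idea concrete, here is a deliberately naive CSS minifier. It is a toy written for this article, not how the tools named above work: it does not understand strings, URLs, or selectors where whitespace is significant, so production code should go through a real minifier.

```python
import re


def minify_css(css: str) -> str:
    """Naive CSS minifier: strips comments and redundant whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                        # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)          # trim around punctuation
    css = css.replace(";}", "}")                          # drop last semicolon in a block
    return css.strip()


source = """
/* header */
h1 {
    color: red;
    margin: 0 auto;
}
"""
print(minify_css(source))
```

Even this crude pass removes well over half of the bytes in the sample, which is why minification is usually paired with compression rather than treated as an alternative to it.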

Reduce Redirects

Whenever a page is redirected to another page, the visitor faces a delay waiting for the HTTP request/response cycle to complete. It is not uncommon to see web pages redirected more than three times, and each redirect adds a delay for the user. Redirect loops and errors are worse still, because they can result in a broken user experience.
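One way to audit redirects is to walk each chain and count the hops. The sketch below does this over an in-memory mapping of source URL to target URL, so it runs without network access; the mapping, the URLs, and the max_hops default are all hypothetical and would normally come from a crawl or from your server's redirect rules.

```python
from typing import Dict, List


def redirect_chain(start: str, redirects: Dict[str, str], max_hops: int = 10) -> List[str]:
    """Follow redirects from start and return every URL visited, in order.

    Raises ValueError on a redirect loop or when the chain exceeds max_hops.
    """
    chain = [start]
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in chain:
            raise ValueError("redirect loop: " + " -> ".join(chain + [target]))
        if len(chain) > max_hops:
            raise ValueError(f"more than {max_hops} redirects from {start}")
        chain.append(target)
    return chain


# Hypothetical redirect rules for illustration.
ROUTES = {
    "http://example.com/": "https://example.com/",
    "https://example.com/": "https://www.example.com/",
    "https://www.example.com/": "https://www.example.com/home",
}

chain = redirect_chain("http://example.com/", ROUTES)
print(len(chain) - 1, "redirects:", " -> ".join(chain))
```

In this example the visitor pays for three round trips before any content arrives; pointing the first URL straight at the final destination would remove two of them.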

Improve Server Response Time

Server response time is influenced by the amount of traffic received, the resources used by each page, software running on the server, and the hosting solution used. To speed up server response time, find and fix performance bottlenecks such as delayed database queries, slow routing, and insufficient memory or CPU resources on the server. Aim for a server response time of under 200ms.
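When checking measurements against the 200ms target, it helps to look at a high percentile rather than the average, because a few slow requests can hide behind a healthy mean. The following sketch uses the nearest-rank method; the choice of the 95th percentile and the sample timings are assumptions for illustration, not measured data.

```python
import math
from typing import List

BUDGET_MS = 200.0  # target from the guideline above


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def within_budget(samples_ms: List[float], pct: float = 95, budget_ms: float = BUDGET_MS) -> bool:
    """True if the chosen percentile of response times meets the budget."""
    return percentile(samples_ms, pct) <= budget_ms


# Made-up response times in milliseconds, with one slow outlier.
timings = [120, 135, 140, 150, 160, 170, 180, 190, 195, 450]
print("median:", percentile(timings, 50))
print("p95:", percentile(timings, 95))
print("within budget:", within_budget(timings))
```

Here the median sits comfortably under the target while the 95th percentile does not, which points to an intermittent bottleneck worth investigating.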

Optimize Images

Resize images to the required size before using them on a web page, and provide several versions of images for responsive designs. Ensure image files are compressed for the web. Use CSS sprites to combine images like buttons and icons into one large image. The combined file loads with a single HTTP request instead of many, and the web page shows only the relevant portions. This can significantly reduce page load time.
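Providing several versions of an image for responsive designs typically means generating a srcset attribute that lists each variant with its width. The helper below is a small sketch of that step. The "?w=" query parameter is a hypothetical URL scheme invented for this example; substitute whatever naming your image pipeline or CDN actually uses.

```python
from typing import Iterable


def build_srcset(base_url: str, widths: Iterable[int]) -> str:
    """Build an HTML srcset value listing one image variant per width."""
    # "?w=<width>" is a hypothetical resizing parameter, used only for illustration.
    parts = [f"{base_url}?w={w} {w}w" for w in sorted(set(widths))]
    return ", ".join(parts)


srcset = build_srcset("/img/hero.jpg", [1280, 640, 320])
print(f'<img src="/img/hero.jpg?w=640" srcset="{srcset}" sizes="100vw" alt="">')
```

The browser then picks the smallest variant that satisfies the layout, so small screens avoid downloading the largest file.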

Learn more in the detailed guide to image optimization

Avoid Performance Impact of CSS Image Effects

CSS is commonly used to style and transform images on websites. Using CSS commands, developers can adjust position for images, resize images, add backgrounds or borders, and apply filters like grayscale or blur.

In general, CSS image effects have a negative effect on page performance. There are two main performance concerns:

  • CSS effects are processed on the client side – it is always more efficient to pre-process an image on the server and then have it download as is to the client, rather than downloading a raw image, and then having the client use local computing resources to transform the image.
  • Use of CSS to resize oversized images – if a website stores one high resolution image, and adjusts it to the relevant screen size on the client side using CSS, this can dramatically increase page load times. Images must be resized on the server side, to avoid downloading needlessly large image files to the browser.
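A quick calculation shows how much is wasted when an oversized image is scaled down in the browser. The function below compares the pixels downloaded with the pixels actually displayed. Pixel count is only a rough proxy for file size, and the calculation ignores high-density displays, so treat the result as an estimate.

```python
def wasted_pixel_fraction(src_w: int, src_h: int, shown_w: int, shown_h: int) -> float:
    """Fraction of downloaded pixels discarded when an image is shown smaller."""
    if min(src_w, src_h, shown_w, shown_h) <= 0:
        raise ValueError("all dimensions must be positive")
    source_pixels = src_w * src_h
    # An image cannot display more pixels than it contains.
    shown_pixels = min(shown_w, src_w) * min(shown_h, src_h)
    return 1 - shown_pixels / source_pixels


# A 4000x3000 original displayed in a 400x300 slot.
waste = wasted_pixel_fraction(4000, 3000, 400, 300)
print(f"{waste:.0%} of the downloaded pixels are never shown")
```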

Learn more in the detailed guide to CSS images

Advanced Website Performance Optimization

Modern Image and Video Formats

Use next-generation image and video formats, which can provide much better compression ratios with higher quality:

  • WebP – created by Google, preserves image quality with a much higher compression rate than traditional formats like JPG. Supports both lossy and lossless compression.
  • JPEG 2000 – an updated version of JPEG with enhanced lossless compression. Also supports video.
  • AVIF – an image format derived from the AV1 video format, providing lossy and lossless compression that can produce files up to 10X smaller than comparable JPEG files.
  • JPEG XL – a new version of JPEG based on Google’s Pik format and Cloudinary’s Free Universal Image Format (FUIF). Can compress image files to as little as a third of the size of traditional JPEG.
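Because browser support for these formats varies, servers commonly choose a format per request based on the Accept header the browser sends. The sketch below shows that selection in its simplest form. The preference order and the JPEG fallback are assumptions for illustration, and the function ignores quality values (the ";q=" parameters), which a full implementation would honor.

```python
# Assumed preference order, most compact format first.
PREFERENCE = ["image/avif", "image/webp"]
FALLBACK = "image/jpeg"


def pick_image_format(accept_header: str) -> str:
    """Choose the best image MIME type the client says it accepts."""
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    for mime in PREFERENCE:
        if mime in accepted:
            return mime
    return FALLBACK


print(pick_image_format("image/avif,image/webp,image/apng,*/*;q=0.8"))
print(pick_image_format("image/webp,*/*"))
print(pick_image_format("*/*"))
```

When serving different formats from the same URL, the response should also carry a "Vary: Accept" header so that caches keep the variants apart.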

Learn more in the detailed guide to next-generation image formats

Image and Video Optimization

You can optimize website images and videos by selecting the most appropriate format and compression, and delivering media files efficiently using content delivery networks (CDN).

There are many CMS plugins and tools available that can help automate image and video optimization. These plugins let you automatically convert images to the most appropriate format, and apply the best quality parameters to reduce file size while retaining quality.

Learn more in the detailed guide to video optimization

Video APIs

Video content is increasingly used on web pages and can be a major component in page load time. There are many ways to optimize video content to improve performance and user experience, but some of them are complex and may require dedicated software or hardware.

Cloud-based video services provide video APIs, which offer advanced video optimization capabilities on the fly:

  • Video encoding and delivery – video APIs can automatically transcode or encode videos to a required format, reducing file size while preserving quality.
  • Automated video adjustment – video APIs can modify video content to suit a required graphic design or screen size. Some services provide AI capabilities that can dynamically resize video to focus on the interesting elements in the frame.
  • Video effects – video APIs can automatically apply effects without needing to edit the source video files. For example, they can apply graphical filters, overlay text or images, and cut or loop parts of the video.

See an example of a cloud-based video API by Cloudinary and 6 more APIs with advanced AI capabilities

Lazy-Load Images

Lazy loading is a common and effective technique, which involves loading images only when the website visitor needs to see them.

Typically, the technique detects the user’s viewport, and loads images as the user scrolls down the page and sees them. This can significantly reduce the amount of data loaded to the user’s browser when they initially visit a page, and conserve bandwidth because images that are never viewed by the user do not need to be downloaded.
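The viewport check at the heart of lazy loading can be sketched as a simple overlap test. In a browser this is normally handled by the loading="lazy" attribute on images or by the IntersectionObserver API; the Python below only models the decision. The page layout, the function name, and the 200-pixel preload margin are assumptions for illustration.

```python
from typing import List, Tuple

# Each image is described as (name, top offset in px, height in px).
Image = Tuple[str, int, int]


def images_to_load(images: List[Image], scroll_top: int,
                   viewport_height: int, margin: int = 200) -> List[str]:
    """Return names of images overlapping the viewport, plus a preload margin."""
    visible_top = scroll_top - margin
    visible_bottom = scroll_top + viewport_height + margin
    return [
        name
        for name, top, height in images
        if top < visible_bottom and top + height > visible_top
    ]


# Hypothetical page layout.
page = [("hero", 0, 600), ("chart", 900, 400), ("footer", 3000, 300)]

print(images_to_load(page, scroll_top=0, viewport_height=800))
print(images_to_load(page, scroll_top=2500, viewport_height=800))
```

The margin starts the download slightly before an image scrolls into view, which trades a little bandwidth for fewer visible placeholders.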

Learn more in the detailed guide to lazy loading

See Additional Guides on Key Performance Testing Topics

Lumigo, together with several partner websites, has authored a large repository of content that can help you learn about many aspects of performance testing for cloud native and web applications. Check out the articles below for objective, concise reviews of key performance testing topics.

Serverless Monitoring

Authored by Lumigo

Learn how to monitor serverless applications in production, making them observable and easy to maintain and troubleshoot.

See top articles in our serverless monitoring guide:

  • How to Monitor Lambda with CloudWatch Metrics
  • Lambda Logs: a Complete Guide
  • Using CloudWatch Logs for Lambda Monitoring

Serverless Debugging

Authored by Lumigo

Learn how to debug serverless applications. Understand the differences between debugging for monolithic and microservices apps, and understand serverless testing.

See top articles in our serverless debugging guide:

  • Serverless Testing: Adapt or Cry
  • AWS Lambda Unit Testing
  • AWS Serverless Timeouts: What are They and How to Fix and Prevent Them

AWS Lambda Performance

Authored by Lumigo

Learn how to optimize AWS Lambda performance, overcoming challenges like short-running functions, timeouts, and cold starts.

See top articles in our AWS Lambda performance guide:

  • AWS Lambda Timeout Best Practices
  • How to Improve AWS Lambda Cold Start Performance
  • Understanding AWS Lambda Concurrency

AWS Monitoring

Authored by NetApp

Learn to monitor workloads on AWS using first-party and third-party tools, and discover best practices for performance and cost optimization.

See top articles in the AWS monitoring guide:

  • Cloudwatch Log Insights: Ultimate Quick Start Guide
  • Monitoring the Costs of Underutilized EBS Volumes

Image Optimization

Authored by Cloudinary

Learn how to optimize images and use compression, quality settings, CDN and other techniques to dramatically reduce page load time.

See top articles in the image optimization guide:

  • PHP Image Compression, Resize & Optimization
  • Three Popular and Efficient Ways to Loading Images
  • Python Image Optimization and Transformation

Image Formats

Authored by Cloudinary

Discover next-generation image formats that can help you improve web performance. New formats like WebP and JPEG-XR deliver high quality with improved compression.

See top articles in our image formats guide:

  • The Great JPEG 2000 Debate: Analyzing the Pros and Cons to Widespread Adoption
  • Adopting the WebP Image Format for Android on Websites Or Native Apps
  • Optimizing Animated GIFs With Lossy Compression

Video Optimization

Authored by Cloudinary

Learn how to optimize video content for higher performance and improved user experience, by using the latest compression and streaming technology, and automatically adjusting videos to user requirements.

See top articles in our video optimization guide:

  • Tips for Retaining Audience Through Engaging Videos
  • Product Videos 101: What Makes Them Great?
  • Automated Generation of Intelligent Video Previews on Cloudinary’s Dynamic Video Platform

Additional Performance Testing Resources

Below are additional articles that can help you learn about performance testing topics.