Aug 23 2022
You’ve probably seen Rush Hour, a logic puzzle where you have to slide cars and trucks out of the way to steer the red car towards the exit.
In real life, your customers are responsible for tracking hundreds or thousands of data points from dozens of valuable, mission-critical sensors: engine speed, network signal level, distance from the RF, and more. And they’re tracking them not just through traffic but across continents. This is no game, and you can’t afford to leave anything to chance.
Whether it’s rental cars or public transportation, delivery trucks, trailers, or any other vehicle, fleet operators need to keep their finger on the pulse of their operations. So the more data your app can provide, the better.
As a 2020 Deloitte report stated, “Analyzing the data produced by telematics devices can provide fleet managers with insights across the entire fleet lifecycle.” There are a couple of reasons your customers need these insights more than ever:
- Fleets have been hit hard over the last few years by pandemic labor shortages, rising fuel costs, and supply-chain issues that make replacement parts or new vehicles expensive or even impossible to obtain.
- These factors eat into margins, which weren’t large to begin with, so fleet managers need fleet management software that will help them keep up and optimize wherever possible.
The fleet management software industry is growing rapidly, from $20.73 billion in 2022 to an estimated $67.38 billion by 2029. But because of the unique challenges of this industry, the traditional cloud-based model doesn’t always work well for fleet management apps.
In this post, we’ll explore some of those challenges and discover a new way to take advantage of a serverless computing model to achieve scale and reduce costs.
Unique Fleet Management Challenges
The main characteristic of fleet management platforms is the vast number of data points they must be able to handle at any given time.
Most other types of platforms ingest data from just a few IoT, industrial, or other types of devices. For example, a diabetes-management app might receive regular updates on patients’ blood sugar levels. Even if these updates are frequent, they are fairly predictable (within a known range), and it’s easy to forecast demand, scale, and provision infrastructure based on a known number of users.
In contrast, fleet management platforms must handle a flood of incoming data, including…
- Vehicle location, direction, speed, odometer
- Vehicle condition, health, safety, predictive maintenance
- Scheduling and routing, timeliness, idle time
- Driver behavior
- Fuel consumption
- Temperature data, cargo condition, and a wide range of other metrics
Plus, there are always new advances coming down the line, and these generally mean new data sources and types, such as camera data or integrated accident management.
This data is fleet operators’ greatest asset. As the Deloitte report asserts, it’s “the fuel for more effective management and vehicle readiness.”
But it’s not always easy building a platform to handle these wide-ranging data points. You need compute and storage that can keep up with the demands of fleet management apps, including:
- High velocity of invocations (10+ million per day)
- “Spiky” data: many requests arrive while the fleet is transmitting, and almost none the rest of the time
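To make the gap between average and peak load concrete, here is a rough back-of-the-envelope calculation. The 10 million invocations per day comes from the list above; the assumption that 80% of that traffic lands in a two-hour window is purely illustrative, not a measured figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def average_rate(invocations_per_day: int) -> float:
    """Invocations per second if traffic were spread perfectly evenly."""
    return invocations_per_day / SECONDS_PER_DAY

def peak_rate(invocations_per_day: int, share_in_window: float, window_hours: float) -> float:
    """Invocations per second inside a burst window carrying `share_in_window` of daily traffic."""
    return invocations_per_day * share_in_window / (window_hours * 3600)

daily = 10_000_000
avg = average_rate(daily)
peak = peak_rate(daily, share_in_window=0.8, window_hours=2.0)  # assumed burst shape
print(f"average: {avg:.0f}/s, peak: {peak:.0f}/s, ratio: {peak / avg:.1f}x")
# prints "average: 116/s, peak: 1111/s, ratio: 9.6x"
```

Under these assumptions, infrastructure sized for the peak would run at roughly a tenth of its capacity on average, which is the cost problem the rest of this post looks at.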
From On-Premises to Cloud to Serverless
The way fleet management platforms have handled that data has evolved over the last few years.
Traditionally, they relied on large in-house data centers to accomplish this. But gradually, vendors have shifted those data centers to the cloud, hoping to save money and harness cloud-provider efficiencies.
However, with the spiky demand mentioned earlier, keeping enough servers running to handle peak load can be very expensive and wasteful, because they sit mostly idle between bursts.
At the same time as applications were shifting to the cloud, developers were also moving from an older monolithic model to a more agile and compartmentalized distributed model. Distributed applications generally rely on containers and microservices, which are flexible by nature but require greater scalability.
To achieve that scalability, fleet management developers turned to a new model: serverless applications. The term serverless is slightly deceptive. It doesn’t mean there are no servers. It simply means that the cloud provider takes care of infrastructure provisioning behind the scenes so your team doesn’t have to think about it in production.
All major cloud providers now offer serverless. For AWS, this is Lambda, an event-driven, function-as-a-service (FaaS) computing platform that automatically manages all computing resources your code requires. With Lambda, you don’t have to worry about the underlying platform or infrastructure because AWS takes care of all of that for you.
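As a minimal sketch of what "event-driven" looks like in practice, here is a Python function in the shape Lambda expects: a handler that receives an event and a context. The event layout (a `records` list with `vehicle_id` and `speed_kmh` fields) is hypothetical and chosen only for illustration; a real payload depends on whatever triggers the function.

```python
import json

def handler(event, context):
    """Entry point the platform calls once per event; there is no server code to write.

    The `records` key and the field names are hypothetical, for illustration only.
    """
    readings = event.get("records", [])
    # Keep only readings that carry the fields this sketch cares about.
    accepted = [r for r in readings if "vehicle_id" in r and "speed_kmh" in r]
    return {
        "statusCode": 200,
        "body": json.dumps({"received": len(readings), "accepted": len(accepted)}),
    }

# Local invocation with a fake event; in Lambda the platform supplies both arguments.
sample_event = {"records": [{"vehicle_id": "truck-17", "speed_kmh": 62}, {"speed_kmh": 40}]}
result = handler(sample_event, None)
print(result["body"])
```

The function holds no state about servers, ports, or scaling; that is the part the provider handles.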
With automatic, fine-grained scalability, serverless solves many problems for fleet management:
- Fast to deploy
- Shorter time to market
- No DevOps required
Plus, with serverless, as your data needs grow—e.g., incorporating data analytics and AI to drive better strategic decision-making—you’ll be able to grow your platform quickly without any extra infrastructure investment.
This also lets you remain agile and respond fast to industry changes and customer demands.
But serverless also comes with challenges…
Challenges in Serverless Production
Serverless is great for dealing with demand peaks and massive data ingestion, but it has a few disadvantages—primarily surrounding the observability of services and connections between them:
- When everything is working perfectly, serverless is great. But getting there can be a challenge…
- And when something goes wrong, serverless can be very difficult to deal with.
Each and every service you’re running generates a large volume of observability data (logs, metrics, and traces), but getting at the data you need to fix a specific problem is hard. Understanding where your problem is coming from can take far longer than necessary.
Without proper end-to-end observability, companies waste time, money, and end-user goodwill finding and fixing bugs, while development slows and user experience suffers.
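One common way to make that data easier to search is to stamp every log line with a shared request identifier. The sketch below assumes each service writes JSON log lines with a `request_id` field; the service names and messages are made up for illustration.

```python
import json
import uuid

def log_line(service: str, request_id: str, message: str) -> str:
    """Build one structured log line as a JSON string."""
    return json.dumps({"service": service, "request_id": request_id, "message": message})

def trace_request(lines, request_id):
    """Return the parsed entries for a single request, in the order they were emitted."""
    entries = (json.loads(line) for line in lines)
    return [e for e in entries if e["request_id"] == request_id]

req_a, req_b = str(uuid.uuid4()), str(uuid.uuid4())

# Two concurrent requests whose log lines interleave across two hypothetical services.
logs = [
    log_line("ingest", req_a, "received position update"),
    log_line("ingest", req_b, "received position update"),
    log_line("routing", req_b, "recalculated route"),
    log_line("routing", req_a, "failed to recalculate route"),
]

path = [(e["service"], e["message"]) for e in trace_request(logs, req_a)]
print(path)
```

With four lines this is trivial; with millions of interleaved lines per day, doing the same filtering by hand across separate log groups is where the time goes.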
And then there are also unique challenges when it comes to adapting fleet management apps to a serverless environment:
- Lambdas typically run for a very short period, which can lead to problems with cold starts that can slow down your application.
- A high number of concurrent flows makes end-to-end tracking difficult.
- Much of the work is done via batch processing. Applications are usually many-to-many, with numerous directly triggered endpoints (API Gateway, S3 buckets, and so on) that provide little information for end-to-end monitoring and observability.
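Cold starts, the first item in the list above, can at least be counted from inside the function. A widely used trick relies on the fact that module-level code runs once per execution environment, so a module-level flag is only true on the first invocation. This is a sketch of that idea, not a full measurement of cold-start latency; the function name and returned fields are illustrative.

```python
import time

# Module scope runs once when a new execution environment is initialized.
_cold_start = True

def lambda_handler(event, context):
    global _cold_start
    was_cold = _cold_start
    _cold_start = False  # every later invocation in this environment is warm

    started = time.perf_counter()
    # ... business logic would go here ...
    duration_ms = (time.perf_counter() - started) * 1000

    # Returning the flag keeps the sketch testable; in practice you might log it instead.
    return {"cold_start": was_cold, "duration_ms": duration_ms}

first = lambda_handler({}, None)
second = lambda_handler({}, None)
print(first["cold_start"], second["cold_start"])  # prints "True False"
```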
To resolve problems fast in a production environment, you need three things:
- Monitoring. This includes alerting and notification when something fails (e.g., when a threshold percentage of requests fail); it also covers the number of requests/invocations/cold starts, as these can all impact performance.
- Impact assessment. Understanding impact helps you prioritize. Is it business-critical? This involves connecting the dots and seeing the big picture. (Due to loose connections, a given microservice won’t know if it is mission-critical, but if you realize, for instance, that it’s part of a process-payment API that isn’t working, you’ll know that it is indeed critical.)
- Troubleshooting. This entails the ability to drill down into a specific service, request, or transaction—checking all the logs and environment variables to help understand the cause, tracing the route of the request through all the microservices involved, and working your way upstream through every single service to track down that specific request (out of potentially billions).
Unless you have these three things in place, you’re going to waste time tracking down problems that may not even be mission-critical.
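The monitoring piece, alerting when a threshold percentage of requests fail, can be sketched in a few lines. The 5% threshold, the service names, and the counts below are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class WindowStats:
    """Invocation and failure counts for one service over one time window."""
    service: str
    invocations: int
    failures: int

def failure_rate(stats: WindowStats) -> float:
    # Guard against division by zero for services that were idle in this window.
    return stats.failures / stats.invocations if stats.invocations else 0.0

def services_to_alert(window, threshold=0.05):
    """Names of services whose failure rate meets or exceeds the threshold."""
    return [s.service for s in window if failure_rate(s) >= threshold]

window = [
    WindowStats("ingest-telemetry", invocations=12_000, failures=30),
    WindowStats("process-payment", invocations=400, failures=48),
    WindowStats("idle-report", invocations=0, failures=0),
]
alerts = services_to_alert(window)
print(alerts)  # prints "['process-payment']"
```

A check like this tells you that something is failing. It says nothing yet about impact or root cause, which is why the other two items on the list matter.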
Achieving Better Serverless
When it comes to creating observability for AWS Lambda serverless architectures, you have three options.
DIY Using Free/Open-Source Tools
There are a few open-source observability frameworks currently available, such as Jaeger and OpenTelemetry (OTel). These include a variety of tools, APIs, and SDKs to help make serverless debugging easier. This option is popular for its price point, but these tools tend to be vendor-agnostic, meaning they’re not designed around the specific requirements of AWS Lambda or integrated with native AWS tools like X-Ray.
Using Native Cloud-Provider Tools
If you’re using Lambda, then you’ll have a few services available directly from AWS, such as X-Ray and CloudWatch (which includes CloudWatch Logs and CloudWatch ServiceLens). However, both X-Ray and CloudWatch are limited in a few ways. They provide limited free metrics (that can be difficult to configure), along with more advanced metrics at an additional cost. Plus, the logs these services provide can be difficult to read, categorize, and search, which makes it a challenge to filter out the noise and quickly get at the information you need.
Third-Party Applications
Recognizing the shortcomings of cloud-provider and open-source tools, a number of vendors now provide solutions designed to simplify observability, monitoring, and tracing.
While all of these options have advantages, the first two—the DIY and native cloud-provider routes—will take significant engineering and development and may impact production.
If you’re looking for less effort and the fastest implementation (with less downtime), the simplest option may be a third-party solution that works seamlessly with AWS. However, some third-party solutions require more R&D and code changes than others, so it’s important to choose a tool that gets you up and running fast while actually saving you work in both the short and long term.
Full Serverless Observability and Debugging
Lumigo is a serverless observability platform that helps you with monitoring and troubleshooting in a distributed environment. It helps you find and fix problems fast with a combination of smart monitoring, troubleshooting, and end-to-end tracing.
The best part is, Lumigo does its job without any code changes, so you can get up to speed faster than with a DIY or cloud-vendor solution alone.
With its one-click distributed tracing, Lumigo gathers all the metrics it needs from your cloud vendor and third-party services, including invocations, failures, highest-latency services, and cold starts, which are a common trouble spot in serverless.
When you need to see what’s happening inside your code, Lumigo automatically generates traces that show you exactly what’s going on:
- Third-party APIs (e.g., Twilio, Stripe) and managed services (e.g., S3, API Gateway)
- Flow of requests
- Parameters entering and leaving every service
- Gathering of logs
This lets you identify performance bottlenecks and then dig right in with built-in visual debugging tools that offer the granularity required to effortlessly find and fix problems fast in your Lambda environment.
When your customers are delivering real people, goods, and services in real vehicles in the real world—and collecting millions of data points at any given time—there’s not much margin for error. Your platform has to be able to keep up with handling all that data in real time, with the highest possible availability and lowest possible latency.
Lumigo lets you handle all the data your telematics and other devices can dish out. Get started fast by trying out our free tier and see how easy it is to enjoy all that AWS Lambda serverless has to offer when it comes to fleet management.