Exit code 143 in Kubernetes signals that a container was terminated by a SIGTERM signal (143 = 128 + 15, where 15 is SIGTERM's signal number). This code indicates that the process within the container was stopped due to an external request, commonly initiated by Kubernetes for operational reasons such as pod deletion or scaling down.
The SIGTERM signal is a way for an operating system to notify a process that it should finish its current tasks and terminate gracefully, allowing for a clean shutdown of applications. Understanding this exit code is crucial to understanding the operational nuances of working with Kubernetes, as it helps diagnose why a container stopped running and may highlight deeper issues within deployments.
Exit code 143 distinguishes intentional stops from unexpected errors, providing insights into the application’s behavior and the orchestration’s management actions.
This is part of a series of articles about Kubernetes troubleshooting.
There are several contexts that can result in a SIGTERM signal.
In Kubernetes, scaling operations often lead to the generation of exit code 143. When a deployment is scaled down, Kubernetes must terminate excess pods to match the desired state. This action is initiated by sending a SIGTERM signal to the containers running in the scaled-down pods.
The signal serves as a notice for the applications to terminate gracefully, allowing them to complete current tasks and save necessary data before shutting down. This process ensures that scaling operations do not disrupt the application’s functionality or cause data loss.
When a node in the cluster runs out of resources like memory or CPU, the Kubernetes scheduler may decide to evict pods to stabilize the node’s condition. This eviction process involves sending a SIGTERM signal to the containers, signaling them to shut down gracefully to free up resources.
When a pod is manually deleted or replaced during a rolling update, Kubernetes sends a SIGTERM signal to the containers as part of the process to gracefully remove the pod from service. This allows the application to terminate connections, complete in-progress tasks, and ensure a smooth transition to new pods without losing critical information or disrupting the service.
The SIGKILL signal is another way to terminate processes, but it does not allow processes to shut down gracefully. Here are the key reasons Kubernetes sends SIGTERM first, rather than SIGKILL, when terminating processes:
SIGTERM gives processes the opportunity to clean up their data and perform necessary shutdown procedures before termination. This approach is vital for maintaining data integrity and ensuring that applications can resume operations smoothly after a restart. It provides a controlled environment for the application to close connections, save state, and release resources, minimizing the risk of data corruption or loss.
The safety provided by SIGTERM helps prevent environmental corruption. By allowing applications to shut down gracefully, Kubernetes minimizes the risk of leaving the system in an inconsistent state. This is crucial for complex applications that manage significant amounts of data or maintain persistent connections. The orderly shutdown process facilitated by SIGTERM protects against data loss and ensures that resources are cleanly released.
If a pod does not respond to the SIGTERM signal within a specified grace period, Kubernetes escalates the termination by sending a SIGKILL signal. This ensures that unresponsive processes are forcefully terminated, maintaining the cluster’s stability and performance. Even if a graceful shutdown isn’t possible, stuck or unresponsive processes do not hinder the cluster’s functionality.
Here is an overview of the Exit Code 143 process in Kubernetes.
When a pod in Kubernetes is marked for termination, its status is set to Terminating. This status indicates that the pod has received a shutdown request, typically through a SIGTERM signal. During this phase, the pod remains in the cluster but stops receiving new requests.
This status allows the system to manage resources effectively while providing the pod the opportunity to close gracefully, ensuring that ongoing tasks are completed and resources are properly released.
The preStop hook in Kubernetes provides a way to execute custom commands or scripts before a pod is terminated. This hook is triggered before the SIGTERM signal is sent to the container; SIGTERM follows once the hook completes (or the grace period expires). This allows cleanup or preparatory actions to be defined that run immediately before the shutdown begins.
This mechanism is crucial for complex applications that require specific steps to ensure data integrity and application state before termination.
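A preStop hook is declared on the container in the pod spec. The manifest below is a minimal sketch; the pod name, image, and the sleep-based drain step are placeholders for a real application's cleanup command:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web            # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.25  # placeholder image
    lifecycle:
      preStop:
        exec:
          # Give the load balancer time to drain in-flight requests
          # before SIGTERM is delivered to the main process.
          command: ["/bin/sh", "-c", "sleep 10"]
```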
The SIGTERM signal is the initial step in the graceful termination process, telling the processes within the pod to shut down. It marks the beginning of the termination procedure, giving processes the chance to conclude their operations in an orderly fashion.
Following the SIGTERM signal, Kubernetes provides a grace period, a predefined amount of time for processes to shut down gracefully. If the processes do not terminate within this period, Kubernetes escalates to sending a SIGKILL signal, forcibly terminating the processes.
The grace period is configurable, allowing developers to specify the time needed for their applications to shut down properly, depending on their complexity and shutdown requirements.
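The grace period is set per pod with the `terminationGracePeriodSeconds` field (Kubernetes defaults to 30 seconds). A minimal sketch, with the pod name and image as placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker                   # placeholder name
spec:
  # Time allowed between SIGTERM and SIGKILL; the default is 30 seconds.
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    image: example/worker:1.0    # placeholder image
```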
If a process does not terminate after the SIGTERM signal and the grace period elapses, Kubernetes sends a SIGKILL signal to the pod. This signal forcefully stops the process, ensuring that resources are freed and the pod is removed from the cluster.
While SIGKILL effectively guarantees the termination of the pod, it bypasses the graceful shutdown process. It is a “last resort” in the Kubernetes termination process.
Exit Code 143 is not an error—it can result from healthy Kubernetes operations, such as normal scaling operations. However, sometimes containers return Exit Code 143 in an unexpected manner. Here are a few ways to reduce unwanted Exit Code 143 responses.
Logs are useful for diagnosing and understanding termination events. By logging signals like SIGTERM, developers can gain insights into the termination process, including timing and context. This information is invaluable for troubleshooting issues related to graceful shutdowns and for improving applications’ resilience against disruptions.
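One way to do this is to log the signal by name inside the handler and then exit with the conventional code of 128 plus the signal number, which yields 143 for SIGTERM. A minimal Python sketch; the logger name and message format are illustrative:

```python
import logging
import os
import signal
import sys

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("shutdown")

def log_and_exit(signum, frame):
    # Record which signal arrived, so termination events can later be
    # correlated with Kubernetes events (eviction, scale-down, rollout).
    logger.info("received %s, shutting down gracefully",
                signal.Signals(signum).name)
    # ... flush logs/metrics, close connections ...
    sys.exit(128 + signum)  # 128 + 15 = 143, the code Kubernetes reports

signal.signal(signal.SIGTERM, log_and_exit)
```

Log lines like this, timestamped and correlated with cluster events, make it much easier to tell a routine scale-down from an unexpected eviction.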
Running a single application per container is a best practice that simplifies management and enhances the predictability of container behavior upon receiving a SIGTERM signal. This approach ensures that each container has a focused purpose, making it easier to manage lifecycle events and signal handling. It facilitates cleaner shutdowns, restarts, and more straightforward resource allocation and scaling.
Properly configuring pod priorities, along with resource requests and limits, is essential in preventing unnecessary shutdowns due to resource contention or eviction. Specifying these configurations can ensure that critical applications have sufficient resources and are less likely to be evicted in favor of lower-priority workloads. This practice helps maintain application availability and performance, even under resource constraints.
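For example, a PriorityClass can be combined with explicit requests and limits on the pod. All names and values below are illustrative and should be tuned to the workload:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-service   # placeholder name
value: 100000              # higher value = evicted later
---
apiVersion: v1
kind: Pod
metadata:
  name: api                # placeholder name
spec:
  priorityClassName: critical-service
  containers:
  - name: app
    image: example/api:1.0 # placeholder image
    resources:
      requests:            # what the scheduler reserves for the pod
        cpu: "250m"
        memory: "256Mi"
      limits:              # hard caps enforced at runtime
        cpu: "500m"
        memory: "512Mi"
```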
Autoscaling in Kubernetes allows for automatic adjustment of resources based on demand, reducing the likelihood of exit code 143 due to resource shortages or overallocation. Autoscaling can adjust the number of pod replicas in response to current load, ensuring that applications have the resources to perform optimally while avoiding unnecessary resource consumption.
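A HorizontalPodAutoscaler can keep replica counts in line with demand. The deployment name, replica bounds, and CPU threshold below are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # placeholder deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out above 70% average CPU
```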
Kubernetes Troubleshooting with Lumigo
Lumigo is a troubleshooting platform purpose-built for microservice-based applications. Developers using Kubernetes to orchestrate their containerized applications can use Lumigo to monitor, trace, and troubleshoot issues quickly. Deployed with zero code changes and automated in one click, Lumigo stitches every interaction between microservices and managed services into end-to-end stack traces. These traces, served alongside request payload data, give more complete visibility into container environments.
Using Lumigo gives you a more comprehensive view of your deployments.
To try Lumigo for Kubernetes, check out our Kubernetes operator on GitHub.