OpenTelemetry Operator: Automating Observability in Kubernetes

What Is a Kubernetes Operator? 

The OpenTelemetry Operator is a type of Kubernetes Operator. Kubernetes is a popular open source container orchestration platform.

A Kubernetes Operator is an application-specific controller that extends the functionality of the Kubernetes API. It does this by creating, configuring, and managing instances of complex stateful applications on behalf of a Kubernetes user.

Operators follow Kubernetes principles, notably the control loop. This means that they watch the state of your cluster, compare it with the desired state, make or request changes, and report the resulting status back to the Kubernetes API. Kubernetes Operators can manage complex stateful applications, encode human operational knowledge, and automate common tasks.

What Is the OpenTelemetry Kubernetes Operator? 

The OpenTelemetry Operator is a tool used within the Kubernetes ecosystem to manage the deployment, configuration, and updates of the OpenTelemetry Collector.

OpenTelemetry is an open source observability framework for generating, collecting, and exporting telemetry data: traces, metrics, and logs. The OpenTelemetry Operator streamlines the deployment of the OpenTelemetry Collector within a Kubernetes cluster. It makes it easier to deploy, manage, and maintain Collector instances, which gives you more visibility into your Kubernetes environment and can improve the efficiency of your development process.

Features of the OpenTelemetry Operator 

Configuration Management

The OpenTelemetry Operator allows for the creation and management of OpenTelemetry Collector configurations directly from within your Kubernetes environment.

You declare the Collector configuration once, in a custom resource, and the Operator applies it to every Collector instance it manages. When the configuration changes, the Operator rolls the change out automatically, without manual intervention.

This is particularly useful in large-scale environments, where keeping many individual instances correctly and consistently configured by hand is difficult.

Lifecycle Management

Lifecycle management covers the entire lifecycle of your OpenTelemetry Collector instances, from deployment to decommissioning, and reduces the need for manual intervention.

The Operator rolls out updates automatically, reconciles Collector workloads back to their declared state after failures, and supports running multiple replicas for high availability.
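
As a rough illustration of how lifecycle preferences might be declared, the sketch below sets a replica count and an upgrade strategy on an OpenTelemetryCollector resource. The `replicas` and `upgradeStrategy` fields reflect our understanding of the v1alpha1 API, so check them against the CRD version installed in your cluster before relying on them.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  # Run two Collector pods so one can fail without losing the service.
  replicas: 2
  # "automatic" lets the Operator upgrade this Collector when the Operator
  # itself is upgraded; "none" opts out (assumed values, verify in your CRD).
  upgradeStrategy: automatic
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```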

Scalability and Reliability

The OpenTelemetry Operator makes it straightforward to scale Collector instances to meet the demands of your applications. You can set a fixed number of replicas or let the Operator scale them up or down automatically based on actual load.

For reliability, the Operator runs Collectors as standard Kubernetes workloads, so failed instances are restarted automatically, and it keeps the configured number of replicas running to maintain a high-availability setup.
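
A minimal sketch of load-based scaling, assuming the `autoscaler` block of the v1alpha1 OpenTelemetryCollector API (field names may differ between Operator versions). The replica bounds, CPU target, and resource requests are illustrative values only. CPU requests are included because CPU-utilization autoscaling in Kubernetes is computed relative to them.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  mode: deployment
  # Assumed autoscaler fields; the Operator is expected to create a
  # HorizontalPodAutoscaler from them.
  autoscaler:
    minReplicas: 2
    maxReplicas: 5
    targetCPUUtilization: 70   # percent of requested CPU (illustrative)
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    processors:
      batch:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]
```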

How the OpenTelemetry Operator and Collector Work 

The OpenTelemetry Operator and Collector work together to make your Kubernetes environment more observable. The Operator deploys and manages instances of the Collector, which collects and exports telemetry data from your applications.

The process begins with the OpenTelemetry Operator, which deploys instances of the OpenTelemetry Collector based on the specified configuration. The Operator ensures that the Collector instances are consistently configured and highly available.

Once deployed, the Collector instances begin collecting telemetry data from your applications. This data includes metrics, traces, and logs, which are then exported to various backends for analysis and visualization.
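
In Collector configuration terms, each signal type gets its own pipeline. The fragment below is a sketch of a Collector configuration that routes traces, metrics, and logs from a single OTLP receiver to the console, using the same `logging` exporter as the tutorial configuration in this guide. Recent Collector releases replace `logging` with `debug`, so adjust the exporter name to match your Collector version.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
processors:
  batch:
exporters:
  logging:   # stand-in for a real backend exporter
service:
  pipelines:
    # One pipeline per signal; all three share the receiver and exporter.
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging]
```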

The OpenTelemetry Operator monitors the state of the Collector instances, ensuring they are functioning properly. If a Collector instance fails, the Operator automatically restarts it. If a configuration change is made, the Operator rolls out the update to all Collector instances.

Setting Up OpenTelemetry Operator and Managed Collector 

Create a Kubernetes Cluster

Before deploying the OpenTelemetry Operator, we need a Kubernetes (K8s) cluster where the operator will reside. The setup process varies depending on your environment: it could be a local Minikube for development purposes or a production-grade cluster on a cloud provider like Google Cloud, AWS, or Azure.

For this tutorial, let’s assume we are setting up a new cluster on Google Cloud. If you have the Google Cloud SDK and gcloud command line installed, you can create a new cluster using the gcloud container clusters create command.

gcloud container clusters create opentelemetry --zone us-central1-a

After running the above command, it will take a few minutes for your cluster to be ready. Once done, you can confirm if your cluster is ready by running kubectl get nodes. If everything is set up correctly, you should see a list of nodes in your cluster.

Deploying Cert-Manager

Once your Kubernetes cluster is ready, the next step is to deploy cert-manager, a Kubernetes-native certificate management controller. It issues certificates and keeps them valid and up to date. By default, the OpenTelemetry Operator relies on cert-manager to provision the TLS certificates for its admission webhooks.

To install cert-manager, we will use Helm, a package manager for Kubernetes. To install Helm, see the instructions on the project website.

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.5.3 --set installCRDs=true

After you run the Helm command, it will take a few minutes for all the cert-manager components to be installed on your cluster. You can check the status of your cert-manager deployment by running kubectl get pods --namespace cert-manager.

Installing the OpenTelemetry Operator

With the Kubernetes cluster and cert-manager ready, we can now deploy the OpenTelemetry Operator. The operator is responsible for managing the lifecycle of the OpenTelemetry components in the cluster.

To install the OpenTelemetry Operator, we will again use Helm.

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator --namespace opentelemetry --create-namespace

Once the operator is installed, you can verify its status by running this command: kubectl get pods --namespace opentelemetry.
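
Some newer versions of the operator chart expect the Collector image to be set explicitly. If the install fails for that reason, a values file along the following lines may help. This is a sketch: the value names are taken from our reading of the chart's documentation and the file name operator-values.yaml is hypothetical, so confirm both with helm show values open-telemetry/opentelemetry-operator.

```yaml
# operator-values.yaml (hypothetical file name)
manager:
  collectorImage:
    # Collector distribution the Operator deploys by default (assumed value name).
    repository: otel/opentelemetry-collector-k8s
admissionWebhooks:
  certManager:
    # Use the cert-manager installation from the previous step for webhook certificates.
    enabled: true
```

To use it, pass the file to the same install command with -f operator-values.yaml.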

Configuring and Deploying the Collector

The OpenTelemetry Collector is a component that receives, processes, and exports telemetry data. It’s part of the OpenTelemetry architecture and is typically deployed as an agent or a standalone service.

To deploy the collector, we need to create a configuration file that defines the data receivers, processors, and exporters. Here’s an example of a simple configuration:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]

Save the above contents to a file named collector.yaml, then apply it to your Kubernetes cluster using kubectl apply -f collector.yaml.
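
The example above uses the default deployment mode, which runs the Collector as a standalone service. To run it as an agent on every node instead, the `mode` field can be changed, as in the sketch below. The resource name otel-agent is hypothetical, and the accepted values of `mode` (deployment, daemonset, statefulset, sidecar) should be checked against your Operator version.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent   # hypothetical name
spec:
  # One Collector pod per node, managed as a DaemonSet.
  mode: daemonset
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging]
```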

Setting Up Auto-Instrumentation Resources

Auto-instrumentation is a feature that automatically captures telemetry data from your applications, without requiring you to manually instrument your code. Before proceeding, make sure the relevant CRDs are installed on your cluster.

To set up auto-instrumentation, we need to create an Instrumentation resource in our Kubernetes cluster. Here’s an example:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: otel-instrumentation
spec:
  exporter:
    # The Operator exposes a Collector named otel-collector through a
    # Service named otel-collector-collector.
    endpoint: "http://otel-collector-collector:4317"
  propagators:
    - tracecontext
    - baggage

Save the above contents to a file named instrumentation.yaml, then apply it to your Kubernetes cluster using kubectl apply -f instrumentation.yaml.
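
Creating the instrumentation resource does not change running workloads by itself. Pods opt in through an annotation on their pod template. The sketch below assumes a Java service; the Deployment name, labels, and image are placeholders, and other languages use their own annotation suffix (for example inject-python or inject-nodejs). The workload should live in the same namespace as the instrumentation resource.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app   # hypothetical application
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # Ask the Operator to inject the Java auto-instrumentation agent.
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: app
          image: example.com/demo-app:1.0   # placeholder image
```

Because injection happens when pods are created, existing pods need to be restarted after the annotation is added.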

When you are finished, don’t forget to clean up by deleting the Google Cloud cluster. You can use the following command:

gcloud container clusters delete opentelemetry --zone us-central1-a

Microservices Monitoring with Lumigo

Lumigo is a cloud native observability tool that provides automated distributed tracing of microservice applications and supports OpenTelemetry for reporting tracing data and resources. With Lumigo, users can:

  • See the end-to-end path of a transaction and full system map of applications
  • Monitor and debug third-party APIs and managed services (e.g., Amazon DynamoDB, Twilio, Stripe)
  • Go from alert to root cause analysis in one click
  • Understand system behavior and explore performance and cost issues 
  • Group services into business contexts

Get started with a free trial of Lumigo for your microservice applications
