Zipkin is a distributed tracing system for collecting and visualizing trace data within and between services. Initially developed at Twitter and based on Google’s Dapper paper, Zipkin is now available as an open source tool.
You can use Zipkin to perform forensic investigations without recreating application flows from the log data. It has a Java-based architecture with four components: a collector, a data store, a query API, and a web user interface.
Jaeger is a distributed software tracing system that monitors and troubleshoots microservices-based systems. Initially developed by Uber Technologies, Jaeger is now available as an open source tool.
You can use Jaeger for distributed context propagation and distributed transaction monitoring. It can help you run root cause analysis and service dependency analysis, as well as optimize performance or latency.
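To make distributed context propagation more concrete, here is a minimal sketch of how a trace context might travel between services in an HTTP header. It uses the `uber-trace-id` header layout from Jaeger's native propagation format (`trace-id:span-id:parent-span-id:flags`). The helper names (`new_context`, `inject`, `extract`, `child_of`) are hypothetical and exist only for illustration; real applications would rely on a tracing client library rather than hand-rolled code like this.

```python
import secrets

HEADER = "uber-trace-id"  # header name used by Jaeger's native propagation format


def new_context(sampled=True):
    """Create a root span context with random hex IDs (illustrative helper)."""
    return {
        "trace_id": secrets.token_hex(16),  # 128-bit trace ID
        "span_id": secrets.token_hex(8),    # 64-bit span ID
        "parent_id": "0",                   # a root span has no parent
        "flags": 1 if sampled else 0,       # lowest bit marks the trace as sampled
    }


def inject(ctx, headers):
    """Write the context into the outgoing request headers."""
    headers[HEADER] = "{trace_id}:{span_id}:{parent_id}:{flags:x}".format(**ctx)
    return headers


def extract(headers):
    """Read the caller's context from incoming request headers, if present."""
    value = headers.get(HEADER)
    if value is None:
        return None
    trace_id, span_id, parent_id, flags = value.split(":")
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": parent_id,
        "flags": int(flags, 16),
    }


def child_of(parent):
    """Start a child span: same trace ID, new span ID, parent set to the caller."""
    return {
        "trace_id": parent["trace_id"],
        "span_id": secrets.token_hex(8),
        "parent_id": parent["span_id"],
        "flags": parent["flags"],
    }
```

The calling service would run `inject(ctx, headers)` before sending a request, and the receiving service would run `child_of(extract(headers))` to continue the same trace. Because the trace ID is carried unchanged across hops, the backend can later stitch the spans into one transaction.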
This is part of a series of articles about Zipkin.
Zipkin and Jaeger provide different support and framework options. Zipkin supports popular frameworks in its official clients and leaves smaller libraries, such as database drivers, for the community to instrument. Jaeger uses OpenTracing libraries for instrumentation, allowing you to use various contributed projects.
Zipkin and Jaeger both support drop-in instrumentation for major frameworks such as Python’s Django, Express.js in Node.js, and Java’s Spring. Jaeger also uses the opentracing-contrib project to provide instrumentation for other libraries, such as the AWS SDK, gRPC, and Thrift, in various languages.
Key aspects employed by both Zipkin and Jaeger include:
Zipkin and Jaeger differ in how they package and deploy components. Here are key differences:
Jaeger is a Cloud Native Computing Foundation (CNCF) project, and Kubernetes is its preferred deployment platform. It offers official Helm charts and Kubernetes templates in the incubator for deploying its collector, agent, UI, and query API. Additionally, it works with service mesh and proxy tools, such as Istio and Envoy, to make call tracing across containers easier.
Zipkin provides a single process that includes all components (a collector, data store, query API, and user interface), which you can run as a Java program or from a Docker image. This makes deployment easier, but its documentation is limited to a README; it does not offer full deployment documentation as Jaeger does.
Zipkin and Jaeger are both systems that collect trace data reported by instrumented applications. Both let you keep that data in an external data store such as Elasticsearch or Cassandra, which you run and maintain yourself. However, this is where their architectural similarities end.
Zipkin architecture
Zipkin was written in Java, supporting Java 6 and later versions. It uses Apache Thrift, a binary communication protocol, and can leverage both Elasticsearch and Cassandra as a scalable back end. Zipkin is a unified process that includes all components, making deployment easier. It delivers data to collectors through HTTP, Kafka, or Scribe.
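As an illustration of the HTTP transport, the sketch below builds spans in Zipkin's v2 JSON format and prepares a POST to the collector's `/api/v2/spans` endpoint. The URL assumes a Zipkin server running locally on its default port, and the helper names (`make_span`, `build_request`) and the example service and span names are hypothetical. The request is only constructed here, not sent.

```python
import json
import secrets
import time
import urllib.request

# Assumption: a Zipkin server listening locally on its default port.
ZIPKIN_URL = "http://localhost:9411/api/v2/spans"


def make_span(service, name, duration_us, trace_id=None, parent_id=None):
    """Build one span as a dict in Zipkin's v2 JSON format."""
    span = {
        "traceId": trace_id or secrets.token_hex(16),  # 32 hex characters
        "id": secrets.token_hex(8),                    # 16 hex characters
        "name": name,
        "timestamp": int(time.time() * 1_000_000),     # start time, epoch microseconds
        "duration": duration_us,                       # microseconds
        "localEndpoint": {"serviceName": service},
    }
    if parent_id:
        span["parentId"] = parent_id
    return span


def build_request(spans, url=ZIPKIN_URL):
    """Prepare (but do not send) the HTTP POST carrying a JSON list of spans."""
    body = json.dumps(spans).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

With a server running, `urllib.request.urlopen(build_request([span]))` would deliver the spans. In practice an instrumentation library handles this reporting step for you, and the Kafka and Scribe transports carry equivalent span data over a different channel.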
Jaeger architecture
Jaeger is written in Go, so it does not require dependencies to be installed on the host and avoids the overhead of an interpreter or language virtual machine (VM). Its architecture is similar to Zipkin’s, with a query service, web UI, collectors, and clients. However, it is not a single process, and it deploys an agent on each host to aggregate data locally.
Jaeger’s agent receives data over the User Datagram Protocol (UDP), batches it, and sends it to a collector, which stores it in Cassandra or Elasticsearch. The query service accesses the data store directly and passes the information to the web UI.
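The batching step can be pictured with a toy model. The class below is not Jaeger's agent code; it is a minimal sketch of the general pattern, in which spans arriving from local clients are buffered and forwarded to a collector in groups. The `BatchingAgent` name, the `send_batch` callback, and the `max_batch_size` threshold are all assumptions made for illustration.

```python
class BatchingAgent:
    """Toy stand-in for a per-host agent: buffers spans and forwards them in batches."""

    def __init__(self, send_batch, max_batch_size=100):
        self.send_batch = send_batch          # callable that delivers a list of spans
        self.max_batch_size = max_batch_size  # hypothetical flush threshold
        self._buffer = []

    def receive(self, span):
        """Called for each span arriving from a local client."""
        self._buffer.append(span)
        if len(self._buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        """Forward whatever is buffered; a real agent would also flush on a timer."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.send_batch(batch)
```

For example, `BatchingAgent(collector_client.submit, max_batch_size=50)` (with a hypothetical `collector_client`) would turn many small client writes into fewer, larger calls to the collector, which is the main benefit of aggregating locally on each host.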
By default, Jaeger uses probabilistic sampling to sample 0.1% of the traces passing through each client. It also builds on adaptive sampling, which uses additional context to improve sampling decisions. You can change the percentage of traces sampled through configuration.
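The idea behind probabilistic sampling can be sketched in a few lines. This is a simplified illustration of the general technique, not Jaeger's implementation: the sampler turns the rate into a boundary in the trace ID space and keeps only traces whose ID falls below it. The class name, the 64-bit ID assumption, and the `observed_rate` helper are hypothetical.

```python
import random

MAX_ID = 2 ** 64  # assumption: the comparison uses 64 bits of the trace ID


class ProbabilisticSampler:
    """Keeps roughly `rate` of all traces, decided from the trace ID alone."""

    def __init__(self, rate=0.001):  # 0.001 corresponds to 0.1% of traces
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate
        self.boundary = int(rate * MAX_ID)

    def is_sampled(self, trace_id):
        """Deterministic per trace ID, so every service reaches the same decision."""
        return (trace_id % MAX_ID) < self.boundary


def observed_rate(sampler, n=100_000, seed=0):
    """Estimate the fraction of random trace IDs the sampler keeps."""
    rng = random.Random(seed)
    kept = sum(sampler.is_sampled(rng.getrandbits(64)) for _ in range(n))
    return kept / n
```

Calling `observed_rate(ProbabilisticSampler(0.001))` should return a value close to 0.001. Deciding from the trace ID rather than a fresh random draw means a trace is either recorded in full or skipped in full, instead of ending up with gaps.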
Both Zipkin and Jaeger have limitations you should be aware of.
Zipkin limitations
Jaeger limitations
Both Zipkin and Jaeger are excellent choices for collecting and managing distributed tracing data, and both have similar functionality. Both of them offer:
Here are some guidelines for choosing the best tool for your project:
Zipkin is a more mature tool, with a bigger and more active community. It is used in a wide range of industries and is well suited for the enterprise IT world, which primarily uses Java. Choose Zipkin if you want to go with the mainstream, widely adopted solution.
Jaeger is a newer tool with a smaller community, but it is backed by the CNCF, which instills trust in the project. Its distributed architecture provides higher speed, flexibility, and scalability. It also supports tracing in more languages. However, it can be more complex to use and, as a less mature project, presents some risk for enterprise environments. Choose Jaeger if you are looking for a cutting-edge solution and are willing to accept a few rough edges.
Lumigo is a cloud native observability tool, purpose-built to navigate the complexities of microservices. Through automated distributed tracing, Lumigo stitches together the many components of a containerized application and tracks every service in a request. When an error or failure occurs, you see not only the impacted service, but the entire request in one visual map, so you can easily understand the root cause, limit impact, and prevent future failures.
With deep debugging data into applications and infrastructure, developers have all the information they need to monitor and troubleshoot their containers without any of the manual work: