
How Tracing Works

It's really worth understanding how OTel manages to build traces.

Something happens in one process and something else happens in another process, so how does OTel tie the two together into a single trace?

We have the following scenario: app A and app B both instrumented with OTel and A makes an HTTP request to B.


The instrumentation sends data to Jaeger, which will be our OTel backend.


When these services report trace data, they need to know a few things, the main one being the Trace ID.

The Trace ID is shared by every span in a trace, so the backend knows which spans belong together, how they relate to each other, and how much time each one took.

Each span records the ID of its parent span; a span without a parent ID is the root of the trace.

Basically, when application B is reporting to Jaeger, it needs to report:

  • Its Span ID
  • Parent Span ID (which was created in app A)
  • Trace ID (which was created in app A)
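To make those three fields concrete, here is a minimal sketch of the two span records using only the Python standard library. The `Span` class, the helper functions, and the span names are hypothetical and only for illustration; a real OTel SDK generates and reports these values for you. The ID sizes (16 bytes for a Trace ID, 8 bytes for a Span ID) are the ones OTel uses.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span record (not the OTel SDK class)."""
    name: str
    trace_id: str                  # shared by every span in the trace
    span_id: str                   # unique to this span
    parent_span_id: Optional[str]  # None marks the root span


def new_trace_id() -> str:
    return secrets.token_hex(16)   # 16 random bytes -> 32 hex characters


def new_span_id() -> str:
    return secrets.token_hex(8)    # 8 random bytes -> 16 hex characters


# App A starts the trace: a fresh Trace ID and no parent.
span_a = Span("app-a: call app B", new_trace_id(), new_span_id(), None)

# App B reuses A's Trace ID and points at A's span as its parent.
span_b = Span("app-b: handle request", span_a.trace_id, new_span_id(), span_a.span_id)

print(span_a)
print(span_b)
```

App B can only fill in `trace_id` and `parent_span_id` if app A hands those values over with the request.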

This means that when app A sends the HTTP call to app B, it needs to provide the telemetry context.

This is done by injecting the context into the headers of the HTTP request. App B then extracts the context from those headers. The instrumentation takes care of both the injection and the extraction automatically.
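As a rough idea of what travels in the header, the sketch below injects and extracts a context by hand. It assumes the W3C Trace Context `traceparent` header (`version-traceid-parentid-flags`), which is the format OTel SDKs commonly propagate by default. The `inject` and `extract` functions are hypothetical stand-ins for what the instrumentation does, and a plain dict stands in for real HTTP headers, which are case-insensitive.

```python
import re
from typing import Dict, Optional, Tuple

# version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(headers: Dict[str, str], trace_id: str, span_id: str,
           sampled: bool = True) -> None:
    """App A side: write the current context into the outgoing headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """App B side: read (trace_id, parent_span_id), or None if absent/invalid."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    return match.group(1), match.group(2)


# App A: example IDs, fixed here so the output is reproducible.
outgoing: Dict[str, str] = {}
inject(outgoing, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
print(outgoing["traceparent"])
# -> 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

# App B: the extracted span ID becomes the parent of B's new span.
trace_id, parent_span_id = extract(outgoing)
print(trace_id, parent_span_id)
```

If the header is missing or malformed, `extract` returns `None`, and app B would simply start a new trace with its own root span.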


So does OTel only support HTTP? Of course not. It supports various forms of communication and this was just an example. We could have used Kafka, SQS, RabbitMQ, gRPC, Socket.io, etc.
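The idea carries over to messaging because the context only needs somewhere to ride along with the message. The sketch below assumes message headers shaped as a list of `(key, bytes)` pairs, which is how some Kafka clients expose them. The function names are hypothetical, and the `traceparent` layout is again the W3C Trace Context one.

```python
from typing import List, Optional, Tuple

# Assumed header shape: a list of (key, bytes) pairs attached to a message.
MessageHeaders = List[Tuple[str, bytes]]


def inject_into_message(headers: MessageHeaders, trace_id: str, span_id: str) -> None:
    """Producer side: attach the context to the outgoing message."""
    value = f"00-{trace_id}-{span_id}-01"
    headers.append(("traceparent", value.encode("ascii")))


def extract_from_message(headers: MessageHeaders) -> Optional[Tuple[str, str]]:
    """Consumer side: recover (trace_id, parent_span_id) if present."""
    for key, value in headers:
        if key == "traceparent":
            parts = value.decode("ascii").split("-")
            if len(parts) == 4:
                return parts[1], parts[2]
    return None


message_headers: MessageHeaders = []
inject_into_message(message_headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
print(extract_from_message(message_headers))
```

Only the carrier changes between transports; the IDs being carried are the same.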


SpanIDs are generated automatically.

In practice, we're always creating spans, and tools like Jaeger use values such as traceID, parentId, and spanID to display them as a dependency tree.
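To see why those IDs are enough to draw the tree, here is a small sketch that rebuilds the hierarchy from a flat list of spans. The span data and the `render_tree` function are made up for illustration, and the short IDs are placeholders; this is not how Jaeger is implemented, only the general idea of linking each span to its parent.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

# (span_id, parent_span_id, name); spans may arrive in any order.
spans: List[Tuple[str, Optional[str], str]] = [
    ("b1", "a1", "app-b: handle request"),
    ("a1", None, "app-a: call app B"),
    ("b2", "b1", "app-b: query database"),
]


def render_tree(spans: List[Tuple[str, Optional[str], str]]) -> List[str]:
    # Group spans by parent ID; the None key holds the root span(s).
    children = defaultdict(list)
    for span_id, parent_id, name in spans:
        children[parent_id].append((span_id, name))

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span_id, name in children[parent_id]:
            lines.append("  " * depth + name)
            walk(span_id, depth + 1)

    walk(None, 0)
    return lines


for line in render_tree(spans):
    print(line)
# app-a: call app B
#   app-b: handle request
#     app-b: query database
```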

We can also use the context to carry more information between distributed applications. Instead of passing a value, for example a userID, explicitly on every call, we can use the idea of Baggage. If you're a backend developer, it's worth checking this out. I won't go deep into this as it's more associated with development, but know that it exists.

In OpenTelemetry, Baggage is contextual information that resides alongside the context. Baggage is a key-value store, which means it allows you to propagate any data you want along with the context.
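As a rough illustration of the key-value idea, the sketch below encodes and decodes entries in the style of the W3C `baggage` header (`key1=value1,key2=value2`, with values percent-encoded). The function names and the example entries are hypothetical, and the parsing is simplified: optional `;property` metadata is dropped. In a real application you would use the Baggage API of your OTel SDK instead.

```python
from typing import Dict
from urllib.parse import quote, unquote


def encode_baggage(entries: Dict[str, str]) -> str:
    """Serialize key-value pairs into a single header value."""
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in entries.items())


def decode_baggage(header: str) -> Dict[str, str]:
    """Parse a header value back into key-value pairs."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue
        key_value = member.split(";", 1)[0]  # ignore optional properties
        key, sep, value = key_value.partition("=")
        if sep:
            entries[key.strip()] = unquote(value.strip())
    return entries


header = encode_baggage({"userID": "42", "plan": "pro tier"})
print(header)                  # -> userID=42,plan=pro%20tier
print(decode_baggage(header))  # -> {'userID': '42', 'plan': 'pro tier'}
```

Keep in mind that baggage travels with every outgoing request, so it is best kept small and free of sensitive data.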