
Context Propagation

If we follow the path a span takes from creation until it reaches its destination, the flow is: instrumentation → processing (Span Processor) → exporter → backend.

The instrumentation itself doesn't affect the application much: its CPU and memory cost is very low. Everything happens quickly and there is little to optimize here, beyond avoiding the creation of unwanted spans that only generate cost, as mentioned before.

At the other end we have the backend, an external service with completely separate resources that doesn't interfere with our application at all.

The exporter is responsible for transforming spans into the appropriate format and coordinating their delivery to the backend, so it owns the process's network activity. It can work with individual spans or in batches.

The Span Processor is a key component in the OpenTelemetry pipeline, acting as an intermediary between instrumentation (where spans are created) and the exporter (which sends data to the backend). It manages the lifecycle of spans and works as an intelligent buffer between data generation and data delivery, which is crucial for performance and reliability. It operates entirely in memory (no disk persistence) and maintains a queue of spans. The size of this queue is configurable; when the queue is full, new spans are discarded.
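The queue-full behavior can be sketched as a bounded in-memory queue. This is a toy model to illustrate the discard policy, not the actual SDK code; `ToySpanQueue` and its method names are invented for illustration.

```python
from collections import deque


class ToySpanQueue:
    """Toy model of the processor's bounded in-memory queue (not SDK code)."""

    def __init__(self, max_queue_size: int):
        self.max_queue_size = max_queue_size
        self.queue = deque()
        self.dropped = 0

    def on_end(self, span: str) -> bool:
        # When the queue is full, the span is dropped: there is no disk fallback.
        if len(self.queue) >= self.max_queue_size:
            self.dropped += 1
            return False
        self.queue.append(span)
        return True


q = ToySpanQueue(max_queue_size=2)
for name in ["span-1", "span-2", "span-3"]:
    q.on_end(name)
print(q.dropped)  # 1: the third span was discarded
```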

So we have:

  • Instrumentation
    • Very low impact, as they are basically lightweight interceptions
    • Uses minimal resources since it only collects data
    • The main overhead comes from creating spans and their attributes
    • Generally doesn't block the main application flow
  • Processing (SpanProcessor)
    • Moderate impact, as it deals with spans in memory
    • BatchSpanProcessor can consume more memory due to the buffer
    • There may be contention if the queue gets full
    • Processing is asynchronous, so it doesn't block the main thread
  • Exporter
    • Moderate to high impact depending on configuration
    • Sending data over the network is the "heaviest" operation
    • There may be contention if the backend is slow
    • Network problems can affect span processing
  • Backend
    • Minimal impact on the application, as it's a separate system
    • The only interference would be if it starts rejecting data
    • Backend latency issues can affect the exporter
    • Generally doesn't affect application performance directly

For processing there are two types of Span Processor:

  • SimpleSpanProcessor: maintains no queue; each span is processed and exported immediately. It uses less memory but is less efficient.
  • BatchSpanProcessor: maintains a buffer and a queue for processing; this is the one we should use in production.

Knowing this, we can see that if the application stops, everything still held in memory is lost. Memory consumption also needs to be monitored: if spans cannot be delivered to the backend, memory can grow rapidly.

Simple vs Batch

BatchSpanProcessor is significantly more efficient than SimpleSpanProcessor, especially in high-load scenarios.

| Characteristic | SimpleSpanProcessor | BatchSpanProcessor | Observations |
| --- | --- | --- | --- |
| Processing pattern | Processes and exports each span individually | Groups spans into batches before exporting | BatchSpanProcessor significantly reduces network overhead |
| Throughput | ~100-1,000 spans/second | ~10,000-100,000 spans/second | BatchSpanProcessor can be up to 100x more efficient |
| Network calls | One call per span | One call per batch (e.g., 512 spans) | Up to 99.8% fewer network calls with BatchSpanProcessor |
| Latency | ~100 ms per span (assuming 100 ms network latency) | ~0.2 ms per span (100 ms / 512 spans) | BatchSpanProcessor amortizes the latency cost across all spans in a batch |
| CPU usage | High (per-span overhead) | Low (shared overhead) | Up to 90% less CPU overhead with BatchSpanProcessor |
| Memory usage | Low (no buffer) | Moderate (configurable buffer) | BatchSpanProcessor uses more memory, bounded via maxQueueSize |
| Back pressure | None | Yes (via maxQueueSize) | BatchSpanProcessor can control system overload |
| Use case | Development and debugging | Production | SimpleSpanProcessor is not recommended for production |
| Reliability | Lower (more susceptible to network failures) | Higher (better fault tolerance) | BatchSpanProcessor has retry and buffering |
| Configurability | Minimal | High (batch size, queue size, delay, etc.) | BatchSpanProcessor offers more control |
| Data loss on crash | Only the span currently being processed | Potentially the entire in-memory buffer | Neither processor persists spans to disk |
| Debugging | Easier (straightforward, synchronous behavior) | Harder (asynchronous, buffered behavior) | |

Check which type of processor the SDK you created uses and, if necessary, change it to batch. It is also worth configuring the variables that define its limits.

We can configure these limits via environment variables or in code.

  • OTEL_BSP_MAX_EXPORT_BATCH_SIZE (default 512): the maximum number of spans sent to the backend in a single request. Larger batches mean fewer requests, but transfer time grows with batch size. With the default, up to 512 spans are sent at a time.
  • OTEL_BSP_MAX_QUEUE_SIZE (default 2048): the maximum number of spans held in the queue. Once 2048 spans are queued, new spans are discarded.
  • OTEL_BSP_SCHEDULE_DELAY (default 5000 ms, i.e., 5 s): every 5 seconds the processor tries to export the accumulated spans. If the buffer is empty, nothing is sent; otherwise it sends up to the limit defined in OTEL_BSP_MAX_EXPORT_BATCH_SIZE.
  • OTEL_BSP_EXPORT_TIMEOUT (default 30000 ms, i.e., 30 s): the exporter sends the spans to the backend and waits up to this many milliseconds for a response. If the backend doesn't respond in time, the operation is considered failed and the spans may be discarded or retried, depending on configuration.