Context Propagation
If we analyze the flow of a span from its creation until it reaches its destination, we pass through the following stages.
The instrumentation itself doesn't affect the application much, since its CPU and memory cost is very low. Everything happens quickly and there is little to improve here, other than avoiding the creation of unwanted spans that only generate costs, as mentioned before.
At the other end is the backend, an external service with completely separate resources that doesn't interfere with our application at all.
The exporter is responsible for transforming spans into the appropriate format and sending them to the backend; it handles the network activity of the process and can work with individual spans or in batches.
The Span Processor is an important component in the OpenTelemetry flow: it acts as an intermediary between instrumentation (where spans are created) and the exporter (which sends data to the backend), managing the lifecycle of spans. It is crucial for performance and reliability in the processing of telemetry data, acting as an intelligent buffer between data generation and data sending. It operates entirely in memory (no disk persistence) and maintains a queue to store spans. The size of this queue is configurable; if the queue is full, new spans are discarded.
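The queue-and-discard behavior described above can be sketched in a few lines. This is an illustrative toy, not the real SDK implementation; the class and method names (`BoundedSpanQueue`, `on_end`) are invented for the example.

```python
from collections import deque

class BoundedSpanQueue:
    """Toy sketch of the in-memory queue a span processor maintains.
    Real SDKs implement this internally; names here are hypothetical."""

    def __init__(self, max_queue_size=2048):
        self.queue = deque()
        self.max_queue_size = max_queue_size
        self.dropped = 0  # spans discarded because the queue was full

    def on_end(self, span):
        # When the queue is full, the new span is silently dropped.
        if len(self.queue) >= self.max_queue_size:
            self.dropped += 1
            return False
        self.queue.append(span)
        return True

q = BoundedSpanQueue(max_queue_size=2)
q.on_end("span-1")
q.on_end("span-2")
q.on_end("span-3")  # queue full: discarded
print(len(q.queue), q.dropped)  # 2 1
```

The important takeaway is that the drop happens at enqueue time, without any error propagated to the application code that created the span.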
So we have:
- Instrumentation
  - Very low impact, as these are basically lightweight interceptions
  - Uses minimal resources since it only collects data
  - The main overhead comes from creating spans and their attributes
  - Generally doesn't block the main application flow
- Processing (SpanProcessor)
  - Moderate impact, as it deals with spans in memory
  - BatchSpanProcessor can consume more memory due to the buffer
  - There may be contention if the queue gets full
  - Processing is asynchronous, so it doesn't block the main thread
- Exporter
  - Moderate to high impact depending on configuration
  - Sending data over the network is the "heaviest" operation
  - There may be contention if the backend is slow
  - Network problems can affect span processing
- Backend
  - Minimal impact on the application, as it's a separate system
  - The only interference would be if it starts rejecting data
  - Backend latency issues can affect the exporter
  - Generally doesn't affect application performance directly
Regarding processing, there are two types:

- SimpleSpanProcessor: doesn't maintain a queue; each span is processed immediately. It uses less memory but is less efficient.
- BatchSpanProcessor: creates a buffer and a processing queue. This is what we should use in production.
Knowing this, we can see that if the application stops, everything still in memory is also lost, and memory consumption needs to be monitored: if spans can't be delivered to the backend, memory can grow rapidly.
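The loss-on-stop behavior can be sketched with a toy buffer. All names here (`export`, `shutdown`, the `buffer` list) are invented for the illustration; a real SDK's `shutdown()`/`force_flush()` play the role of `shutdown()` below.

```python
# Toy sketch: spans are only safe once flushed out of memory.
buffer = []

def export(batch):
    # Stand-in for the exporter's network call to the backend.
    print(f"exported {len(batch)} spans")

def shutdown():
    # A clean SDK shutdown drains the buffer before the process exits.
    if buffer:
        export(list(buffer))
        buffer.clear()

buffer.extend(["span-1", "span-2", "span-3"])
shutdown()  # clean exit: the three buffered spans are exported
# If the process is killed abruptly (SIGKILL, OOM) before shutdown()
# runs, everything in `buffer` is lost: there is no disk persistence.
```

This is why graceful shutdown hooks matter in instrumented services: they give the processor a chance to flush before the memory disappears.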
Simple vs Batch
BatchSpanProcessor is significantly more efficient than SimpleSpanProcessor, especially in high-load scenarios.
| Characteristic | SimpleSpanProcessor | BatchSpanProcessor | Observations |
|---|---|---|---|
| Processing Pattern | Processes and exports each span individually | Groups spans into batches before exporting | BatchSpanProcessor significantly reduces network overhead |
| Throughput | ~100-1000 spans/second | ~10000-100000 spans/second | BatchSpanProcessor can be 100x more efficient |
| Network Calls | One call per span | One call per batch (e.g., 512 spans) | Up to 99.8% reduction in network calls with BatchSpanProcessor |
| Latency | 100ms per span (assuming 100ms network latency) | ~0.2ms per span (100ms/512 spans) | BatchSpanProcessor distributes latency cost across all batch spans |
| CPU Usage | High (overhead per span) | Low (shared overhead) | Up to 90% reduction in CPU overhead with BatchSpanProcessor |
| Memory Usage | Low (no buffer) | Moderate (configurable buffer) | BatchSpanProcessor uses more memory but is controllable via maxQueueSize |
| Back Pressure | Doesn't have | Has (via maxQueueSize) | BatchSpanProcessor can control system overload |
| Use Case | Development and Debug | Production | SimpleSpanProcessor is not recommended for production |
| Reliability | Lower (more susceptible to network failures) | Higher (better fault tolerance) | BatchSpanProcessor has retry and buffer |
| Configurability | Minimal | High (batch size, queue size, delay, etc) | BatchSpanProcessor offers more control |
| Data Loss | Only loses the current span being processed | Loses the entire buffer (everything still in memory) | BatchSpanProcessor keeps spans in memory until they are exported |
| Debug | Easier (straightforward, synchronous behavior) | Harder (asynchronous, buffered behavior) | SimpleSpanProcessor is simpler to reason about during development |
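The "network calls" and "99.8% reduction" figures in the table follow from simple arithmetic, which can be checked directly (10,000 spans and the default batch size of 512 are just example numbers):

```python
import math

spans = 10_000
batch_size = 512  # OTEL_BSP_MAX_EXPORT_BATCH_SIZE default

simple_calls = spans                          # one network call per span
batch_calls = math.ceil(spans / batch_size)   # one call per (possibly partial) batch

reduction = 1 - batch_calls / simple_calls
print(simple_calls, batch_calls, f"{reduction:.1%}")  # 10000 20 99.8%
```

Twenty calls instead of ten thousand: the per-call network latency is paid once per batch and amortized across all the spans it carries.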
Check which type of processor the SDK you set up uses and, if necessary, switch it to batch. It's also worth configuring some variables that define its limits. We can configure them by passing environment variables or by defining them in code.
- OTEL_BSP_MAX_EXPORT_BATCH_SIZE (default 512): the maximum number of spans sent to the backend at once. A larger batch is good, but not too large, since the transfer time grows with its size. With the default, up to 512 spans are sent at a time.
- OTEL_BSP_MAX_QUEUE_SIZE (default 2048): the maximum number of spans in the queue. Beyond 2048, new spans are discarded.
- OTEL_BSP_SCHEDULE_DELAY (default 5000, i.e., 5s): every 5 seconds the processor tries to export the accumulated spans. If the buffer is empty, nothing is sent; if there are spans, it sends up to the limit defined in OTEL_BSP_MAX_EXPORT_BATCH_SIZE.
- OTEL_BSP_EXPORT_TIMEOUT (default 30000, i.e., 30s): the exporter sends the spans to the backend and waits up to this many milliseconds for a response. If the backend doesn't respond in that time, the operation is considered failed and the spans may be discarded or retried, depending on the configuration.
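As a minimal sketch, the four limits above could be set via environment variables before starting the application. The values shown are simply the documented defaults, written out explicitly as a starting point for tuning:

```shell
# BatchSpanProcessor limits (values shown are the defaults)
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512   # spans per export call
export OTEL_BSP_MAX_QUEUE_SIZE=2048         # spans buffered before dropping
export OTEL_BSP_SCHEDULE_DELAY=5000         # ms between export attempts
export OTEL_BSP_EXPORT_TIMEOUT=30000        # ms to wait for the backend
```

Since these are read by the SDK at startup, they let you tune the batch behavior per environment without changing code.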