Context Propagation
If we analyze the flow of a span from its creation until it reaches its destination, we pass through the following stages.
The instrumentation itself doesn't affect the application much, since its CPU and memory cost is very low. Everything happens quickly and there is little to improve here, other than avoiding the creation of unwanted spans that only generate costs, as mentioned before.
At the other end is the backend, an external service with completely separate resources that doesn't interfere with our application at all.
The exporter is responsible for transforming spans into the appropriate format and sending them to the backend; it handles the network activity of the process and can work with individual spans or in batches.
The Span Processor is an important component in the OpenTelemetry flow: it acts as an intermediary between instrumentation (where spans are created) and the exporter (which sends data to the backend), managing the lifecycle of spans. It is crucial for performance and reliability in the processing of telemetry data, acting as an intelligent buffer between data generation and data sending. It operates entirely in memory (no disk persistence) and maintains a queue to store spans. The size of this queue is configurable; if the queue is full, new spans are discarded.
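The queue-and-discard behavior described above can be sketched in a few lines. This is an illustrative toy, not the real SDK implementation; the class and method names (`BoundedSpanQueue`, `on_end`) are invented for the example.

```python
from collections import deque

class BoundedSpanQueue:
    """Toy sketch of the in-memory queue a span processor maintains.
    Real SDKs implement this internally; names here are hypothetical."""

    def __init__(self, max_queue_size=2048):
        self.queue = deque()
        self.max_queue_size = max_queue_size
        self.dropped = 0  # spans discarded because the queue was full

    def on_end(self, span):
        # When the queue is full, the new span is silently dropped.
        if len(self.queue) >= self.max_queue_size:
            self.dropped += 1
            return False
        self.queue.append(span)
        return True

q = BoundedSpanQueue(max_queue_size=2)
q.on_end("span-1")
q.on_end("span-2")
q.on_end("span-3")  # queue full: discarded
print(len(q.queue), q.dropped)  # 2 1
```

The important takeaway is that the drop happens at enqueue time, without any error propagated to the application code that created the span.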
So we have:
- Instrumentation
  - Very low impact, as these are basically lightweight interceptions
  - Uses minimal resources since it only collects data
  - The main overhead comes from creating spans and their attributes
  - Generally doesn't block the main application flow
- Processing (SpanProcessor)
  - Moderate impact, as it deals with spans in memory
  - BatchSpanProcessor can consume more memory due to the buffer
  - There may be contention if the queue gets full
  - Processing is asynchronous, so it doesn't block the main thread
- Exporter
  - Moderate to high impact depending on configuration
  - Sending data over the network is the "heaviest" operation
  - There may be contention if the backend is slow
  - Network problems can affect span processing
- Backend
  - Minimal impact on the application, as it's a separate system
  - The only interference would be if it starts rejecting data
  - Backend latency issues can affect the exporter
  - Generally doesn't affect application performance directly
Regarding processing, there are two types:

- SimpleSpanProcessor: doesn't maintain a queue; each span is processed immediately. It uses less memory but is less efficient.
- BatchSpanProcessor: creates a buffer and a processing queue. This is what we should use in production.
Knowing this, we can see that if the application stops, everything still in memory is also lost, and memory consumption needs to be monitored: if spans can't be delivered to the backend, memory can grow rapidly.
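The loss-on-stop behavior can be sketched with a toy buffer. All names here (`export`, `shutdown`, the `buffer` list) are invented for the illustration; a real SDK's `shutdown()`/`force_flush()` play the role of `shutdown()` below.

```python
# Toy sketch: spans are only safe once flushed out of memory.
buffer = []

def export(batch):
    # Stand-in for the exporter's network call to the backend.
    print(f"exported {len(batch)} spans")

def shutdown():
    # A clean SDK shutdown drains the buffer before the process exits.
    if buffer:
        export(list(buffer))
        buffer.clear()

buffer.extend(["span-1", "span-2", "span-3"])
shutdown()  # clean exit: the three buffered spans are exported
# If the process is killed abruptly (SIGKILL, OOM) before shutdown()
# runs, everything in `buffer` is lost: there is no disk persistence.
```

This is why graceful shutdown hooks matter in instrumented services: they give the processor a chance to flush before the memory disappears.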
Simple vs Batch
BatchSpanProcessor is significantly more efficient than SimpleSpanProcessor, especially in high-load scenarios.
| Characteristic | SimpleSpanProcessor | BatchSpanProcessor | Observations |
|---|---|---|---|
| Processing Pattern | Processes and exports each span individually | Groups spans into batches before exporting | BatchSpanProcessor significantly reduces network overhead |
| Throughput | ~100-1000 spans/second | ~10000-100000 spans/second | BatchSpanProcessor can be 100x more efficient |
| Network Calls | One call per span | One call per batch (e.g., 512 spans) | Up to 99.8% reduction in network calls with BatchSpanProcessor |
| Latency | 100ms per span (assuming 100ms network latency) | ~0.2ms per span (100ms/512 spans) | BatchSpanProcessor distributes latency cost across all batch spans |
| CPU Usage | High (overhead per span) | Low (shared overhead) | Up to 90% reduction in CPU overhead with BatchSpanProcessor |
| Memory Usage | Low (no buffer) | Moderate (configurable buffer) | BatchSpanProcessor uses more memory but is controllable via maxQueueSize |
| Back Pressure | Doesn't have | Has (via maxQueueSize) | BatchSpanProcessor can control system overload |
| Use Case | Development and Debug | Production | SimpleSpanProcessor is not recommended for production |
| Reliability | Lower (more susceptible to network failures) | Higher (better fault tolerance) | BatchSpanProcessor has retry and buffer |
| Configurability | Minimal | High (batch size, queue size, delay, etc) | BatchSpanProcessor offers more control |
| Data Loss | Only loses the current span being processed | Loses the entire buffer (everything still in memory) | BatchSpanProcessor keeps spans in memory until they are exported |
| Debug | Easier (straightforward, synchronous behavior) | Harder (asynchronous, buffered behavior) | SimpleSpanProcessor is simpler to reason about during development |
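The "network calls" and "99.8% reduction" figures in the table follow from simple arithmetic, which can be checked directly (10,000 spans and the default batch size of 512 are just example numbers):

```python
import math

spans = 10_000
batch_size = 512  # OTEL_BSP_MAX_EXPORT_BATCH_SIZE default

simple_calls = spans                          # one network call per span
batch_calls = math.ceil(spans / batch_size)   # one call per (possibly partial) batch

reduction = 1 - batch_calls / simple_calls
print(simple_calls, batch_calls, f"{reduction:.1%}")  # 10000 20 99.8%
```

Twenty calls instead of ten thousand: the per-call network latency is paid once per batch and amortized across all the spans it carries.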
Check which type of processor the SDK you set up uses and, if necessary, switch it to batch. It's also worth configuring some variables that define its limits. We can configure them by passing environment variables or by defining them in code.
- OTEL_BSP_MAX_EXPORT_BATCH_SIZE (default 512): the maximum number of spans sent to the backend at once. A larger batch is good, but not too large, since the transfer time grows with its size. With the default, up to 512 spans are sent at a time.
- OTEL_BSP_MAX_QUEUE_SIZE (default 2048): the maximum number of spans in the queue. Beyond 2048, new spans are discarded.
- OTEL_BSP_SCHEDULE_DELAY (default 5000, i.e., 5s): every 5 seconds the processor tries to export the accumulated spans. If the buffer is empty, nothing is sent; if there are spans, it sends up to the limit defined in OTEL_BSP_MAX_EXPORT_BATCH_SIZE.
- OTEL_BSP_EXPORT_TIMEOUT (default 30000, i.e., 30s): the exporter sends the spans to the backend and waits up to this many milliseconds for a response. If the backend doesn't respond in that time, the operation is considered failed and the spans may be discarded or retried, depending on the configuration.
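As a minimal sketch, the four limits above could be set via environment variables before starting the application. The values shown are simply the documented defaults, written out explicitly as a starting point for tuning:

```shell
# BatchSpanProcessor limits (values shown are the defaults)
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512   # spans per export call
export OTEL_BSP_MAX_QUEUE_SIZE=2048         # spans buffered before dropping
export OTEL_BSP_SCHEDULE_DELAY=5000         # ms between export attempts
export OTEL_BSP_EXPORT_TIMEOUT=30000        # ms to wait for the backend
```

Since these are read by the SDK at startup, they let you tune the batch behavior per environment without changing code.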