OpenTelemetry Collector
The Collector is a backend component. It is responsible for ingesting telemetry data: logs, traces, metrics, and any other signals implemented in the future. First it collects everything, then does some processing if necessary, and finally exports it.
It is similar to the SDK in some respects, but it does not necessarily generate data; it receives data.
SDK
- Generates data
- Processes data
- Exports data
- Exclusive to the application
Collector
- Receives data
- Processes data (optional)
- Exports data
- Can be scaled to receive telemetry from multiple sources
- Serves as a bridge between the local infrastructure and the vendor
If all applications send data to the collector, we get a centralized configuration for where the data is stored. Imagine we use Datadog today but switch to Dynatrace in the future: if every application goes through the collector, only the collector's configuration needs to change; otherwise we would have to change the exporter in every single application.
A collector can also separate trace, metric, and log data and send each to a different vendor:
Trace pipeline: receiver >> processor >> exporter
Metric pipeline: receiver >> processor >> exporter
Logs pipeline: receiver >> processor >> exporter
The vendor for metrics can be different from the vendor for traces and logs: we could use Elasticsearch for traces, Prometheus for metrics, and Grafana Loki for logs.
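To make the multi-vendor idea concrete, here is a hedged sketch of a service section that routes each signal to a different backend. This is illustrative only: the elasticsearch and loki exporter names and their settings are assumptions about what the contrib distribution offers, not part of the actual setup we build below.

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [elasticsearch]  # hypothetical: traces to Elasticsearch
    metrics:
      receivers: [otlp]
      exporters: [prometheus]     # metrics exposed for Prometheus to scrape
    logs:
      receivers: [otlp]
      exporters: [loki]           # hypothetical: logs to Grafana Loki
```

Each exporter referenced here would also need its own entry under the top-level exporters section.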
Let's analyze what we had until now.
The SDK of each application sends traces to Jaeger and exposes a /metrics endpoint so that Prometheus can scrape the metrics; for that, we configured Prometheus with those applications as targets.
What do we want? For applications to send everything to the collector, which then distributes it to the right places. Prometheus will now scrape the collector.
To do this we'll configure the collector, which in practice is a simple YAML file.
Let's add the collector to docker-compose:
##code...
collector:
  image: otel/opentelemetry-collector-contrib
  command:
    - '--config=/etc/collector/collector.yaml'
  ports:
    - 8889:8889
    - 4317:4317
    - 4318:4318
  volumes:
    - ./collector:/etc/collector
  depends_on:
    - prometheus
    - jaeger
Basically, we expose the ports we'll use (4317 for OTLP over gRPC, 4318 for OTLP over HTTP, and 8889 for the Prometheus exporter) and mount the configuration file.
Now let's define and walk through the collector configuration file.
# Receiver configurations
receivers:
  otlp: # We'll use and reference this configuration in services.
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        cors:
          allowed_origins: ["*"]
      grpc:
        endpoint: 0.0.0.0:4317

# Processor configurations
processors:

# Exporter configurations
exporters:
  # Prometheus will scrape our collector instead of the applications, so we need to change its configuration.
  prometheus:
    endpoint: 0.0.0.0:8889
    send_timestamps: true
    namespace: otel
    const_labels:
      via: collector
  otlphttp: # To export to Jaeger using OTLP over HTTP
    endpoint: "http://jaeger:4318"
    tls:
      insecure: true

# Extension configurations (we'll talk about these later)
extensions:
  health_check:

# Service configurations
service:
  extensions:
    - health_check
  pipelines: # We'll have two pipelines in our scenario.
    # Each pipeline has its own receivers, processors and exporters.
    traces:
      # Here we say how we ingest data; it could come in different formats,
      # with different security levels, different encodings, etc.
      # In our case we use the receiver named otlp defined above, which accepts HTTP and gRPC.
      receivers: # We can ingest data from multiple places; that's why it's an array.
        - otlp
      processors:
      # We could export to multiple places at once; in our case only Jaeger for traces.
      exporters:
        - otlphttp
    metrics:
      receivers:
        - otlp
      processors:
      # We could export to multiple places at once; in our case only Prometheus for metrics.
      exporters:
        - prometheus
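Note that the processors sections are empty: processing is optional. As a hedged sketch of what enabling one would look like (the batch processor does ship with the collector, but it is not part of our setup), the config would change roughly like this:

```yaml
processors:
  batch: # buffers spans/metrics and sends them in batches, reducing outgoing requests

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch] # a processor must be referenced by each pipeline that uses it
      exporters: [otlphttp]
```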
As mentioned above, we need to change the Prometheus targets in the prometheus.yaml file so that it scrapes the collector instead of our applications.
global:
  scrape_interval: "5s"
scrape_configs:
  - job_name: 'opentelemetry'
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          # - todo:9464
          # - auth:9464
          - collector:8889
We also need to change the SDK configuration of our applications. The new code is in the collector branch, including all the files mentioned here.
In our instrumentation.ts we make the following changes:
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-proto";
//Code....

function start(serviceName: string) {
  //Code....

  // WE WILL NO LONGER EXPOSE DATA ON AN ENDPOINT; WE PUSH DATA TO THE COLLECTOR.
  // const prometheusExporter = new PrometheusExporter(
  //   {
  //     port: PrometheusExporter.DEFAULT_OPTIONS.port,
  //     endpoint: PrometheusExporter.DEFAULT_OPTIONS.endpoint,
  //   },
  //   () => {
  //     console.log(
  //       `prometheus scrape endpoint: http://localhost:${PrometheusExporter.DEFAULT_OPTIONS.port}${PrometheusExporter.DEFAULT_OPTIONS.endpoint}`
  //     );
  //   }
  // );

  // WE WILL NOW USE OTLPMetricExporter
  const metricReader = new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: "http://collector:4318/v1/metrics",
    }),
  });

  const meterProvider = new MeterProvider({
    resource,
    // readers: [prometheusExporter],
    readers: [metricReader],
  });

  const traceExporter = new OTLPTraceExporter({
    // url: "http://jaeger:4318/v1/traces",
    // Instead of sending to Jaeger we send to the collector.
    url: "http://collector:4318/v1/traces",
  });

  //Code....
}
That's it... if you check, you'll see everything working. Run the compose again, then look at the traces in Jaeger and the metrics in Prometheus.