Installation
Strimzi Operators
All configuration is done through an operator (controller) that handles managing and deploying Kafka via the custom resources it provides.
There is more than one operator involved, and we'll install them step by step.
Let's install the operator via Helm.
helm repo add strimzi https://strimzi.io/charts/
helm repo update
## If you want to see the complete values.yaml
helm show values strimzi/strimzi-kafka-operator
Let's define two replicas in this values file, with at least one always available to ensure high availability. Create the values.yaml below.
#values.yaml
replicas: 2
# The default version of images to use.
# That is, the strimzi version
defaultImageTag: 0.45.0
leaderElection:
enable: true
podDisruptionBudget:
enabled: true
minAvailable: 1
maxUnavailable:
It's important to mention that we didn't define which namespaces this operator will watch. By default, it watches only the namespace where it's installed, kafka in our case. If it needs to watch resources created in other namespaces, that must be defined in the values file.
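If the operator does need to watch other namespaces, the chart exposes a `watchNamespaces` list in its values. A sketch, assuming the current chart layout; the namespace names here are hypothetical:

```yaml
# values.yaml (additional entries)
watchNamespaces:
  - team-a      # hypothetical namespace containing Kafka resources
  - team-b
# Or, to watch every namespace in the cluster:
# watchAnyNamespace: true
```

Check `helm show values strimzi/strimzi-kafka-operator` for the exact keys supported by your chart version.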
Let's apply it.
helm install strimzi strimzi/strimzi-kafka-operator --values values.yaml --namespace kafka --create-namespace
kubectl get deploy -n kafka
NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
strimzi-cluster-operator   2/2     2            2           3m30s
kubectl get pods -n kafka
NAME                                        READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-6bf566db79-kwbk6   1/1     Running   0          108s
strimzi-cluster-operator-6bf566db79-qft2r   1/1     Running   0          108s
# If you want to check the logs
kubectl logs deployment/strimzi-cluster-operator -n kafka -f
The Operator will provide us with several custom resources that we'll use to configure our cluster and manage it through Kubernetes manifests.
This avoids the need for Crossplane to extend the API, and no provider is needed either, which greatly facilitates integration with IDPs (Internal Developer Platforms).
An overview of existing custom resources so far.
kubectl api-resources | grep strimzi
strimzipodsets       sps    core.strimzi.io/v1beta2    true   StrimziPodSet
kafkabridges         kb     kafka.strimzi.io/v1beta2   true   KafkaBridge
kafkaconnectors      kctr   kafka.strimzi.io/v1beta2   true   KafkaConnector
kafkaconnects        kc     kafka.strimzi.io/v1beta2   true   KafkaConnect
kafkamirrormaker2s   kmm2   kafka.strimzi.io/v1beta2   true   KafkaMirrorMaker2
kafkamirrormakers    kmm    kafka.strimzi.io/v1beta2   true   KafkaMirrorMaker
kafkanodepools       knp    kafka.strimzi.io/v1beta2   true   KafkaNodePool
kafkarebalances      kr     kafka.strimzi.io/v1beta2   true   KafkaRebalance
kafkas               k      kafka.strimzi.io/v1beta2   true   Kafka
kafkatopics          kt     kafka.strimzi.io/v1beta2   true   KafkaTopic
kafkausers           ku     kafka.strimzi.io/v1beta2   true   KafkaUser
Study the Strimzi examples; they contain a lot of useful material, and we can configure the cluster little by little.
There are several operators, each a different controller with a different purpose in the cluster.
I created a project on GitHub with all the manifests we'll apply along the way.
Since this is a test cluster, I'll size it with a small volume of only 5GB. If you're creating a production cluster, adjust accordingly.
First, let's talk about storage types, as it's important to clarify this. In the context of the Strimzi Kafka Operator and Kubernetes, the difference between the Ephemeral and JBOD (Just a Bunch Of Disks) storage types is quite important.
- Ephemeral Storage: refers to a type of temporary storage associated with pods while they're running. When the pod is destroyed or restarted, all data stored in this volume is lost.
- Data is not persistent between pod restarts.
- Uses local storage infrastructure of nodes, which can be faster and cheaper, but doesn't offer durability.
- Ideal for temporary data like caches, logs, or intermediate data that doesn't need to be persisted between restarts.
- Can be useful in development or test environments where data durability isn't a priority.
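For comparison, this is roughly what an ephemeral configuration looks like in a node pool spec (a sketch; don't use it where data must survive restarts):

```yaml
spec:
  storage:
    type: ephemeral
    # Optional cap on how much node-local disk each pod may use
    sizeLimit: 2Gi
```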
- JBOD (Just a Bunch of Disks): refers to a storage configuration where you can attach multiple physical disks (or volumes) to the Kafka pod. Each disk can store part of the Kafka message log.
- Uses persistent volumes (Persistent Volumes - PVs), meaning data remains stored even if the Kafka pod is restarted.
- Provides greater security and guarantee of stored data durability.
- Can add more volumes as needed to distribute Kafka's storage load. It's possible to have multiple volumes connected to Kafka, which increases storage capacity.
- Is the recommended choice when you need to ensure data durability, as volumes persist even after pod failures or restarts.
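To illustrate the point about multiple volumes, a JBOD configuration with two persistent volumes might look like this (the sizes are illustrative):

```yaml
spec:
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 5Gi
        deleteClaim: false
      - id: 1   # a second disk added to grow capacity
        type: persistent-claim
        size: 5Gi
        deleteClaim: false
```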
A node can act as a broker, as a controller, or as both. Since we'll use KRaft, our nodes will do both.
Notice we're using custom resources created by Strimzi. This manifest is just the definition of what we expect from the nodes.
apiVersion: kafka.strimzi.io/v1beta2 ## Custom Resource!
kind: KafkaNodePool
metadata:
name: broker
labels:
strimzi.io/cluster: kafka-cluster # Which cluster it will belong to (We'll define this name in the cluster manifest later)
namespace: kafka # The operator acts on this namespace
spec:
replicas: 3 # Number of nodes we'll use.
roles: # Will have both functions
- controller
- broker
storage:
type: jbod # storage type
volumes:
- id: 0
type: persistent-claim
size: 5Gi # Since it's a simple study cluster locally, we don't need to create very large storages, but in production it's not sufficient.
kraftMetadata: shared
deleteClaim: false
# These resources are very low for a Kafka cluster; this is just for playing locally and creating some topics
resources:
requests:
memory: 1Gi
cpu: "0.3"
limits:
memory: 1Gi
cpu: "0.3"
# We'll distribute nodes among different Kubernetes physical nodes
template:
pod:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-cluster
topologyKey: kubernetes.io/hostname
Apply the manifest above to the cluster and nothing will happen yet, as the resources are only actually created after we define the cluster below.
I'll leave comments to better understand the parts of this custom resource.
# Basic configuration (required)
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
# Name of our cluster.
# All resources of this cluster must have a label to inform the operator that it belongs to this cluster
# The label that will be used is strimzi.io/cluster: kafka-cluster
name: kafka-cluster
# Deployment specifications
annotations:
# These annotations tell the operator we want KRaft and that the node configuration comes from the node pool we defined earlier.
strimzi.io/node-pools: enabled
strimzi.io/kraft: enabled
spec:
kafka:
# Here we have some ports we'll use.
listeners:
# Name to identify the listener. Must be unique within the Kafka cluster.
- name: plain
# Port number used by the listener inside Kafka. The port number has to be unique within a given Kafka cluster. Allowed port numbers are 9092 and higher with the exception of ports 9404 and 9999, which are already used for Prometheus and JMX. Depending on the listener type, the port number might not be the same as the port number that connects Kafka clients.
port: 9092
# Listener type specified as internal or cluster-ip (to expose Kafka using per-broker ClusterIP services), or for external listeners, as route (OpenShift only), loadbalancer, nodeport or ingress (Kubernetes only).
type: internal
# Enables or disables TLS encryption for each listener. For route and ingress type listeners, TLS encryption must always be enabled by setting it to true.
tls: false
# Defines whether the fully-qualified DNS names including the cluster service suffix (usually .cluster.local) are assigned.
configuration:
useServiceDnsDomain: true
- name: tls
port: 9093
type: internal
tls: true
# - name: external
# port: 9094
# type: ingress
# tls: true
# authentication:
# type: scram-sha-512
# configuration:
# bootstrap:
# annotations:
# nginx.ingress.kubernetes.io/rewrite-target: "/"
# brokers:
# - broker: 0
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# - broker: 1
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# - broker: 2
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# ingress:
# # Adding annotation for NGINX controller
# annotations:
# kubernetes.io/ingress.class: "nginx" # Defines that NGINX will be the ingress controller
# hosts:
# - kafka.localhost
# # tls:
# # - secretName: kafka-tls # TLS certificate, if you have one
# # hosts:
# # - kafka.localhost
# rules:
# - host: kafka.localhost
# http:
# paths:
# - path: /
# pathType: Prefix
# backend:
# service:
# name: kafka-cluster-kafka-bootstrap # Kafka service to be accessed externally
# port:
# number: 9092 # Kafka port
# Listener authentication mechanism specified as mTLS, SCRAM-SHA-512, or token-based OAuth 2.0.
# authentication:
# type: tls
# - name: external1
# port: 9094
# type: route
# tls: true
# configuration:
# brokerCertChainAndKey: # (9)
# secretName: my-secret
# certificate: my-certificate.crt
# key: my-key.key
# Kafka version (recommended)
version: "3.9.0"
# KRaft metadata version (recommended)
metadataVersion: "3.9"
# Kafka configuration (recommended)
config: # (12)
auto.create.topics.enable: "false"
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
# Resources requests and limits (recommended)
# Note: resources are also defined in the KafkaNodePool, which takes precedence for those nodes
resources:
requests:
memory: 3Gi
cpu: "3"
limits:
memory: 3Gi
cpu: "3"
# Logging configuration (optional)
# Kafka loggers and log levels added directly (inline) or indirectly (external) through a ConfigMap. A custom Log4j configuration must be placed under the log4j.properties key in the ConfigMap. For the Kafka kafka.root.logger.level logger, you can set the log level to INFO, ERROR, WARN, TRACE, DEBUG, FATAL or OFF.
logging: # (14)
type: inline
loggers:
kafka.root.logger.level: INFO
log4j.logger.io.strimzi: "DEBUG"
log4j.logger.kafka: "DEBUG"
log4j.logger.org.apache.kafka: "DEBUG"
# Readiness probe (optional)
readinessProbe: #
initialDelaySeconds: 15
timeoutSeconds: 5
# Liveness probe (optional)
livenessProbe:
initialDelaySeconds: 15
timeoutSeconds: 5
# JVM options (optional)
# JVM configuration options to optimize performance for the Virtual Machine (VM) running Kafka.
# jvmOptions:
# -Xms: 8192m
# -Xmx: 8192m
# Authorization (optional)
# Authorization enables simple, OAUTH 2.0, or OPA authorization on the Kafka broker. Simple authorization uses the AclAuthorizer and StandardAuthorizer Kafka plugins.
# authorization:
# type: simple
# type: opa
# Metrics configuration (optional)
# Prometheus metrics enabled. In this example, metrics are configured for the Prometheus JMX Exporter (the default metrics exporter).
# metricsConfig:
# type: jmxPrometheusExporter
# valueFrom:
# # Rules for exporting metrics in Prometheus format to a Grafana dashboard through the Prometheus JMX Exporter, which are enabled by referencing a ConfigMap containing configuration for the Prometheus JMX exporter. You can enable metrics without further configuration using a reference to a ConfigMap containing an empty file under metricsConfig.valueFrom.configMapKeyRef.key.
# configMapKeyRef:
# name: kafka-metrics
# key: kafka-kraft-metrics-config.yml
# Entity Operator (recommended)
# Entity Operator configuration, which specifies the configuration for the Topic Operator and User Operator.
entityOperator:
topicOperator:
# You can keep Kafka resources in a dedicated namespace or reuse this one
watchedNamespace: kafka
reconciliationIntervalMs: 60000
# Resources requests and limits (recommended)
resources:
requests:
memory: 512Mi
cpu: "1"
limits:
memory: 512Mi
cpu: "1"
# Logging configuration (optional)
# Specified Topic Operator loggers and log levels. This example uses inline logging.
logging: # (23)
type: inline
loggers:
rootLogger.level: INFO
userOperator:
watchedNamespace: kafka
reconciliationIntervalMs: 60000
# Resources requests and limits (recommended)
resources:
requests:
memory: 512Mi
cpu: "1"
limits:
memory: 512Mi
cpu: "1"
logging: # (24)
type: inline
loggers:
rootLogger.level: INFO
# Kafka Exporter (optional)
kafkaExporter: {} # (25)
# ...
# Cruise Control (optional)
cruiseControl: {}
# ...
Create the manifest above and apply it to the cluster. The kafka namespace was already created earlier. The operator will work out what to do and apply everything.
We'll also explore cruiseControl, which was declared in the manifest but doesn't have its rebalancing rules defined yet. Several examples of how to use Cruise Control are available; we'll use the most generic one for now, which works well. Adapt it to the purpose and size of your cluster. It's not mandatory, but it's very welcome.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
name: rebalance
namespace: kafka
labels:
strimzi.io/cluster: kafka-cluster
annotations:
strimzi.io/rebalance-auto-approval: "true"
spec:
goals:
- CpuCapacityGoal
- NetworkInboundCapacityGoal
- DiskCapacityGoal
- RackAwareGoal
- MinTopicLeadersPerBrokerGoal
- NetworkOutboundCapacityGoal
- ReplicaCapacityGoal
Also apply the manifest above if you want, and let's see what we have.
❯ kubectl get kafkanodepools.kafka.strimzi.io -n kafka
NAME     DESIRED REPLICAS   ROLES                     NODEIDS
broker   3                  ["controller","broker"]   [0,1,2]
❯ kubectl get kafka -n kafka
NAME            DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
kafka-cluster                                                  True    KRaft
❯ kubectl get deploy -n kafka
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
kafka-cluster-cruise-control    1/1     1            1           5d1h
kafka-cluster-entity-operator   1/1     1            1           5d22h
kafka-cluster-kafka-exporter    1/1     1            1           5d
strimzi-cluster-operator        1/1     1            1           7d19h
❯ kubectl get pods -n kafka
NAME                                             READY   STATUS    RESTARTS   AGE
kafka-cluster-broker-0                           1/1     Running   0          4h10m
kafka-cluster-broker-1                           1/1     Running   0          4h10m
kafka-cluster-broker-2                           1/1     Running   0          4h10m
kafka-cluster-cruise-control-5f87855476-jvf9c    1/1     Running   0          4h6m
kafka-cluster-entity-operator-794c776cb7-dd44n   2/2     Running   0          4h6m
kafka-cluster-kafka-exporter-6dcfc78f98-tlhfj    1/1     Running   0          4h6m
strimzi-cluster-operator-6bf566db79-rtwrj        1/1     Running   0          4h8m
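With everything running, a quick way to exercise the Topic Operator is to create a KafkaTopic. A minimal sketch; the topic name and settings are just examples:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-first-topic        # example name
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-cluster   # ties the topic to our cluster
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: 604800000   # 7 days
```

After applying it, kubectl get kafkatopics -n kafka should list the topic once the Topic Operator reconciles it.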