Installation
Strimzi Operators
All configuration is done through an operator (controller) that handles managing and deploying Kafka via the custom resources it provides.
There is more than one operator involved, and we'll install them step by step.
Let's install the operator via Helm.
helm repo add strimzi https://strimzi.io/charts/
helm repo update
## If you want to see the complete values.yaml
helm show values strimzi/strimzi-kafka-operator
Let's define two replicas in this values file, with at least one always available to ensure high availability. Create the values.yaml below.
#values.yaml
replicas: 2
# The default version of images to use.
# That is, the strimzi version
defaultImageTag: 0.45.0
leaderElection:
enable: true
podDisruptionBudget:
enabled: true
minAvailable: 1
maxUnavailable:
It's important to mention that we didn't define which namespaces this operator will watch. By default, it watches only the namespace where it's installed, kafka in our case. If it needs to watch resources created in other namespaces, that must be defined in the values file.
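If the operator does need to watch other namespaces, the chart exposes a `watchNamespaces` list in its values. A sketch, assuming the current chart layout; the namespace names here are hypothetical:

```yaml
# values.yaml (additional entries)
watchNamespaces:
  - team-a      # hypothetical namespace containing Kafka resources
  - team-b
# Or, to watch every namespace in the cluster:
# watchAnyNamespace: true
```

Check `helm show values strimzi/strimzi-kafka-operator` for the exact keys supported by your chart version.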
Let's apply it.
helm install strimzi strimzi/strimzi-kafka-operator --values values.yaml --namespace kafka --create-namespace
kubectl get deploy -n kafka
NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
strimzi-cluster-operator   2/2     2            2           3m30s
kubectl get pods -n kafka
NAME                                        READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-6bf566db79-kwbk6   1/1     Running   0          108s
strimzi-cluster-operator-6bf566db79-qft2r   1/1     Running   0          108s
# If you want to check the logs
kubectl logs deployment/strimzi-cluster-operator -n kafka -f
The Operator will provide us with several custom resources that we'll use to configure our cluster and manage it through Kubernetes manifests.
This avoids the need for Crossplane to extend the API, and no provider is needed either, which greatly facilitates integration with IDPs (Internal Developer Platforms).
An overview of existing custom resources so far.
kubectl api-resources | grep strimzi
strimzipodsets       sps    core.strimzi.io/v1beta2    true   StrimziPodSet
kafkabridges         kb     kafka.strimzi.io/v1beta2   true   KafkaBridge
kafkaconnectors      kctr   kafka.strimzi.io/v1beta2   true   KafkaConnector
kafkaconnects        kc     kafka.strimzi.io/v1beta2   true   KafkaConnect
kafkamirrormaker2s   kmm2   kafka.strimzi.io/v1beta2   true   KafkaMirrorMaker2
kafkamirrormakers    kmm    kafka.strimzi.io/v1beta2   true   KafkaMirrorMaker
kafkanodepools       knp    kafka.strimzi.io/v1beta2   true   KafkaNodePool
kafkarebalances      kr     kafka.strimzi.io/v1beta2   true   KafkaRebalance
kafkas               k      kafka.strimzi.io/v1beta2   true   Kafka
kafkatopics          kt     kafka.strimzi.io/v1beta2   true   KafkaTopic
kafkausers           ku     kafka.strimzi.io/v1beta2   true   KafkaUser
Study the Strimzi examples; they contain a lot of useful material, and we can configure the cluster little by little.
There are several operators, each a different controller with a different purpose in the cluster.
I created a project on GitHub with all the manifests we'll apply along the way.
Since this is a test cluster, I'll size it with a small volume of only 5GB. If you're creating a production cluster, adjust accordingly.
First, let's talk about storage types, as it's important to clarify this. In the context of the Strimzi Kafka Operator and Kubernetes, the difference between the Ephemeral and JBOD (Just a Bunch Of Disks) storage types is quite important.
- Ephemeral Storage: refers to a type of temporary storage associated with pods while they're running. When the pod is destroyed or restarted, all data stored in this volume is lost.
- Data is not persistent between pod restarts.
- Uses local storage infrastructure of nodes, which can be faster and cheaper, but doesn't offer durability.
- Ideal for temporary data like caches, logs, or intermediate data that doesn't need to be persisted between restarts.
- Can be useful in development or test environments where data durability isn't a priority.
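For comparison, this is roughly what an ephemeral configuration looks like in a node pool spec (a sketch; don't use it where data must survive restarts):

```yaml
spec:
  storage:
    type: ephemeral
    # Optional cap on how much node-local disk each pod may use
    sizeLimit: 2Gi
```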
- JBOD (Just a Bunch of Disks): refers to a storage configuration where you can attach multiple physical disks (or volumes) to the Kafka pod. Each disk can store part of the Kafka message log.
- Uses persistent volumes (Persistent Volumes - PVs), meaning data remains stored even if the Kafka pod is restarted.
- Provides greater security and guarantee of stored data durability.
- Can add more volumes as needed to distribute Kafka's storage load. It's possible to have multiple volumes connected to Kafka, which increases storage capacity.
- Is the recommended choice when you need to ensure data durability, as volumes persist even after pod failures or restarts.
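To illustrate the point about multiple volumes, a JBOD configuration with two persistent volumes might look like this (the sizes are illustrative):

```yaml
spec:
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 5Gi
        deleteClaim: false
      - id: 1   # a second disk added to grow capacity
        type: persistent-claim
        size: 5Gi
        deleteClaim: false
```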
A node can act as a broker, as a controller, or as both. Since we'll use KRaft, our nodes will do both.
Notice we're using custom resources created by Strimzi. This manifest is just the definition of what we expect from the nodes.
apiVersion: kafka.strimzi.io/v1beta2 ## Custom Resource!
kind: KafkaNodePool
metadata:
name: broker
labels:
strimzi.io/cluster: kafka-cluster # Which cluster it will belong to (We'll define this name in the cluster manifest later)
namespace: kafka # The operator acts on this namespace
spec:
replicas: 3 # Number of nodes we'll use.
roles: # Will have both functions
- controller
- broker
storage:
type: jbod # storage type
volumes:
- id: 0
type: persistent-claim
size: 5Gi # Since it's a simple study cluster locally, we don't need to create very large storages, but in production it's not sufficient.
kraftMetadata: shared
deleteClaim: false
# These resources are very low for a Kafka cluster; this is just for playing locally and creating some topics
resources:
requests:
memory: 1Gi
cpu: "0.3"
limits:
memory: 1Gi
cpu: "0.3"
# We'll distribute nodes among different Kubernetes physical nodes
template:
pod:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-cluster
topologyKey: kubernetes.io/hostname
Apply the manifest above to the cluster and nothing will happen yet, as the resources are only actually created after we define the cluster below.
I'll leave comments to better understand the parts of this custom resource.
# Basic configuration (required)
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
# Name of our cluster.
# All resources of this cluster must have a label to inform the operator that it belongs to this cluster
# The label that will be used is strimzi.io/cluster: kafka-cluster
name: kafka-cluster
# Deployment specifications
annotations:
# These annotations tell the operator we want KRaft and that the node configuration comes from the node pool we defined earlier.
strimzi.io/node-pools: enabled
strimzi.io/kraft: enabled
spec:
kafka:
# Here we have some ports we'll use.
listeners:
# Name to identify the listener. Must be unique within the Kafka cluster.
- name: plain
# Port number used by the listener inside Kafka. The port number has to be unique within a given Kafka cluster. Allowed port numbers are 9092 and higher with the exception of ports 9404 and 9999, which are already used for Prometheus and JMX. Depending on the listener type, the port number might not be the same as the port number that connects Kafka clients.
port: 9092
# Listener type specified as internal or cluster-ip (to expose Kafka using per-broker ClusterIP services), or for external listeners, as route (OpenShift only), loadbalancer, nodeport or ingress (Kubernetes only).
type: internal
# Enables or disables TLS encryption for each listener. For route and ingress type listeners, TLS encryption must always be enabled by setting it to true.
tls: false
# Defines whether the fully-qualified DNS names including the cluster service suffix (usually .cluster.local) are assigned.
configuration:
useServiceDnsDomain: true
- name: tls
port: 9093
type: internal
tls: true
# - name: external
# port: 9094
# type: ingress
# tls: true
# authentication:
# type: scram-sha-512
# configuration:
# bootstrap:
# annotations:
# nginx.ingress.kubernetes.io/rewrite-target: "/"
# brokers:
# - broker: 0
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# - broker: 1
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# - broker: 2
# annotations:
# nginx.ingress.kubernetes.io/service-upstream: "true"
# ingress:
# # Adding annotation for NGINX controller
# annotations:
# kubernetes.io/ingress.class: "nginx" # Defines that NGINX will be the ingress controller
# hosts:
# - kafka.localhost
# # tls:
# # - secretName: kafka-tls # TLS certificate, if you have one
# # hosts:
# # - kafka.localhost
# rules:
# - host: kafka.localhost
# http:
# paths:
# - path: /
# pathType: Prefix
# backend:
# service:
# name: kafka-cluster-kafka-bootstrap # Kafka service to be accessed externally
# port:
# number: 9092 # Kafka port
# Listener authentication mechanism specified as mTLS, SCRAM-SHA-512, or token-based OAuth 2.0.
# authentication:
# type: tls
# - name: external1
# port: 9094
# type: route
# tls: true
# configuration:
# brokerCertChainAndKey: # (9)
# secretName: my-secret
# certificate: my-certificate.crt
# key: my-key.key
# Kafka version (recommended)
version: "3.9.0"
# KRaft metadata version (recommended)
metadataVersion: "3.9"
# Kafka configuration (recommended)
config: # (12)
auto.create.topics.enable: "false"
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
# Resources requests and limits (recommended)
# Note: resources are also defined in the KafkaNodePool, which takes precedence for those nodes
resources:
requests:
memory: 3Gi
cpu: "3"
limits:
memory: 3Gi
cpu: "3"
# Logging configuration (optional)
# Kafka loggers and log levels added directly (inline) or indirectly (external) through a ConfigMap. A custom Log4j configuration must be placed under the log4j.properties key in the ConfigMap. For the Kafka kafka.root.logger.level logger, you can set the log level to INFO, ERROR, WARN, TRACE, DEBUG, FATAL or OFF.
logging: # (14)
type: inline
loggers:
kafka.root.logger.level: INFO
log4j.logger.io.strimzi: "DEBUG"
log4j.logger.kafka: "DEBUG"
log4j.logger.org.apache.kafka: "DEBUG"
# Readiness probe (optional)
readinessProbe: #
initialDelaySeconds: 15
timeoutSeconds: 5
# Liveness probe (optional)
livenessProbe:
initialDelaySeconds: 15
timeoutSeconds: 5
# JVM options (optional)
# JVM configuration options to optimize performance for the Virtual Machine (VM) running Kafka.
# jvmOptions:
# -Xms: 8192m
# -Xmx: 8192m
# Authorization (optional)
# Authorization enables simple, OAUTH 2.0, or OPA authorization on the Kafka broker. Simple authorization uses the AclAuthorizer and StandardAuthorizer Kafka plugins.
# authorization:
# type: simple
# type: opa
# Metrics configuration (optional)
# Prometheus metrics enabled. In this example, metrics are configured for the Prometheus JMX Exporter (the default metrics exporter).
# metricsConfig:
# type: jmxPrometheusExporter
# valueFrom:
# # Rules for exporting metrics in Prometheus format to a Grafana dashboard through the Prometheus JMX Exporter, which are enabled by referencing a ConfigMap containing configuration for the Prometheus JMX exporter. You can enable metrics without further configuration using a reference to a ConfigMap containing an empty file under metricsConfig.valueFrom.configMapKeyRef.key.
# configMapKeyRef:
# name: kafka-metrics
# key: kafka-kraft-metrics-config.yml
# Entity Operator (recommended)
# Entity Operator configuration, which specifies the configuration for the Topic Operator and User Operator.
entityOperator:
topicOperator:
# You can keep Kafka resources in a dedicated namespace or reuse this one
watchedNamespace: kafka
reconciliationIntervalMs: 60000
# Resources requests and limits (recommended)
resources:
requests:
memory: 512Mi
cpu: "1"
limits:
memory: 512Mi
cpu: "1"
# Logging configuration (optional)
# Specified Topic Operator loggers and log levels. This example uses inline logging.
logging: # (23)
type: inline
loggers:
rootLogger.level: INFO
userOperator:
watchedNamespace: kafka
reconciliationIntervalMs: 60000
# Resources requests and limits (recommended)
resources:
requests:
memory: 512Mi
cpu: "1"
limits:
memory: 512Mi
cpu: "1"
logging: # (24)
type: inline
loggers:
rootLogger.level: INFO
# Kafka Exporter (optional)
kafkaExporter: {} # (25)
# ...
# Cruise Control (optional)
cruiseControl: {}
# ...
Create the manifest above and apply it to the cluster. The kafka namespace was already created earlier. The operator will work out what to do and apply everything.
We'll also explore cruiseControl, which was declared in the manifest but doesn't have its rebalancing rules defined yet. Several examples of how to use Cruise Control are available; we'll use the most generic one for now, which works well. Adapt it to the purpose and size of your cluster. It's not mandatory, but it's very welcome.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
name: rebalance
namespace: kafka
labels:
strimzi.io/cluster: kafka-cluster
annotations:
strimzi.io/rebalance-auto-approval: "true"
spec:
goals:
- CpuCapacityGoal
- NetworkInboundCapacityGoal
- DiskCapacityGoal
- RackAwareGoal
- MinTopicLeadersPerBrokerGoal
- NetworkOutboundCapacityGoal
- ReplicaCapacityGoal
Also apply the manifest above if you want, and let's see what we have.
❯ kubectl get kafkanodepools.kafka.strimzi.io -n kafka
NAME     DESIRED REPLICAS   ROLES                     NODEIDS
broker   3                  ["controller","broker"]   [0,1,2]
❯ kubectl get kafka -n kafka
NAME            DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
kafka-cluster                                                  True    KRaft
❯ kubectl get deploy -n kafka
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
kafka-cluster-cruise-control    1/1     1            1           5d1h
kafka-cluster-entity-operator   1/1     1            1           5d22h
kafka-cluster-kafka-exporter    1/1     1            1           5d
strimzi-cluster-operator        1/1     1            1           7d19h
❯ kubectl get pods -n kafka
NAME                                             READY   STATUS    RESTARTS   AGE
kafka-cluster-broker-0                           1/1     Running   0          4h10m
kafka-cluster-broker-1                           1/1     Running   0          4h10m
kafka-cluster-broker-2                           1/1     Running   0          4h10m
kafka-cluster-cruise-control-5f87855476-jvf9c    1/1     Running   0          4h6m
kafka-cluster-entity-operator-794c776cb7-dd44n   2/2     Running   0          4h6m
kafka-cluster-kafka-exporter-6dcfc78f98-tlhfj    1/1     Running   0          4h6m
strimzi-cluster-operator-6bf566db79-rtwrj        1/1     Running   0          4h8m
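With everything running, a quick way to exercise the Topic Operator is to create a KafkaTopic. A minimal sketch; the topic name and settings are just examples:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-first-topic        # example name
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-cluster   # ties the topic to our cluster
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: 604800000   # 7 days
```

After applying it, kubectl get kafkatopics -n kafka should list the topic once the Topic Operator reconciles it.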