[Kafka] Change Kafka installation - multi namespace support#213

Merged
shay79il merged 2 commits into mlrun:development from shay79il:CEML-492-change-kafka-installation
Feb 3, 2026

@shay79il
Collaborator

Migrate from Bitnami Kafka to the Strimzi Kafka operator

Add Strimzi Kafka operator configuration
Update values for Kafka deployment

JIRA

shay79il force-pushed the CEML-492-change-kafka-installation branch 2 times, most recently from e2f8281 to db827ac, on November 25, 2025 15:14
@GiladShapira94
Collaborator

Looks good. Can you edit the admin and non-admin values files with the values that need to be used?
It does not need to support ingress, and users who use NodePort will need to change the ports manually.

Copilot AI left a comment

Pull request overview

This PR migrates the Kafka deployment from Bitnami Kafka to the Strimzi Kafka Operator, enabling multi-namespace support and modernizing the Kafka infrastructure with KRaft mode (ZooKeeper-less operation).

Key Changes:

  • Replaced Bitnami Kafka chart dependency with Strimzi Kafka Operator (version 0.48.0)
  • Introduced new Kubernetes custom resources for Kafka deployment including KafkaNodePool, Kafka cluster, RBAC resources, and network policies
  • Configured single-node Kafka cluster with KRaft mode for simplified deployment

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| charts/mlrun-ce/values.yaml | Replaced Bitnami Kafka configuration with Strimzi operator values, including storage, resources, listeners, and RBAC settings; removed unrelated minio image config |
| charts/mlrun-ce/templates/kafka/kafka-cluster.yaml | Added Kafka custom resource definition for Strimzi operator with listener and config management |
| charts/mlrun-ce/templates/kafka/kafka-nodepool.yaml | Added KafkaNodePool resource for KRaft-mode Kafka cluster management |
| charts/mlrun-ce/templates/kafka/kafka-rbac.yaml | Created RBAC resources (ServiceAccount, Role, RoleBinding) for cross-namespace Kafka access |
| charts/mlrun-ce/templates/kafka/kafka-network-policy.yaml | Added NetworkPolicy to control egress traffic to Kafka cluster across namespaces |
| charts/mlrun-ce/templates/kafka/kafka-bootstrap-alias.yaml | Created service alias for simplified Kafka bootstrap server naming |
| charts/mlrun-ce/requirements.yaml | Updated chart dependency from bitnami/kafka to strimzi-kafka-operator |
| charts/mlrun-ce/requirements.lock | Updated lock file with new Strimzi operator dependency and digest |
| charts/mlrun-ce/Chart.yaml | Bumped chart version from 0.10.0-rc5 to 0.10.0-rc6 |
| .gitignore | Added comprehensive .DS_Store file patterns for macOS |

shay79il force-pushed the CEML-492-change-kafka-installation branch 2 times, most recently from f7d0d5d to 2ad98a4, on January 5, 2026 09:58
shay79il

This comment was marked as resolved.

shay79il force-pushed the CEML-492-change-kafka-installation branch 4 times, most recently from 0034134 to 71e385e, on January 17, 2026 10:32
@shay79il
Collaborator Author

shay79il commented Jan 17, 2026

MLRun CE Install Guide

Prerequisites

1) Install ingress-nginx (optional)

Needed only if you want the ingress URLs to work on Kubernetes:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.11.3/deploy/static/provider/cloud/deploy.yaml
kubectl rollout status -n ingress-nginx deployment/ingress-nginx-controller --timeout=2m

2) Add /etc/hosts entries (per namespace)

Add these for EACH namespace you deploy (example for mlrun + docker-desktop):

sudo /bin/sh -c 'printf "\n# mlrun-hosts:mlrun:docker-desktop BEGIN\n127.0.0.1 jupyter.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 mlrun-ui.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 mlrun-api.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 nuclio.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 s3.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 minio-console.mlrun.docker-desktop.lab.iguazeng.com\n127.0.0.1 tdengine.mlrun.docker-desktop.lab.iguazeng.com\n# mlrun-hosts:mlrun:docker-desktop END\n" >> /etc/hosts'

To remove (sed -i '' is the macOS/BSD form; with GNU sed, drop the empty '' argument):

sudo /bin/sh -c "sed -i '' '/# mlrun-hosts:mlrun:docker-desktop BEGIN/,/# mlrun-hosts:mlrun:docker-desktop END/d' /etc/hosts"
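
Since the block has to be repeated for each namespace, it can help to generate it rather than edit the one-liner by hand. The following is a minimal sketch, not part of this PR: the function name mlrun_hosts_block is hypothetical, and it assumes the same seven host prefixes and the lab.iguazeng.com domain used in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print the /etc/hosts block for one
# namespace/cluster pair, with the same BEGIN/END markers as the command above.
mlrun_hosts_block() {
  ns="$1"
  cluster="$2"
  printf '# mlrun-hosts:%s:%s BEGIN\n' "$ns" "$cluster"
  # Host prefixes copied from the guide; adjust if your install exposes others.
  for svc in jupyter mlrun-ui mlrun-api nuclio s3 minio-console tdengine; do
    printf '127.0.0.1 %s.%s.%s.lab.iguazeng.com\n' "$svc" "$ns" "$cluster"
  done
  printf '# mlrun-hosts:%s:%s END\n' "$ns" "$cluster"
}
```

To apply it, the output can be reviewed first and then appended, for example with mlrun_hosts_block mlrun docker-desktop | sudo tee -a /etc/hosts.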

3) Single namespace (no controller)

Note: values.yaml is the chart default. You can omit -f values.yaml if you want.

helm upgrade --install mlrun \
  -n mlrun --create-namespace \
  --insecure-skip-tls-verify \
  ce/charts/mlrun-ce \
  -f ce/charts/mlrun-ce/values.yaml \
  --set mlrun.api.ingress.enabled=true \
  --set mlrun.ui.ingress.enabled=true \
  --set jupyterNotebook.ingress.enabled=true \
  --set minio.ingress.enabled=true \
  --set minio.consoleIngress.enabled=true \
  --set nuclio.dashboard.ingress.enabled=true \
  --set tdengine.ingress.enabled=true \
  --set-string=mlrun.api.ingress.hosts[0].host=mlrun-api.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=mlrun.ui.ingress.hosts[0].host=mlrun-ui.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=jupyterNotebook.ingress.hosts[0].host=jupyter.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=nuclio.dashboard.ingress.hosts[0]=nuclio.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=minio.ingress.hosts[0]=s3.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=minio.consoleIngress.hosts[0]=minio-console.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=tdengine.ingress.hosts[0]=tdengine.mlrun.docker-desktop.lab.iguazeng.com \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].pathType=Prefix \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].pathType=Prefix \
  --set mlrun.api.ingress.ingressClassName=nginx \
  --set mlrun.ui.ingress.ingressClassName=nginx \
  --set jupyterNotebook.ingress.ingressClassName=nginx \
  --set minio.ingress.ingressClassName=nginx \
  --set minio.consoleIngress.ingressClassName=nginx \
  --set nuclio.dashboard.ingress.ingressClassName=nginx \
  --set tdengine.ingress.ingressClassName=nginx

4) Multi namespace (controller + per‑namespace)

4.1) Controller (once)

helm upgrade --install mlrun-ce-controller \
  -n controller --create-namespace \
  --insecure-skip-tls-verify \
  ce/charts/mlrun-ce \
  -f ce/charts/mlrun-ce/admin_installation_values.yaml

4.2) Namespace install (repeat per namespace)

Replace <ns> and <cluster> with your namespace and cluster name.

Choose one values file:

  • non_admin_cluster_ip_installation_values.yaml (ClusterIP)
  • non_admin_installation_values.yaml (NodePort)

Note: values.yaml is the chart default and is always loaded; the file passed with -f is layered on top of it.

helm upgrade --install <ns> \
  -n <ns> --create-namespace \
  --insecure-skip-tls-verify \
  ce/charts/mlrun-ce \
  -f ce/charts/mlrun-ce/non_admin_cluster_ip_installation_values.yaml \
  --set mlrun.api.ingress.enabled=true \
  --set mlrun.ui.ingress.enabled=true \
  --set jupyterNotebook.ingress.enabled=true \
  --set minio.ingress.enabled=true \
  --set minio.consoleIngress.enabled=true \
  --set nuclio.dashboard.ingress.enabled=true \
  --set tdengine.ingress.enabled=true \
  --set-string=mlrun.api.ingress.hosts[0].host=mlrun-api.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=mlrun.ui.ingress.hosts[0].host=mlrun-ui.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=jupyterNotebook.ingress.hosts[0].host=jupyter.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=nuclio.dashboard.ingress.hosts[0]=nuclio.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=minio.ingress.hosts[0]=s3.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=minio.consoleIngress.hosts[0]=minio-console.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=tdengine.ingress.hosts[0]=tdengine.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].pathType=Prefix \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].pathType=Prefix \
  --set mlrun.api.ingress.ingressClassName=nginx \
  --set mlrun.ui.ingress.ingressClassName=nginx \
  --set jupyterNotebook.ingress.ingressClassName=nginx \
  --set minio.ingress.ingressClassName=nginx \
  --set minio.consoleIngress.ingressClassName=nginx \
  --set nuclio.dashboard.ingress.ingressClassName=nginx \
  --set tdengine.ingress.ingressClassName=nginx

NodePort variant (manual)

If you prefer NodePort instead of ClusterIP, swap the ClusterIP values file for the NodePort one:

helm upgrade --install <ns> \
  -n <ns> --create-namespace \
  --insecure-skip-tls-verify \
  ce/charts/mlrun-ce \
  -f ce/charts/mlrun-ce/non_admin_installation_values.yaml \
  --set mlrun.api.ingress.enabled=true \
  --set mlrun.ui.ingress.enabled=true \
  --set jupyterNotebook.ingress.enabled=true \
  --set minio.ingress.enabled=true \
  --set minio.consoleIngress.enabled=true \
  --set nuclio.dashboard.ingress.enabled=true \
  --set tdengine.ingress.enabled=true \
  --set-string=mlrun.api.ingress.hosts[0].host=mlrun-api.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=mlrun.ui.ingress.hosts[0].host=mlrun-ui.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=jupyterNotebook.ingress.hosts[0].host=jupyter.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=nuclio.dashboard.ingress.hosts[0]=nuclio.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=minio.ingress.hosts[0]=s3.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=minio.consoleIngress.hosts[0]=minio-console.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=tdengine.ingress.hosts[0]=tdengine.<ns>.<cluster>.lab.iguazeng.com \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.api.ingress.hosts[0].paths[0].pathType=Prefix \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].path=/ \
  --set-string=mlrun.ui.ingress.hosts[0].paths[0].pathType=Prefix \
  --set mlrun.api.ingress.ingressClassName=nginx \
  --set mlrun.ui.ingress.ingressClassName=nginx \
  --set jupyterNotebook.ingress.ingressClassName=nginx \
  --set minio.ingress.ingressClassName=nginx \
  --set minio.consoleIngress.ingressClassName=nginx \
  --set nuclio.dashboard.ingress.ingressClassName=nginx \
  --set tdengine.ingress.ingressClassName=nginx
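
The ClusterIP and NodePort commands above differ only in the values file, and both repeat <ns> and <cluster> in every host name. A possible way to avoid hand-editing them is a small wrapper. This is a sketch under assumptions: the function name mlrun_ns_install and its three positional arguments are hypothetical, while the helm flags are copied from the commands above. The bracketed --set-string arguments are quoted so that shells such as zsh do not try to glob-expand [0].

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the PR): one per-namespace install.
# Usage: mlrun_ns_install <ns> <cluster> <values-file-name>
mlrun_ns_install() {
  ns="$1"
  cluster="$2"
  values_file="$3"   # e.g. non_admin_cluster_ip_installation_values.yaml
  domain="${ns}.${cluster}.lab.iguazeng.com"
  chart="ce/charts/mlrun-ce"

  helm upgrade --install "$ns" \
    -n "$ns" --create-namespace \
    --insecure-skip-tls-verify \
    "$chart" \
    -f "${chart}/${values_file}" \
    --set mlrun.api.ingress.enabled=true \
    --set mlrun.ui.ingress.enabled=true \
    --set jupyterNotebook.ingress.enabled=true \
    --set minio.ingress.enabled=true \
    --set minio.consoleIngress.enabled=true \
    --set nuclio.dashboard.ingress.enabled=true \
    --set tdengine.ingress.enabled=true \
    "--set-string=mlrun.api.ingress.hosts[0].host=mlrun-api.${domain}" \
    "--set-string=mlrun.ui.ingress.hosts[0].host=mlrun-ui.${domain}" \
    "--set-string=jupyterNotebook.ingress.hosts[0].host=jupyter.${domain}" \
    "--set-string=nuclio.dashboard.ingress.hosts[0]=nuclio.${domain}" \
    "--set-string=minio.ingress.hosts[0]=s3.${domain}" \
    "--set-string=minio.consoleIngress.hosts[0]=minio-console.${domain}" \
    "--set-string=tdengine.ingress.hosts[0]=tdengine.${domain}" \
    "--set-string=mlrun.api.ingress.hosts[0].paths[0].path=/" \
    "--set-string=mlrun.api.ingress.hosts[0].paths[0].pathType=Prefix" \
    "--set-string=mlrun.ui.ingress.hosts[0].paths[0].path=/" \
    "--set-string=mlrun.ui.ingress.hosts[0].paths[0].pathType=Prefix" \
    --set mlrun.api.ingress.ingressClassName=nginx \
    --set mlrun.ui.ingress.ingressClassName=nginx \
    --set jupyterNotebook.ingress.ingressClassName=nginx \
    --set minio.ingress.ingressClassName=nginx \
    --set minio.consoleIngress.ingressClassName=nginx \
    --set nuclio.dashboard.ingress.ingressClassName=nginx \
    --set tdengine.ingress.ingressClassName=nginx
}
```

For example, mlrun_ns_install team-a my-cluster non_admin_installation_values.yaml would run the NodePort variant for a namespace called team-a (both names are placeholders).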

Verify

kubectl get pods -n <ns>
kubectl get ing -n <ns>
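
Because this PR moves Kafka to Strimzi custom resources, it may also be worth listing those alongside the pods and ingresses. The sketch below wraps the two checks above and adds a third. It assumes the Strimzi CRDs are installed and that the Kafka and KafkaNodePool resources live in the namespace you pass in, which depends on how the chart was deployed; the function name verify_mlrun_ns is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): run the checks above for one namespace.
verify_mlrun_ns() {
  ns="$1"
  kubectl get pods -n "$ns"
  kubectl get ing -n "$ns"
  # Assumption: Strimzi CRDs exist and the Kafka resources are in this namespace.
  kubectl get kafka,kafkanodepool -n "$ns"
}
```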

shay79il force-pushed the CEML-492-change-kafka-installation branch 5 times, most recently from 6e02bc0 to e3de3b0, on January 20, 2026 07:54
shay79il force-pushed the CEML-492-change-kafka-installation branch from e3de3b0 to c816ba7 on January 28, 2026 12:46
shay79il force-pushed the CEML-492-change-kafka-installation branch from c816ba7 to 6c59fc0 on January 29, 2026 11:52
Add Strimzi Kafka operator configuration and update values for Kafka deployment
[JIRA](https://iguazio.atlassian.net/browse/CEML-492)
- Add post-install/post-upgrade hooks to kafka-cluster.yaml
- Add post-install/post-upgrade hooks to kafka-nodepool.yaml
- Set hook-weight to 5 to ensure they run after operator installation

This fixes the 'resource mapping not found' error when deploying
Kafka CRs before the Strimzi CRDs are installed.
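
When debugging the 'resource mapping not found' error described above, one way to confirm that the Strimzi CRDs are registered before any Kafka custom resource is applied is to wait on them explicitly. This is a sketch, not something the PR adds: the function name wait_for_strimzi_crds is hypothetical, and the CRD names are the ones Strimzi normally registers for the Kafka and KafkaNodePool kinds, so check them with kubectl get crd if your operator version differs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): block until the Strimzi CRDs used by
# kafka-cluster.yaml and kafka-nodepool.yaml are established in the API server.
wait_for_strimzi_crds() {
  timeout="${1:-120s}"   # optional first argument, defaults to 120s
  kubectl wait --for=condition=established --timeout="$timeout" \
    crd/kafkas.kafka.strimzi.io \
    crd/kafkanodepools.kafka.strimzi.io
}
```

If this times out, the operator chart has likely not finished installing its CRDs, which is the situation the post-install/post-upgrade hooks are meant to avoid.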
shay79il force-pushed the CEML-492-change-kafka-installation branch from eede2af to 5b4ad90 on February 1, 2026 10:02
shay79il merged commit 898688f into mlrun:development on Feb 3, 2026
2 checks passed
shay79il deleted the CEML-492-change-kafka-installation branch on February 3, 2026 07:44