Kubeflow Pipelines (KFP) Demo Project

This repository contains two sample Kubeflow Pipelines:

  • hello_pipeline.py: a minimal "Hello World" pipeline
  • iris_pipeline.py: a simple ML pipeline on the Iris dataset (load data, train model, report accuracy)

Precompiled pipeline specs are included:

  • hello_world_pipeline.yaml
  • iris_pipeline.yaml

What is KFP?

Kubeflow Pipelines (KFP) is a platform for building and running portable, reproducible ML workflows on Kubernetes.

With KFP, you can:

  • Define pipelines in Python using the KFP SDK
  • Compile them into YAML pipeline specifications
  • Run them on a Kubeflow Pipelines backend
  • Track run history, logs, and parameters in the UI

In this project, each Python script defines components and a pipeline, then compiles to a YAML file that Kubeflow can execute.

Project Structure

hello_pipeline.py          # Hello World pipeline definition
hello_world_pipeline.yaml  # Compiled Hello World pipeline spec
iris_pipeline.py           # Iris ML pipeline definition
iris_pipeline.yaml         # Compiled Iris pipeline spec
requirements.txt           # Python dependencies

Prerequisites

  • Python 3.10+ recommended
  • Access to a running Kubeflow Pipelines environment (Kubeflow on Kubernetes)
  • kubectl configured for your cluster (for cluster checks/port-forwarding)

Setup

1. Create and activate a virtual environment (PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1

2. Install dependencies

pip install -r requirements.txt

Compile the Pipelines

Run the Python scripts to generate/update YAML specs:

python hello_pipeline.py
python iris_pipeline.py

This will produce:

  • hello_world_pipeline.yaml
  • iris_pipeline.yaml

Execute Pipelines in Kubeflow

You can run pipelines either from the Kubeflow UI or with the KFP Python client.

Option A: Run from Kubeflow UI

  1. Open the Kubeflow Pipelines UI.
  2. Upload one of the YAML files (hello_world_pipeline.yaml or iris_pipeline.yaml).
  3. Create a run from the uploaded pipeline.
  4. For the Hello World pipeline, optionally set the recipient parameter before starting the run.

Option B: Run with KFP SDK client

Use kfp.Client against your KFP endpoint and submit the compiled YAML:

from kfp import Client

# Connect to the KFP API server. Replace the host with your endpoint
# (or a port-forwarded URL such as http://localhost:8080).
client = Client(host="http://<YOUR-KFP-ENDPOINT>")

# Submit the compiled spec; "arguments" maps to the pipeline's parameters.
run = client.create_run_from_pipeline_package(
    pipeline_file="hello_world_pipeline.yaml",
    arguments={"recipient": "Kubeflow"},
    run_name="hello-world-run",
)

print(run)

For the Iris pipeline, submit iris_pipeline.yaml; no runtime arguments are required in the current version.

Notes on the Included Pipelines

Hello World pipeline

  • Single component: say_hello(name: str) -> str
  • Pipeline name: hello-world-pipeline
  • Default parameter: recipient="World"

Iris pipeline

  • load_data: loads Iris features/labels
  • train_model: trains RandomForestClassifier and prints accuracy
  • Pipeline name: iris-no-artifacts-pipeline
  • Uses lightweight component containers (python:3.10-slim) and installs required packages per component
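Stripped of the KFP wrapping, the two Iris steps boil down to standard scikit-learn code. A rough sketch of the logic (illustrative; the split parameters and hyperparameters here are assumptions, not necessarily what iris_pipeline.py uses):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# load_data step: fetch Iris features and labels.
X, y = load_iris(return_X_y=True)

# train_model step: hold out a test split, fit a random forest,
# and report accuracy on the held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

In the pipeline, each step runs in its own python:3.10-slim container, so scikit-learn is installed per component rather than baked into a custom image.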

Troubleshooting

  • If the YAML specs are out of date, rerun python hello_pipeline.py / python iris_pipeline.py after code changes.
  • If pipeline submission fails, verify KFP endpoint URL and cluster connectivity.
  • If Kubeflow services are not healthy, check the pods in the kubeflow namespace:
kubectl get pod -n kubeflow
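If you have no external endpoint, you can reach the KFP UI locally by port-forwarding its service (the service name ml-pipeline-ui assumes a standard Kubeflow install; adjust for your deployment):

```shell
# Forward the KFP UI service to http://localhost:8080
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
```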

Next Improvements

  • Add explicit pipeline parameters for the Iris model (e.g., n_estimators, test_size)
  • Add model/artifact tracking instead of in-memory list passing
  • Add CI to validate pipeline compilation on each commit
